Hierarchical databases, such as IBM's IMS (Information Management System), are well known in the art. (IMS is a trademark of International Business Machines Corporation in the United States, other countries, or both.) IMS is a hierarchical database management system (HDBMS) with wide spread usage in many large enterprises where high transaction volume, reliability, availability and scalability are of the utmost importance. IMS provides software and interfaces for running the businesses of many of the world's largest corporations. However, companies incorporating LMS databases into their business models typically make significant investments in IMS application programs in order to have IMS perform meaningful data processing particularly tailored to the needs of their respective enterprises. IMS application programs are typically coded in COBOL, PL/I, C, PASCAL, Assembly Language, or Java and are created by programmers with critical skill sets in a programming environment that may be time consuming, inefficient and error prone. (Java is a trademark of Sun Microsystems, Inc. in the United States and/or other countries.)
Physical IMS databases are hierarchic. Each database has a schema defined as a hierarchy or tree of segment types, each of which is defined, in turn, as a collection of fields. This definition of a physical database schema is contained in an IMS control block called a “Database Description” (DBD). A physical IMS database is a simple hierarchy, but multiple physical databases (i.e., hierarchies), may be linked by one or more associations called “logical relationships” which allow new “logical hierarchies” to he defined. A logical hierarchy typically traverses multiple physical hierarchies by crossing one or more logical relationships, and incorporates segments from several databases. Logical hierarchies are defined in “Logical Database Descriptions” (Logical DBDs), and may he processed, for the most part, as if they were simple physical databases. They are somewhat analogous to relational database “views” that are defined on joins of a number of database tables. In addition, “secondary indexes” may be defined for a database, which provide alternate search paths to any segment type in the database hierarchy (logical or physical), and affect the application's view of its data.
Each IMS application program is defined to process one or more physical DBDs or logical DBDs. This definition is contained in another IMS control block called a Program Specification Block (PSB). For each DBD that the program processes, the PSB specifies the subset of the DBD hierarchy that the application is authorized to process, and optionally its authorized level of processing (e.g., Get, Replace, Insert, Delete) for each segment in the subset. This information for each DBD is contained is a structure called a “Program Control Block” (PCB) within the PSB. If the application processes more than one database hierarchy (logical or physical) there will be multiple PCBs in its PSB.
To write an application program, the application developer must understand the application's view of its databases. Access to a database from an IMS application program is performed by calling the IMS call interface and specifying which PCB (i.e., which hierarchy) the call is intended to operate on. The IMS interface defines a number of operations to search and navigate through a hierarchy, and to update, insert and delete segments. The call also specifies the target segment or segments and search arguments that specify positioning in the hierarchy. Search arguments typically contain field name/value pairs or the target segments.
To code the database calls in the application program, the developer needs to know:                The names of the database segments in the hierarchy        The hierarchic relationship of the segments to each other        The fields in each segment, their positions and lengths        Which fields are search, or indexed, fields        The data types of the fields        If the application processes multiple hierarchies (i.e., multiple data PCBs in the application's PSB), then this information is repeated for each PCB.        
Traditionally, when coding an application, the developer gets this information by referring to the source copies of the PSB and DBDs, which are in the form of Assembler Language macros. The PSB source contains macros for each PCB, which names the database (logical or physical). Each PCB contains macros for the segments in that PCB which specify the hierarchic arrangement of the segments, but additional details of each segment and its fields must be obtained from the DBD source. The application developer locates the corresponding source file for that DBD from the DBD name in the PCB The segment macros in the PCB also name then corresponding segment in the DBD's hierarchy, so the developer can locate segment definitions in the DBD. Segment definitions in physical DBD hierarchies contain macros describing at least some of the fields in the segment, with their lengths and offsets. A parameter on the field macro gives an indication of the data type of the field.
If the PCB refers to a logical hierarchy there is another level of indirection. The segment macros in the logical DBDs d(o not contain information on length, offset, and type, but rather refer by name to segment macros in one or more physical DBDs. Thus the developer must follow the name links to the physical DBDs to obtain the needed information.
Another complication arises in that DBDs generally do not contain information about all the fields in a segment. Typically, field macros are only included in physical DBDs for fields that can be used as “search fields” when accessing the database. These are fields that may be referenced in “segment search arguments” of database calls. If the application needs to process other fields in the segments (as it generally will need to do) the developer must get the information from some other source. Often, layouts of fields within segments can be captured from language structures of existing applications, such as COBOL copybooks, and can be included into new application programs.
The net result of all this is that in order to get a complete picture of the data, application developers must refer to and merge information from several sources: the PSB, possibly one or more logical DBDs, physical DBDs, and existing language source for the segments being processed. This process is skill intensive, complex and error prone, especially for large databases with logical relationships.
IMS applications written in the Java language involve even more complexities. IMS Java applications access IMS databases using a limited subset of the SQL92 query language and JDBC (Java Database Connectivity), the standard Java APIs for accessing relational databases. This contrasts with applications written in other languages, which must use the IMS defined call interface. When coding an IMS Java application, the application developer needs all the same information listed above for developers in other languages. In addition, however IMS Java allows an application to refer to PSBs, PCBs, Segments, and Fields, using Java-style identifiers rather than the 8-character names used by the PSB, PCB, Segment and Field, macros. The developer must know these Java alias names for each entity. IMS Java presents data to the application using the broad range of standard JDBC data types, and to process a field the developer must also know its JDBC data type.
Neither the Java-style aliases nor the JDBC data type are present in the PSB or DBD. For its internal operation IMS Java requires a “metadata class” to be created by the Java programmer which summarizes all of the information about database hierarchies, segments and fields normally found in the PSB, DBDs, as well as the Java alias names and data types, and details of additional fields (not defined in the DBD).
A developer of an IMS Java application could in theory use this metadata class (or its java source file) as a comprehensive reference source for understanding the data view of the application. However, this metadata class is optimized in its organization for consumption by the IMS Java system code and, accordingly, is greatly lacking in its suitability for use by a human developer.
Accordingly, there is a great need for an automated and integrative approach to collecting pertinent information from disparate sources and presenting the information to an application programmer in a form suitable for humans and conducive to efficient development of application source code for accessing hierarchical databases. Furthermore, this information should be comprehensive to the extent that it obviates the need to consult any other database source materials for information required to build the hierarchical database access code.