The Common Business Oriented Language (COBOL) has been widely used in business computing since the 1960s. The advantages of COBOL include its maintainability and its portability across hardware platforms and operating systems. However, no adequate data processing system is available that can flexibly process the COBOL data files generated by COBOL applications sold by the many different COBOL application vendors. Two major difficulties have hindered the development of such a data processing system. First, the format of a COBOL data file is in part defined by the COBOL application that generated it. Conventional data processing systems typically need to be modified to handle the data files generated by each new COBOL application, and that modification depends on knowledge of the COBOL data file format, which may require an understanding of the source code of the COBOL application. Second, even with an understanding of the COBOL application source code, additional understanding of the physical environment that generated the COBOL data file may be needed to read the file. For example, this understanding includes characteristics of the generating system such as its endianness, character encoding, and localization settings. These requirements further complicate the modification of conventional data processing systems to read COBOL data files.
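The dependence on the generating environment can be illustrated with a minimal Python sketch. The six-byte record, its layout (a four-character name followed by a two-byte binary count), and the function name `decode_record` are assumptions made for illustration only; they do not come from any particular COBOL application. The same bytes yield different values depending on the character encoding and byte order the reader assumes.

```python
from __future__ import annotations

import struct

# Hypothetical 6-byte record: a 4-character name followed by a 2-byte binary count.
raw = bytes([0xC1, 0xC2, 0xC3, 0xC4, 0x01, 0x02])


def decode_record(data: bytes, encoding: str, byte_order: str) -> tuple[str, int]:
    """Decode the record under an assumed character encoding and byte order.

    byte_order uses struct notation: ">" for big-endian, "<" for little-endian.
    """
    name = data[:4].decode(encoding)
    (count,) = struct.unpack(byte_order + "H", data[4:6])
    return name, count


# Reader assumes EBCDIC (code page 037) text and big-endian integers.
ebcdic_big = decode_record(raw, "cp037", ">")

# Reader assumes Latin-1 text and little-endian integers.
latin_little = decode_record(raw, "latin-1", "<")

print(ebcdic_big)
print(latin_little)
```

Only one of the two interpretations matches what the generating system wrote, and nothing in the bytes themselves indicates which one that is.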
There are other challenges in the processing of COBOL data files. COBOL data files sometimes contain interspersed data of different types, such as employee data and customer data, where each type of data is defined by a distinct COBOL data record schema within the COBOL application source code. Conventional data processing systems typically cannot process such data files, since those systems assume that all data within a data file is of the same type. COBOL data files also sometimes contain nested data. Conventional data processing systems often cannot process nested data, nor do they place data read from such data files in a flat data structure that enables handling by modern database management systems.
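The problem of interspersed record types can be sketched in Python as follows. This is a simplified illustration, not a description of any actual file format: it assumes fixed-length text records whose first character is a type tag, and the names `SCHEMAS`, `RECORD_LENGTH`, `split_records`, `parse_record`, and `extract` are hypothetical. A reader that applied a single layout to every record would misread either the employee records or the customer records.

```python
# Hypothetical layouts: each maps a field name to (start, end) offsets in a record.
# The first character of every record is assumed to be a type tag.
SCHEMAS = {
    "E": {"employee_id": (1, 5), "name": (5, 15)},
    "C": {"customer_id": (1, 7), "city": (7, 15)},
}
RECORD_LENGTH = 15


def split_records(data: str, record_length: int = RECORD_LENGTH):
    """Yield consecutive fixed-length records from the file contents."""
    for offset in range(0, len(data), record_length):
        yield data[offset:offset + record_length]


def parse_record(record: str) -> dict:
    """Pick the layout from the type tag, then slice out each field."""
    tag = record[0]
    layout = SCHEMAS[tag]
    fields = {name: record[start:end].strip() for name, (start, end) in layout.items()}
    return {"type": tag, **fields}


def extract(data: str, wanted_tag: str) -> list:
    """Return only the parsed records of one type."""
    parsed = (parse_record(record) for record in split_records(data))
    return [row for row in parsed if row["type"] == wanted_tag]


sample = (
    "E0001Alice     "
    "C000042Boston  "
    "E0002Bob       "
)

employees = extract(sample, "E")
customers = extract(sample, "C")
print(employees)
print(customers)
```

In practice the rule that distinguishes record types is defined by the generating application and need not be a leading tag character; the sketch only shows why a per-type schema is required.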
To address these shortcomings, it would be desirable to provide a COBOL data processing system that handles COBOL data files created by multiple vendors. It would also be desirable for the COBOL data processing system to handle, based on user input, COBOL data files generated by unfamiliar COBOL applications from unfamiliar vendors, without requiring modification of the COBOL data processing system itself. Such a system may enable users unfamiliar with COBOL to process COBOL data files created by multiple vendors. It would also be desirable for the system to be capable of extracting subsets of data from COBOL data files with interspersed data of different types, based on the definitions of these data types in multiple COBOL data record schemas. Finally, it would be desirable for the system to read nested data in COBOL data files and to store this nested data in a flattened form that enables handling by database management systems, such as in accordance with a nested COBOL data record schema corresponding to this nested data.
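What storing nested data in a flattened form might look like can be sketched in Python. The nested record below, the dotted column-naming convention, and the function name `flatten` are assumptions chosen for illustration; the text above does not prescribe a particular flattening scheme. Nested groups become prefixed column names, and repeated items are numbered from 1.

```python
def flatten(record: dict, prefix: str = "") -> dict:
    """Flatten nested groups and repeated items into a single-level mapping."""
    flat = {}
    for key, value in record.items():
        column = f"{prefix}{key}"
        if isinstance(value, dict):
            # Nested group: recurse with the group name as a prefix.
            flat.update(flatten(value, column + "."))
        elif isinstance(value, list):
            # Repeated item: number each occurrence starting from 1.
            for index, item in enumerate(value, start=1):
                if isinstance(item, dict):
                    flat.update(flatten(item, f"{column}.{index}."))
                else:
                    flat[f"{column}.{index}"] = item
        else:
            flat[column] = value
    return flat


# Hypothetical nested record, as it might look after being read from a data file.
nested = {
    "customer_id": "000042",
    "address": {"city": "Boston", "zip": "02101"},
    "phones": ["555-0100", "555-0101"],
}

flat = flatten(nested)
print(flat)
```

Each key of the resulting mapping can serve as a column name in a relational table, which is the kind of flat structure a database management system can handle directly.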