A computer system can be generally divided into four components: the hardware, the operating system, the application programs and the users. The hardware (e.g., central processing unit (CPU), memory and input/output (I/O) devices) provides the basic computing resources. The application programs (e.g., database systems, games and business programs) define the ways in which these resources are used to solve computing problems. The operating system controls and coordinates the use of the hardware resources among the various application programs for the various users. In doing so, one goal of the operating system is to make the computer system convenient to use. A secondary goal is to use the hardware in an efficient manner.
The Unix operating system is one example of an operating system that is currently used by many enterprise computer systems. Unix was designed to be a time-sharing system with a hierarchical file system and support for multiple processes. A process is the execution of a program and consists of a pattern of bytes that the CPU interprets as machine instructions (text), data and stack. A stack defines a set of hardware registers or a reserved amount of main memory that is used for arithmetic calculations.
The Unix operating system consists of two separable parts: the “kernel” and the “system programs.” System programs consist of system libraries, compilers, interpreters, shells and other such programs that provide useful functions to the user. The kernel is the central controlling program that provides basic system facilities. The Unix kernel creates and manages processes, provides functions to access file systems, and supplies communications facilities.
The Unix kernel is the only part of Unix that a user cannot replace. The kernel also provides the file system, CPU scheduling, memory management and other operating-system functions by responding to “system calls.” Conceptually, the kernel is situated between the hardware and the users. System calls are used by the programmer to communicate with the kernel to extract computer resource information. The robustness of the Unix kernel allows system hardware and software to be dynamically configured to the operating system while application programs are actively functional, without having to shut down the underlying computer system.
The kernel and system programs consist of a machine-code representation of programs developed in a language such as C, C++, or assembly language. These programs make use of data structures, that is, structured groupings of program data whose order and composition are essential to understanding them. This ordering and composition information is used during the compilation process, in which programs are translated from high-level languages to machine code. During the compilation of optimized code, such as that used for the Unix kernel and system programs, the ability to easily extract this information from the resulting program objects or module objects is lost.
In order to analyze the data structures used by the kernel and system programs, separate and special-purpose extraction and analysis programs must be written. These programs contain the knowledge necessary to reconstruct the data structures used by the kernel. As these programs have specific knowledge of the data structures involved, they must be updated whenever the data structures change, and must be delivered separately from the kernel modules and system libraries.
FIG. 1 is a block diagram illustration of an exemplary prior art computer system 100. The computer system 100 is connected to an external storage device 180 and to an external drive device 120 through which computer programs can be loaded into computer system 100. The external storage device 180 and external drive 120 are connected to the computer system 100 through respective bus lines. The computer system 100 further includes main memory 130 and processor 110. The drive 120 can be a computer program product reader such as a floppy disk drive, an optical scanner, a CD-ROM device, etc.
FIG. 1 additionally shows memory 130 including a kernel level memory 140. Memory 130 can be virtual memory which is mapped onto physical memory including RAM or a hard drive, for example. A programmer designs data structures used by the kernel 140. When the kernel 140 is compiled, the data structures are irreversibly encoded in the programs stored in the kernel level memory 140. User applications 160A and 160B are coupled to the computer system 100 to utilize the kernel memory 140 and other system resources in the computer system 100. Irreversibly-encoded data structures are also present in user applications 160A and 160B. As the composition of these data structures cannot be automatically extracted from the kernel level memory 140 and/or user applications 160A and 160B, specialized programs must be written to perform this task manually.
These manual extraction operations, which are essential for debugging and program analysis, are closely tied to the composition and ordering of the data structures, and thus must be manually maintained separately from the kernel and user applications they reference. These extraction operations must be updated whenever the data structures they extract change.
Prior art attempts to encode data structure information in kernel modules and user applications use either stabs or DWARF encodings. These encodings are wasteful, as they include massive duplication between program source files. Including these encodings in the kernel level memory 140 would degrade the performance of the underlying computer system.