There exist huge unstructured or badly structured programs that have been modified over many years and that continue to need maintenance. In many cases a long-lived program is represented only by its running code, with no other reliable external documentation. In order to make such programs (a) easier to read and (b) easier to analyze for the impact of changes, it was proposed in our U.S. Pat. No. 4,833,641 to reverse-engineer numerical documentation from the source code without changing a single line of the code. In the first stage of reverse engineering, the program is dissected into formal logical parts. Each part starts with a line to which control over data processing is transferred from any other logical part in the program, and ends at the line preceding the entrance of the subsequent logical part, or at a RETURN command if the logical part is a routine. In the second stage of reverse engineering, each transfer of control is described by two addresses: (1) the address of an exit of one logical part, and (2) the address of the entrance of the linked logical part. The entrance address is defined by at least three attributes: (1) an entrance number, which is always 0 (zero) because each part has one and only one entrance, (2) a part number, and (3) the label (number) of the entrance line. The exit address can be defined by just two attributes: an exit number and a part number. The label (number) of the exit line is optional, provided that all exits are numbered consecutively from 1 upward, starting with the exit closest to the part entrance. It was suggested that routine logical parts be numbered from 500 upward and non-routine logical parts from 100 to 499. This description of a link by at least five attributes is complemented in the third stage of reverse engineering by a semantic description of the function of each part, given by the part's name. The process of naming transforms a formal logical part into a functional part, with or without changing the formal part.
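The first stage described above can be illustrated with a minimal Python sketch. It is not the procedure of the referenced patent. It assumes a BASIC-like program that has already been reduced to its sorted line labels and to the target labels of its GOTO-style and GOSUB-style statements. The function name `dissect` and its input format are hypothetical. For simplicity the sketch treats the first program line as a part entrance and does not look for the RETURN command that closes a routine.

```python
def dissect(line_labels, goto_targets, gosub_targets):
    """Split a program into formal logical parts (simplified illustration).

    line_labels   -- sorted labels (line numbers) of every program line
    goto_targets  -- labels reached by GOTO-style transfers (assumed input)
    gosub_targets -- labels reached by GOSUB-style calls (assumed input)
    Returns {part_number: (entrance_label, last_label)}.
    """
    routines = set(gosub_targets)
    # Every line that receives control opens a part; so does the first line.
    entrances = sorted(set(goto_targets) | routines | {line_labels[0]})
    parts = {}
    next_plain, next_routine = 100, 500  # numbering ranges taken from the text
    for i, entrance in enumerate(entrances):
        # A part ends at the line preceding the next entrance.
        if i + 1 < len(entrances):
            last = max(l for l in line_labels if l < entrances[i + 1])
        else:
            last = line_labels[-1]
        if entrance in routines:
            number, next_routine = next_routine, next_routine + 1
        else:
            number, next_plain = next_plain, next_plain + 1
        parts[number] = (entrance, last)
    return parts


# Hypothetical six-line program: line 30 is a GOTO target, line 50 a GOSUB target.
parts = dissect([10, 20, 30, 40, 50, 60], goto_targets=[30], gosub_targets=[50])
```

In this toy input the two non-routine parts receive numbers 100 and 101, and the part entered by the GOSUB-style call receives number 500.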
In some cases several formal parts that perform the same function are integrated into one functional part. In other cases one formal part may be dissected into several functional parts, each performing an individual function. Though dissecting a program into formal logical parts can be done by a computer with no human involvement, the declaration of functional names requires the involvement of a person who is familiar with the program. If a program is written in the COBOL language, or if a program is amply supplied with remark statements, then functional parts may be declared without human involvement. The combination of the numerical and the semantic description of two linked parts creates an informational word. In a collection of informational words, the words are numbered consecutively. This collection of informational words (also called a "genetic collection" of links) is used in the fourth stage of reverse engineering to create the immediate environment of each functional part. All links of the genetic collection are searched for the words that contain a particular part, and these words are then printed out in consecutive order of entrance/exit numbers. The source code can now be replaced by numbered portions of the code identified as functional logical parts, each part being located on a separate page and complemented by a description of its immediate environment expressed in informational words. A collection of these parts arranged in consecutive order of part numbers belongs to documentation "in-the-small", so called because it is tied directly to the source code. A list of the entrances of all functional parts arranged in consecutive order of entrance labels also belongs to documentation "in-the-small", as does the genetic collection of links.
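As an illustration only, an informational word and the search that derives an immediate environment might be sketched in Python as follows. The class `Link`, its field names, the function `immediate_environment`, and the sample part names are all hypothetical and are not taken from the referenced patent. The only properties assumed from the text are that a word carries an exit address, an entrance address, and the part names, and that the entrance number is always 0.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Link:
    """One informational word: an exit address linked to an entrance address."""
    word_no: int       # consecutive number within the genetic collection
    exit_part: int     # part number of the exiting part
    exit_no: int       # exit number, counted from 1
    exit_name: str     # functional name of the exiting part
    entry_part: int    # part number of the entered part
    entry_label: int   # label (line number) of the entrance line
    entry_name: str    # functional name of the entered part
    entry_no: int = 0  # always 0: each part has exactly one entrance


def immediate_environment(collection, part):
    """Return the words that mention `part`, ordered by entrance/exit number."""
    def order(word):
        # Incoming links use entrance number 0, so they sort before exits 1, 2, ...
        number = word.exit_no if word.exit_part == part else word.entry_no
        return (number, word.word_no)

    hits = [w for w in collection if part in (w.exit_part, w.entry_part)]
    return sorted(hits, key=order)


# Hypothetical genetic collection of three links (names invented for the example).
genetic = [
    Link(1, 100, 1, "READ INPUT", 101, 200, "VALIDATE"),
    Link(2, 100, 2, "READ INPUT", 500, 900, "PRINT ERROR"),
    Link(3, 101, 1, "VALIDATE", 100, 10, "READ INPUT"),
]
env = immediate_environment(genetic, 100)
```

For part 100 the derived environment lists the incoming link (word 3) first, followed by exits 1 and 2 (words 1 and 2). Only the genetic collection is stored; every environment is recomputed from it on demand.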
Documentation "in-the-small" is powerful enough for conducting corrective program maintenance, which we define as maintenance that causes either no alterations in the program logic or only insignificant ones. A change in any logical part can easily be documented by placing all comments about that change on the page where the part is located. The impact of the change can readily be evaluated by analyzing the immediate environment of the changed part.
The process of creating documentation "in-the-small" cannot be carried out by a human alone, without computer involvement. A human has a limited capacity to remember, which is not commensurate with the huge amount of information received during a lifetime. In order to protect memory from saturation, Nature supplied humans with the ability to get tired and to forget. However, fatigue and forgetfulness cause human mistakes, which would flood numerical documentation "in-the-small" if the documentation were created by a human without the help of a computer. Suppose we deal with a program that has just 200 transfers of control, each described by an informational word of 30 characters, for 6000 characters in total. If we conservatively assume a human error rate of 1%, we arrive at 60 mistakes while the collection of all links is created, and then another 60 multiplied by n mistakes while the immediate environment of each logical part is created (where n is the average number of times each informational word is repeated across all immediate environments). At the stage of creating documentation, a human search for these mistakes is frustratingly ineffective; it is like looking for a needle in a haystack. A computer stores only the genetic collection of informational words. All other collections, such as the immediate environment of each part, are derived by a computerized search of this genetic collection. The derived collections make human mistakes conspicuous: for instance, an error in the part number of a particular exit address (say, part 101, exit 1 instead of the correct part 100, exit 1) will show up in the immediate environments of both parts: exit 1 of part 100 will be missing, whereas part 101 will have two exits numbered 1. After all mistakes conspicuously exhibited by the computer in the documentation "in-the-small" are corrected, the genetic collection will become error-free.
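The way a derived collection exposes such a mistake can be sketched in Python. This is a minimal illustration under stated assumptions, not the checking procedure of the invention: the function name `check_exits` and the reduction of each informational word to an (exit part, exit number) pair are hypothetical. The sketch relies only on the rule that the exits of a part are numbered consecutively from 1.

```python
from collections import Counter, defaultdict


def check_exits(links):
    """links: iterable of (exit_part, exit_no) pairs, one per informational word.

    Reports, per part, exit numbers that occur more than once and exit
    numbers missing from the expected run 1..highest recorded exit.
    """
    by_part = defaultdict(Counter)
    for part, exit_no in links:
        by_part[part][exit_no] += 1
    problems = {}
    for part, counts in sorted(by_part.items()):
        duplicated = sorted(n for n, c in counts.items() if c > 1)
        missing = [n for n in range(1, max(counts) + 1) if n not in counts]
        if duplicated or missing:
            problems[part] = {"duplicated": duplicated, "missing": missing}
    return problems


# Hypothetical data: exit 1 of part 100 was mistyped as exit 1 of part 101.
recorded = [(101, 1), (100, 2), (101, 1), (101, 2)]
problems = check_exits(recorded)
```

With this input the check reports exit 1 as missing from part 100 and as duplicated in part 101, mirroring the example in the text. A mistyped exit that happens to be the highest-numbered exit of its part leaves no gap, so this particular check would catch it only through the duplicate it creates elsewhere.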
Documentation "in-the-small" is not effective for conducting perfective maintenance, which includes program enhancement or program reuse in order to meet new user requirements, and which ordinarily causes substantial alterations in the program logic. This type of maintenance requires a global overview of the program that considers successions of logical parts, rather than each individual part and its immediate environment. These successions of functional parts form program circuits, which constitute documentation "in-the-large".
We address in this invention the problem of developing documentation "in-the-large" from documentation "in-the-small".