Today, computer applications are written by many developers working on their own workstations, generally at different locations, using an integrated development environment (IDE) connected through a network such as a local area network, a wide area network, or even the Internet. Usually, the IDE keeps track of the versions of the code being developed and of the editing that has taken place or is occurring to that code. Sometimes, however, when a particular line or lines of code have been changed to correct an error or just to make the code more efficient, the change may not be updated in the IDE until the user checks in the changes later. Moreover, if another developer is using similar code or code that was derived from the first code, she or he may not even be aware that a bug was fixed or that efficiency was improved by the change. The IDE may not be aware of the existence of the related or similar code, so it cannot inform the second programmer of any changes.
Another common phenomenon of software development is the use of software components, which are sections of code that are executed fairly often for the same purpose. For instance, there may be a calendar component that could be used across multiple applications, such as scheduling and payroll; there may be a component to print a document in a certain format; there may be a component to create a window; or another to create a hash table. The use of components has increased significantly and is especially pronounced in object-oriented programming (OOP) and other highly modular languages, where a single general purpose portion of a computer program may be executed in a number of different situations for different purposes.
With an object-oriented programming language, for example, a program is constructed from a number of “objects,” each of which includes data and/or one or more sets of instructions that define specific operations to be performed on the data. A few or a large number of components may be used to create an object, and a large number of objects may be used to build a computer program, with each object interacting with other objects in the computer program to perform desired operations. When one object invokes a particular routine in another object, the former object is often said to be calling the routine in the latter object. Some general purpose objects in a computer program may support basic operations, e.g., displaying information to a user, printing information on a printer, storing or retrieving information from a database, etc. In particular, these generic types of objects are called by many different objects, so a change in the code of one of them may benefit other similar or related objects or components, whether those from which the edited object or component was derived or those which were derived from the edited object or component.
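The calling relationship described above can be illustrated with a minimal sketch. The class names Printer, PayrollReport, and Schedule and their methods are hypothetical, chosen only to echo the printing, payroll, and scheduling examples mentioned earlier; they are not taken from any particular application.

```python
class Printer:
    """Hypothetical general-purpose object supporting one basic operation."""

    def __init__(self):
        self.printed = []  # stands in for a real printer: records what was sent

    def print_document(self, text):
        self.printed.append(text)
        return len(text)


class PayrollReport:
    """Hypothetical application object: holds data and calls a routine in Printer."""

    def __init__(self, printer, employee, amount):
        self.printer = printer
        self.employee = employee
        self.amount = amount

    def emit(self):
        # PayrollReport is "calling" the print_document routine of Printer.
        return self.printer.print_document(f"{self.employee}: {self.amount:.2f}")


class Schedule:
    """A second, unrelated caller of the same general-purpose object."""

    def __init__(self, printer, event):
        self.printer = printer
        self.event = event

    def emit(self):
        return self.printer.print_document(f"Event: {self.event}")


shared_printer = Printer()
PayrollReport(shared_printer, "A. Smith", 1200).emit()
Schedule(shared_printer, "review").emit()
```

Because both callers route through the same print_document routine, an edit to that one routine reaches every object that calls it, which is the benefit of a change to a generic object noted above.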
Although components are intended to be reused, programmers often fail to take advantage of these packaging techniques and, instead, copy code from one component to create a similar one. Finding all of the components or objects that are derived in this way can be extremely difficult, if not impossible. Examining hundreds or thousands of lines of program instructions is tedious and time-consuming, and because variable names or strings are sometimes changed in the copy, related or derivative sections of code are not easily found. Manual review of the code is thus fraught with the possibility that not all related or derived components or objects will be located.
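The difficulty can be seen in a small sketch. The two code fragments and the helper shared_lines below are hypothetical illustrations, not drawn from any real code base. The second fragment is a copy of the first with only its names changed, yet a straightforward line-by-line comparison finds nothing in common.

```python
def shared_lines(a, b):
    """Return the stripped, non-empty lines that appear verbatim in both sources."""
    lines_a = {line.strip() for line in a.splitlines() if line.strip()}
    lines_b = {line.strip() for line in b.splitlines() if line.strip()}
    return lines_a & lines_b


# Hypothetical original component.
ORIGINAL = """
def total_hours(entries):
    total = 0
    for entry in entries:
        total += entry
    return total
"""

# Hypothetical copy: same structure, every identifier renamed.
COPY = """
def sum_days(days):
    acc = 0
    for day in days:
        acc += day
    return acc
"""

common = shared_lines(ORIGINAL, COPY)  # empty: no line survives the renaming
```

A reviewer searching for exact text would therefore miss the copy entirely, even though the two fragments perform the same operation.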
A computer application, including its objects, typically has hundreds or thousands of these components, which in turn may be broken down into smaller pieces of source code called program structures or constructs, as is known in the field. A conditional construct specifies several different execution sequences, for example, a CASE statement, an IF statement, or a conditional expression in ALGOL. An executable construct specifies one or more actions to be taken by a computer program at execution time and comprises executable statements. A loop construct specifies an iteration in the execution sequence, for example, DO loops in FORTRAN, FOR loops in ALGOL, PERFORM loops in COBOL, and DO WHILE loops in PL/I. Other constructs exist, and still others continually arise as new languages emerge, especially as Internet-based applications become plentiful.
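As a brief illustration of these constructs, the following sketch uses Python rather than the languages named above; its if/elif/else and for statements are offered only as analogues of the IF statement and the DO or FOR loops. The function names classify and total are hypothetical.

```python
def classify(hours):
    # Conditional construct: selects one of several execution sequences.
    if hours > 40:
        label = "overtime"
    elif hours == 40:
        label = "full"
    else:
        label = "partial"
    return label


def total(hours_per_day):
    # Executable statement: an action taken at execution time (assignment).
    total_hours = 0
    # Loop construct: specifies an iteration in the execution sequence.
    for hours in hours_per_day:
        total_hours += hours
    return total_hours


week = [8, 8, 8, 8, 8]
summary = classify(total(week))
```

Each construct here is only a few lines long, which suggests how an application of any size comes to contain a very large number of them.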
Parsing is a very important part of computer programming languages because constructs are composed of statements, which, in turn, are parsed into tokens by a compiler. In any language, including computer languages, parsing means dividing a phrase of the language into small components that can be analyzed. For example, parsing this sentence would involve dividing it into words and phrases and identifying the nouns, verbs, adjectives, direct objects, and indirect objects. Computer compilers parse source code written by a developer and translate the source code into object code readable by the machine. Similarly, any application that processes complex commands, and virtually all end-user applications, must be able to parse the commands. Parsing is often divided into lexical analysis and semantic parsing. Lexical analysis concentrates on dividing strings into components, called tokens, based on punctuation and other keys, much as one identifies the nouns, verbs, commas, etc. in a spoken language. A token may be thought of as the smallest independent unit of meaning within a program, as defined by either the parser or the lexical analyzer. Semantic parsing attempts to determine the meaning of the string, for instance, by identifying the subject of the sentence, the tense of the verbs, the direct and indirect objects, etc.
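Lexical analysis of the kind described above can be sketched in a few lines using Python's standard re module. The token categories (NUMBER, IDENT, OP, PUNCT) and the function name tokenize are assumptions made for this illustration; a real compiler defines its own token set, and this sketch does not attempt semantic parsing.

```python
import re

# Hypothetical token categories; order matters because earlier patterns win.
TOKEN_SPEC = [
    ("NUMBER", r"\d+(?:\.\d+)?"),  # integer or decimal literal
    ("IDENT", r"[A-Za-z_]\w*"),    # identifiers and keywords
    ("OP", r"[+\-*/=<>]"),         # single-character operators
    ("PUNCT", r"[();,]"),          # punctuation that separates statements
    ("SKIP", r"\s+"),              # whitespace, discarded
    ("MISMATCH", r"."),            # any other character is an error
]
TOKEN_RE = re.compile(
    "|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC)
)


def tokenize(source):
    """Divide a source string into (kind, text) tokens."""
    tokens = []
    for match in TOKEN_RE.finditer(source):
        kind = match.lastgroup
        if kind == "SKIP":
            continue
        if kind == "MISMATCH":
            raise ValueError(f"unexpected character {match.group()!r}")
        tokens.append((kind, match.group()))
    return tokens


tokens = tokenize("total = total + 8;")
# [('IDENT', 'total'), ('OP', '='), ('IDENT', 'total'),
#  ('OP', '+'), ('NUMBER', '8'), ('PUNCT', ';')]
```

The statement is reduced to six tokens, each the smallest unit the analyzer treats as meaningful. Determining what the statement means, here an assignment of a sum, is left to the later parsing stage.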
There is a need in the industry to help programmers locate related or derived components and objects when editing a particular component or code. There is a further need in the industry to determine if the changes made to one component or object should be applied to the other related or derived/derivative components or objects.