Programmers very often add new functions or methods to their source code by copying existing code that almost suits their needs, then modifying parts of it to meet the specific objectives of the new code. In particular, patterns that suit an element of a collection (such as an attribute or a method in a class, or a class in a package or in another class) can be propagated by copying the code for an existing element, then replacing the name of the existing element with the name of the to-be-created element.
For example, suppose that the following class, which handles data elements, has characteristics deemed desirable for a to-be-created concept, the segment. A programmer would typically copy the contents of the DataElement.java file to a Segment.java file, then edit Segment.java to replace all references to data elements with references to segments.
public class DataElement {
    public DataElement() {
        . . . .
    }
    private String DEFAULT_DATA_ELEMENT_NAME = "dummyName";
    private String dataElementName = DEFAULT_DATA_ELEMENT_NAME;
    /** Return the data element name. */
    public String getDataElementName() {
        return dataElementName;
    }
}
For a very small example like this one, the programmer would have a few edits to make in place, replacing each item in the left column with its counterpart in the right column:
data element    segment
DATA_ELEMENT    SEGMENT
dataElement     segment
DataElement     Segment
As the source grows longer, the programmer would switch to the find and replace facility of the editor (such commands are available in most editors, whether or not they run under a windowing system). The problem then is that for a single conceptual replacement (‘logically replace data element with segment’), the programmer ends up launching four find and replace commands.
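The four separate commands can be sketched in Java as four literal, case-sensitive replacements applied one after another. This is only an illustration of the manual scenario described above; the class name FourPassReplace and the method name replaceAllForms are hypothetical and do not come from any tool.

```java
public class FourPassReplace {

    /**
     * Applies the four literal, case-sensitive replacements of the table,
     * one pass per lexical form of the concept (hypothetical helper).
     */
    public static String replaceAllForms(String source) {
        return source
                .replace("data element", "segment")   // plain text form
                .replace("DATA_ELEMENT", "SEGMENT")   // constant form
                .replace("dataElement", "segment")    // lower camel case form
                .replace("DataElement", "Segment");   // upper camel case form
    }

    public static void main(String[] args) {
        String source = String.join("\n",
                "public class DataElement {",
                "    private String DEFAULT_DATA_ELEMENT_NAME = \"dummyName\";",
                "    private String dataElementName = DEFAULT_DATA_ELEMENT_NAME;",
                "    /** Return the data element name. */",
                "    public String getDataElementName() { return dataElementName; }",
                "}");
        System.out.println(replaceAllForms(source));
    }
}
```

Each call corresponds to one find and replace command that the programmer has to specify explicitly; forgetting any one of them leaves stale references to the old concept in the copied file.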
Classical solutions to this problem include:
the use of the match case option to discriminate between different forms of the concept; while useful, this only partitions the problem and leaves the final decision to the programmer on a case-by-case basis.
when the match case option is not used, some replacement tools (such as Microsoft® Word 2000) are able to infer simple case matching; for example, when asked to replace ‘data element’ with ‘segment’, the tool proposes to replace ‘Data element’ with ‘segment’, then actually replaces it with ‘Segment’; it is noted that the same tool, asked to replace ‘data element’ with ‘segment element’, fails to replace ‘data Element’ with ‘segment Element’ and generates ‘segment element’ instead.
sophisticated regular expression matching; using VI, Emacs or other powerful editors, skilled programmers can implement versatile replacements; however, this calls for sophisticated replacement expressions, which makes it error prone, and it still does not solve the hardest cases.
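The simple case inference described in the second item can be approximated by the following Java sketch. It is an assumption about the general behaviour (a case-insensitive match whose replacement copies only the case of the first matched letter), not the implementation of any particular editor; the class and method names are hypothetical, and both the find and the replacement strings are assumed to be non-empty.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class InitialCaseReplace {

    /**
     * Case-insensitive literal replacement that adapts only the first letter
     * of the replacement to the case of the first letter of each match.
     * Assumes find and replacement are non-empty (hypothetical helper).
     */
    public static String replaceAdaptingInitial(String text, String find, String replacement) {
        Pattern pattern = Pattern.compile(Pattern.quote(find), Pattern.CASE_INSENSITIVE);
        Matcher matcher = pattern.matcher(text);
        StringBuffer result = new StringBuffer();
        while (matcher.find()) {
            String match = matcher.group();
            // Only the initial letter of the match is inspected.
            String adapted = Character.isUpperCase(match.charAt(0))
                    ? Character.toUpperCase(replacement.charAt(0)) + replacement.substring(1)
                    : replacement;
            matcher.appendReplacement(result, Matcher.quoteReplacement(adapted));
        }
        matcher.appendTail(result);
        return result.toString();
    }

    public static void main(String[] args) {
        // prints "Segment"
        System.out.println(replaceAdaptingInitial("Data element", "data element", "segment"));
        // prints "segment element": the capital inside the match is lost
        System.out.println(replaceAdaptingInitial("data Element", "data element", "segment element"));
    }
}
```

Because only the first letter is examined, forms such as ‘data Element’ lose their inner capital, and forms such as DATA_ELEMENT or dataElement are not matched at all by a search for ‘data element’.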
The common limitation of those solutions is that the scenario must be run multiple times to reach the desired outcome, with the programmer explicitly specifying each variation of the concept. Only VI or Emacs could tackle cases like DATA_ELEMENT, but they lack facilities for other cases.
If we enlarge the problem considered above to the more general question of applying regular programming patterns whose instances differ only by a few parameters, numerous approaches exist, starting from templates that can be close to the final code (for example, some source tools provide templates and are able to generate parts of the code), or from more conceptual notations (such as UML). However, these approaches do not follow the same scenario. When the needed pattern exists in the tool or in a library developed over time by the programmer, they are perfectly adequate. When the programmer is exploring new avenues and refining emerging templates, the copy/paste/find and replace scenario is much more natural.
Another way of performing the find and replace phase of the scenario is to use refactoring tools that are aware of the semantics of the programming language considered. For example, the programmer copies the DataElement class into another package, then asks a refactoring tool to rename it to Segment. While this would appropriately (and more accurately than any text-based find and replace tool) change the class name, its constructor name, and even references to the class name in comments if so requested, it would nevertheless fail to match derived forms like dataElement. Moreover, acting at the semantic level is not always desirable or even possible. For example, if the new class is to be created in the same package as the existing one, there is no way to get it to compile before part of the renaming has been performed.
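The limitation regarding derived forms can be illustrated with a rough textual stand-in for an identifier-level rename. The Java sketch below is not a refactoring tool and has no knowledge of language semantics; it merely replaces whole-word occurrences of one identifier, which is enough to show which forms such a rename leaves untouched. The class and method names are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ExactIdentifierRename {

    /**
     * Replaces only whole-word, case-sensitive occurrences of oldName,
     * as a textual approximation of renaming a single identifier
     * (hypothetical helper, not an actual refactoring).
     */
    public static String renameIdentifier(String source, String oldName, String newName) {
        String wholeWord = "\\b" + Pattern.quote(oldName) + "\\b";
        return source.replaceAll(wholeWord, Matcher.quoteReplacement(newName));
    }

    public static void main(String[] args) {
        String source = "public class DataElement { public DataElement() {} "
                + "private String dataElementName; "
                + "public String getDataElementName() { return dataElementName; } }";
        System.out.println(renameIdentifier(source, "DataElement", "Segment"));
    }
}
```

In this sketch the class name and the constructor name become Segment, while dataElementName and getDataElementName are left unchanged, since they are distinct identifiers that merely embed a form of the concept.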
The editor Microsoft® Word 2000, for instance, provides adaptive management of the initial case in replacements. This editor provides partial support for upper case when it occurs in the first position of the word to be found. When performing a find and replace on regular expressions, it is difficult to specify the appropriate upper case replacement letter when it is not present in the searched string; such a find and replace, applied to find ‘data element’ and replace it with ‘segment element’, would work for ‘Data element’ but not for ‘data Element’.
U.S. Pat. No. 5,873,660 of Microsoft Corporation, entitled ‘Morphological search and replace applying to natural language’, teaches finding inflected forms of a word by retrieving sets of word forms having the same root word. This implies that the find and replacement words match parts of speech. This solution for finding the inflected forms of input find and replace words is typical for text written in a natural language, because the inflected forms can be derived from the input find and replace words by applying the grammatical rules of the language. However, the inflected forms of input find and replace words in a text written, for instance, in a programming language cannot be found with this solution, because the inflected forms are different lexico-syntactic forms which are not related to the input find and replace words by any known grammar rules one can refer to.
Furthermore, the prior art solutions for natural languages, such as that of U.S. Pat. No. 5,873,660, are based on dictionaries. The dictionaries store all the inflected word forms for a given word. The size of the database storing the dictionary and the cost of maintaining this database become unrealistic over time. The size of the database and the control of its evolution become even more unrealistic when, as in the case of programming languages, the find and replace function must apply not only to words but also to expressions. In this case the dictionary would have to contain the inflected forms of each expression, that is, of each possible word combination.
There is thus a need for a find and replace editing function performed in one pass and applying to a text written in a programming language, or to other types of text, wherein the inflected forms of the input find or replace words or expressions are words or expressions which do not follow the semantic or grammatical rules of a known natural language. With these languages, the inflected forms of input find or replace words or expressions may be different authorized lexico-syntactic codifications of the same concept.