1. Technical Field
The invention disclosed broadly relates to computer methods and more particularly relates to a computer method for interpreting rules that transform an input string into an output string.
2. Background Art
A "rewrite mechanism" is a process that applies transformations, or rewrite rules, to an input string to transform it into a new string. Although the idea seems very simple on the surface, complications arise that make it necessary to define how ambiguities are resolved and to guarantee termination. These problems may be appreciated through some examples. Suppose that we have the rewrite rules
1) a → x
2) b → y.
The rules are interpreted to mean that every time the character on the left side of the arrow occurs in a string which is being transformed, it is replaced by the character on the right side of the arrow. These rules will convert the string "abba" into "xyyx" by simple substitution. However, if we add an additional rule which replaces "ab" with "z,"
3) ab → z,
then we have the following ambiguity: if rule 1 applies before rule 3, "abba" gets transformed into "xyyx," but if rule 3 has a higher priority than rule 1, then "abba" becomes "zyx."
An additional problem that is encountered in rewrite rule systems is determining whether a set of transformation rules will terminate when executed. In other words, are the rules defined in such a way that the process will terminate? The next example illustrates the problem. If we instead define rule 3 to transform "yy" into "z"
3) yy → z,
the string "abba" is first transformed into the string "xyyx" through the application of rules 1 and 2, then this string is converted into "xzx" since its middle two characters are "yy." This repetitive application of the rules is called recursion. A set of non-terminating rules can be created by defining rule 3 as
3) xy → ab.
The string "abba" is first transformed into "xyyx" by rules 1 and 2, but rule 3 then transforms it into "abyx." Rules 1 and 2 apply again and "abyx" is transformed back into "xyyx." The process repeats indefinitely without terminating.
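The three examples above can be reproduced with a short sketch. It is only an illustration under stated assumptions, not the mechanism of the invention: the function names rewrite_once and rewrite and the parameter max_passes are hypothetical, priority is assumed to be the order of the rules in the list, each pass is assumed to scan the string from left to right, and passes are assumed to repeat until the string stops changing. A repeated intermediate string is treated as evidence of non-termination.

```python
def rewrite_once(text, rules):
    """One left-to-right pass; at each position the first matching rule wins.

    `rules` is a list of (left, right) pairs; earlier pairs have higher
    priority (an assumption made for this sketch).
    """
    out = []
    i = 0
    while i < len(text):
        for left, right in rules:
            if text.startswith(left, i):
                out.append(right)
                i += len(left)
                break
        else:
            # No rule matched at this position: copy the character unchanged.
            out.append(text[i])
            i += 1
    return "".join(out)


def rewrite(text, rules, max_passes=100):
    """Repeat passes until the string stops changing.

    Raises RuntimeError if an intermediate string repeats (a cycle) or if
    the hypothetical pass limit `max_passes` is reached.
    """
    if any(left == "" for left, _ in rules):
        raise ValueError("left-hand sides must be non-empty")
    seen = {text}
    for _ in range(max_passes):
        new_text = rewrite_once(text, rules)
        if new_text == text:
            return text
        if new_text in seen:
            raise RuntimeError("rules do not terminate: %r repeats" % new_text)
        seen.add(new_text)
        text = new_text
    raise RuntimeError("pass limit reached without a stable result")


base = [("a", "x"), ("b", "y")]                # rules 1 and 2
print(rewrite("abba", base))                   # xyyx
print(rewrite("abba", base + [("ab", "z")]))   # rule 1 before rule 3: xyyx
print(rewrite("abba", [("ab", "z")] + base))   # rule 3 before rule 1: zyx
print(rewrite("abba", base + [("yy", "z")]))   # recursion: xyyx, then xzx
try:
    rewrite("abba", base + [("xy", "ab")])     # xyyx -> abyx -> xyyx -> ...
except RuntimeError as err:
    print(err)
```

Moving the pair ("ab", "z") from the end of the list to the front is all that changes the result from "xyyx" to "zyx," which is the priority ambiguity described above. The cycle check catches the "xy" → "ab" rule on the third pass, when "xyyx" appears for the second time.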
It is the purpose of this invention to define a rewrite mechanism which solves the problems of rule ambiguity, priority and termination. It is a further object of the invention to define a way of organizing rules that results in efficient execution for a wide variety of applications.
3. Prior Art
M. E. Lesk, et al., "LEX--a Lexical Analyzer Generator," Comput. Sci. Tech. Rep., 39, Bell Laboratories, Murray Hill, N.J. October 1975. This article describes a pattern matching program where the patterns are associated with program statements which perform actions when the patterns are matched. LEX is not a complete language; it operates in association with a host compiler.
J. A. Manas, "Word Division in Spanish," Communications of the ACM, Vol. 30, No. 7, pp. 612-616, July 1987. This article describes the application of LEX as a rewrite mechanism to hyphenate Spanish text.
B. Brodda, et al., "An Experiment with Automatic Morphological Analysis of Finnish," Department of Linguistics, University of Helsinki, Publ. No. 7, 1981. This article describes the "BETA system," a finite-state automaton which has string replacement capability. The system has a queueing mechanism to resolve multiple rules that could apply to a specific input string. The format of the rules is table-oriented and there is no provision to prevent infinite loops.
J. P. Hayes, "Computer Architecture and Organization," McGraw-Hill Book Co., New York, 1978, pp. 4-6. This book for new students of computer science describes the principles of a "Turing machine" defined in 1936 by Alan Turing. This machine is basically a finite-state automaton attached to a tape, and it can be used to carry out any computation that can be expressed as an algorithm, including the application of rewrite rules.
R. E. Griswold, et al., "The SNOBOL 4 Programming Language," Prentice-Hall, Inc., Englewood Cliffs, N.J. 1970. SNOBOL 4 is a programming language that offers many string matching capabilities which are relevant to rewrite mechanisms. The strategy for matching strings is of interest because the generality of SNOBOL precludes efficient matching. Although SNOBOL is a very powerful computer language, it is notorious for executing slowly.