1. Technical Field
The disclosed invention relates broadly to computer methods and, more particularly, to a computer method for choosing hyphenation points in multilingual text.
2. Background Art
Mechanized hyphenation is necessary for computerized word processing and printing applications. It has been attempted using stored-dictionary systems, rule-based systems, and statistical systems. Dictionary-based systems store the hyphenation for each dictionary entry. Rule-based systems use rules that may apply to more than one word. A rule may be non-specific and apply to any word, or it may be associated with specific words to provide their hyphenation points. Finally, statistical systems build tables of hyphenation statistics from a collection of words and apply these statistics to determine the hyphenation of other words. Some of these statistical techniques insert numbers within the word to be hyphenated, each number indicating the confidence with which the word can be hyphenated at that particular point.
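The three families of techniques described above can be illustrated with a minimal sketch. It is not taken from any of the systems cited in this section. The dictionary entries, the vowel-consonant-consonant-vowel rule, the letter-pair scores, and all function names are hypothetical and were chosen only to make the distinction concrete.

```python
# Illustrative sketch only: all data and names below are hypothetical.

# Dictionary-based: the hyphenation is stored with each entry.
DICTIONARY = {
    "computer": "com-put-er",
    "method": "meth-od",
}

VOWELS = set("aeiouy")

# Statistical: hypothetical confidence scores (1-9) for breaking between
# two adjacent letters, as if derived from a collection of hyphenated words.
PAIR_SCORES = {("n", "t"): 7, ("m", "b"): 6, ("s", "t"): 2}


def hyphenate_by_dictionary(word):
    """Return the stored hyphenation, or None if the word is not listed."""
    return DICTIONARY.get(word.lower())


def hyphenate_by_rule(word):
    """Toy non-specific rule: break between the two consonants of a
    vowel-consonant-consonant-vowel sequence."""
    w = word.lower()
    out = []
    for i, ch in enumerate(word):
        out.append(ch)
        if (i >= 1 and i + 2 < len(w)
                and w[i - 1] in VOWELS and w[i] not in VOWELS
                and w[i + 1] not in VOWELS and w[i + 2] in VOWELS):
            out.append("-")
    return "".join(out)


def annotate_with_confidence(word):
    """Insert a confidence digit at each position that has a known score."""
    w = word.lower()
    out = []
    for i, ch in enumerate(word):
        if i >= 1:
            score = PAIR_SCORES.get((w[i - 1], w[i]), 0)
            if score:
                out.append(str(score))
        out.append(ch)
    return "".join(out)


def hyphenate(word):
    """Combined approach: prefer the dictionary, fall back to the rule."""
    stored = hyphenate_by_dictionary(word)
    return stored if stored is not None else hyphenate_by_rule(word)
```

In this sketch a listed word such as "computer" is resolved from the dictionary, an unlisted word such as "winter" falls through to the rule, and the statistical annotation marks the candidate break in "winter" with its confidence digit instead of committing to a hyphen.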
In IBM Technical Disclosure Bulletin, Vol. 26, No. 11, pp. 6108-6109 and pp. 6095-6096, Carlgren describes a system that combines several techniques. Carlgren elaborates on the combination of dictionary-based hyphenation and rule-based hyphenation in U.S. Pat. No. 4,574,363. Rosenbaum describes dictionary-based methods of hyphenation in U.S. Pat. Nos. 4,028,677 and 4,092,729. Herzik also describes a dictionary-based method for hyphenating multilingual text in U.S. Pat. No. 4,456,969. Zamora describes a rule-based method for hyphenating text (IBM patent application Ser. No. 344,344, entitled "Computer Method for Executing Transformation Rules").
Another prior art technique, described by Moore in the IBM Technical Disclosure Bulletin, Vol. 29, No. 1, pp. 383-384, combines manual selection of hyphenation points with dictionary hyphenation.