When a user types text into a word processing application program, or another type of page layout application program, the program lays out the typed text line by line and periodically “breaks” the flow of text at appropriate points in order to move to a new line. This line breaking process typically results in one or more paragraphs, each of which is made up of one or more lines whose beginning and end are marked by breakpoints. One common problem in word processing application programs and other page layout programs is determining where in the paragraph each line should be broken.
Most word processing applications break text line by line and do not consider the formatting of adjacent lines. For instance, most word processing applications begin at the first character of a paragraph, format the first line, and find the best line break for that line. Factors that may be taken into account when locating the best break include whether the paragraph is ragged-right or justified, whether compression is permitted on the line, and whether hyphenation is permitted. After locating the best break for the first line, the application formats the second line in a similar manner, starting with the first character after the break of the first line. Each subsequent line is formatted in the same way.
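The line-by-line procedure described above can be sketched as follows. This is only a minimal illustration under simplifying assumptions: text is measured in characters as if monospaced, breaks are allowed only at spaces, and justification, compression, and hyphenation are ignored. The names `greedy_breaks` and `width` are hypothetical and are not taken from any particular word processing application.

```python
def greedy_breaks(words, width):
    """First-fit line breaking: fill each line as far as possible, then move on.

    Each line is decided without looking at the lines that follow it.
    A single word longer than `width` is placed on a line by itself
    and overflows (this sketch does not hyphenate).
    """
    lines = []
    current = []   # words placed on the line being built
    length = 0     # character length of the line being built
    for word in words:
        # Length of the line if this word is appended (one space between words).
        needed = length + len(word) + (1 if current else 0)
        if current and needed > width:
            # The word does not fit: break here and start a new line with it.
            lines.append(" ".join(current))
            current, length = [word], len(word)
        else:
            current.append(word)
            length = needed
    if current:
        lines.append(" ".join(current))
    return lines


if __name__ == "__main__":
    for line in greedy_breaks("aaa bb cc ddddd".split(), 6):
        print(line)
```

With a width of 6, this sketch fills the first line completely ("aaa bb") and leaves "cc" alone on a very short second line, because the first break is fixed before the second line is examined.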
An optimized paragraph layout algorithm was developed for the TeX program by Professor Donald Knuth. The algorithm developed by Knuth considers all possible ways to break a paragraph into lines. In particular, the algorithm calculates a penalty function to evaluate the quality of each way of breaking the paragraph into lines and, based on the results, chooses the way with the lowest total penalty. The approach set forth by Knuth improves the typographic quality of text by distributing white space more uniformly across the lines of justified paragraphs and by improving the appearance of ragged-right paragraphs. In order to achieve these benefits in approximately linear time, rather than by explicitly enumerating every combination of breaks, Knuth's algorithm applies techniques of dynamic programming.
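The whole-paragraph idea described above can be illustrated with a short dynamic-programming sketch. It is not Knuth's algorithm as implemented in TeX: it assumes monospaced text measured in characters, breaks only at spaces, and a deliberately simple penalty (the square of the unused space on every line except the last). The names `optimal_breaks`, `best`, and `next_break` are hypothetical.

```python
def optimal_breaks(words, width):
    """Choose breaks that minimize the total penalty over the whole paragraph.

    Assumed penalty: (unused characters on the line) ** 2, with the last
    line of the paragraph not penalized. Raises ValueError when some word
    cannot fit within `width`, since this sketch has no fallback for that case.
    """
    n = len(words)
    inf = float("inf")
    best = [inf] * (n + 1)      # best[i]: minimum penalty for laying out words[i:]
    next_break = [n] * (n + 1)  # next_break[i]: index where the line starting at i ends
    best[n] = 0.0
    # Solve the subproblems from the end of the paragraph back to the start.
    for i in range(n - 1, -1, -1):
        length = -1  # cancels the space counted before the first word
        for j in range(i + 1, n + 1):
            length += 1 + len(words[j - 1])
            if length > width:
                break  # words[i:j] no longer fits on one line
            penalty = 0 if j == n else (width - length) ** 2
            total = penalty + best[j]
            if total < best[i]:
                best[i], next_break[i] = total, j
    if best[0] == inf:
        raise ValueError("no layout fits within the given width")
    # Follow the recorded breaks to rebuild the chosen lines.
    lines, i = [], 0
    while i < n:
        j = next_break[i]
        lines.append(" ".join(words[i:j]))
        i = j
    return lines


if __name__ == "__main__":
    for line in optimal_breaks("aaa bb cc ddddd".split(), 6):
        print(line)
```

For the words "aaa bb cc ddddd" and a width of 6, the sketch breaks after "aaa" even though "bb" would still fit on the first line. Under the assumed penalty, the layout "aaa" / "bb cc" / "ddddd" costs 9 + 1 = 10, whereas filling the first line completely would leave "cc" alone on the second line at a cost of 0 + 16 = 16.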
Although the algorithm provided by Knuth offers a number of benefits, it is not without its drawbacks. In particular, the Knuth algorithm can only break text for a page having a predefined geometry. The algorithm does not provide for the inclusion of figures that are attached to particular points in the character stream and can be positioned anywhere on the page, changing the geometry of the page during formatting. Because the algorithm provided by Knuth only allows justification (compression or expansion) between words, it does not operate with text in languages that allow justification inside words or in languages that do not have white space between words. Moreover, the Knuth algorithm fails to produce an acceptable formatting result in certain typographically bad cases, usually narrow columns with a small number of justification opportunities. In these cases, the Knuth algorithm simply produces lines that overflow the right margin and informs the user about the error. In the context of a word processing application program, it is unacceptable to provide such a result to an end user.
It is with respect to these considerations and others that the various embodiments of the present invention have been made.