Word processors and typesetting layout programs are used to create, edit, store, and output textual documents on a variety of digital computers. These computers include, but are not limited to, large mainframes connected to terminals, stand-alone desktop or laptop personal computers, and handheld communication or digital devices. A publisher or designer may specify a format for a text, including typeface, font size, lines per page, line length, text margins (left, right, top, or bottom), text density (the ratio of ink to background, e.g., black ink on white), and text and background color. One goal of a word processor or layout program is to compose text in the specified format for a variety of output forms, such as printing, display on a computer screen, or electronic storage or presentation such as on the World Wide Web. A second goal is to make the text legible (quick and easy to read and comprehend), often within the physical constraints of the medium, such as the size of the page or screen or the positioning of other elements displayed on the page, such as pictures or other text, for example different articles in a newspaper or magazine. A third goal is to make text readable (attractive and pleasurable or interesting to read) by positioning the characters of a text to maintain evenness or uniformity in characteristics such as the number of lines per page, line length, text margins (left, right, top, or bottom), and text density (the ratio of ink to white space, including control of line endings and use of hyphenation). A fourth goal is to make text economical, that is, to fit a desired number of lines or pages (e.g., a minimum, maximum, or specified number) in the specified format without sacrificing legibility or readability.
In most word processing or layout applications, the lines per page, line length, and margins are fixed for a particular unit of text, and text density is manipulated to create output that is both legible (comprehensible) and readable (aesthetically appealing). Often there is a trade-off in text density between factors that enhance legibility and factors that enhance readability, such as uniformity. Readability is usually favored by compositors when setting text. For example, text that is too densely or too sparsely positioned within the space available is difficult to read but can appear highly uniform. There can also be trade-offs in text density between different aesthetic factors, such as maintaining uniform word spacing from line to line (but leaving the right margin irregular, or ragged) versus maintaining a uniform (justified) right margin (but leaving the between-word spacing variable from line to line). This trade-off is inherent in written language because, in most languages, text units such as words and sentences have variable lengths even when the space allotted is uniform. Thus, the amount of space left over varies from line to line, and that variable space must be distributed somewhere: either at the ends of lines or within lines. One of the typographic problems faced when positioning text in a word processor or layout program, whether automatically or manually, is how to distribute the text and space such that both legibility and readability are high.
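The two ways of distributing leftover space described above can be illustrated with a minimal Python sketch. It assumes a monospaced font, so that widths are measured in character cells, and the function names ragged_right and justify are hypothetical, chosen only for this illustration rather than taken from any particular program.

```python
def ragged_right(words, width):
    """Keep one space between words; leftover space falls at the line end."""
    line = " ".join(words)
    if len(line) > width:
        raise ValueError("words do not fit in the given width")
    return line


def justify(words, width):
    """Spread leftover space between words so the line fills the width."""
    if len(" ".join(words)) > width:
        raise ValueError("words do not fit in the given width")
    if len(words) < 2:
        # A single word has no gaps to stretch (assumption: leave it as-is).
        return "".join(words)
    gaps = len(words) - 1
    total_space = width - sum(len(w) for w in words)
    # Every gap gets `base` spaces; the leftmost `remainder` gaps get one more.
    base, remainder = divmod(total_space, gaps)
    pieces = []
    for i, word in enumerate(words[:-1]):
        pieces.append(word)
        pieces.append(" " * (base + (1 if i < remainder else 0)))
    pieces.append(words[-1])
    return "".join(pieces)


if __name__ == "__main__":
    sample = ["the", "quick", "fox"]
    print(repr(ragged_right(sample, 16)))
    print(repr(justify(sample, 16)))
```

In this sketch the ragged-right line keeps uniform word spacing and ends short of the margin, while the justified line reaches the margin at the cost of unequal gaps between words.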
One critical factor that affects the readability of text is the method used to determine line endings, because it determines the variation in white space from line to line. A common method used by many word processors is a first-fit approach, also called single-line composition, in which the breakpoints for each line are determined one after the other and no breakpoint is changed once it has been selected. Another common method is a total-fit line-breaking approach, also called multi-line or paragraph composition, developed by Donald Knuth and Michael Plass. This method considers all possible breakpoints in a paragraph and selects the combination of breakpoints with the most globally pleasing result. It does so by determining the badness of each line break, assigning penalties to line breaks that result in spaces that are too large or too small or that have other undesirable characteristics, such as hyphens at the ends of adjacent lines. The method minimizes the sum of squares of the badness to achieve a global, paragraph-wide set of line breaks. This method of optimizing line breaks across multiple lines of a paragraph is used both in free programs, such as TeX and the GNU fmt command-line utility, and in commercial programs, such as Adobe InDesign.
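The difference between the first-fit and total-fit approaches can be seen in a short Python sketch. This is a simplified illustration, not the Knuth and Plass algorithm itself: it assumes a monospaced font, ignores hyphenation and stretchable spaces, and uses the squared trailing slack of each line (except the last) as a stand-in for badness. The function names first_fit, total_fit, and raggedness are hypothetical.

```python
def first_fit(words, width):
    """Greedy breaking: fill each line, never revisit an earlier break."""
    lines, current, length = [], [], 0
    for word in words:
        needed = len(word) if not current else length + 1 + len(word)
        if current and needed > width:
            lines.append(current)
            current, length = [word], len(word)
        else:
            current.append(word)
            length = needed
    if current:
        lines.append(current)
    return lines


def total_fit(words, width):
    """Choose all breaks together by minimizing the summed squared slack."""
    n = len(words)
    best = [float("inf")] * (n + 1)  # best[i]: minimal cost of words[i:]
    nxt = [n] * (n + 1)              # nxt[i]: end of the line starting at i
    best[n] = 0
    for i in range(n - 1, -1, -1):
        length = -1
        for j in range(i + 1, n + 1):
            length += 1 + len(words[j - 1])
            if length > width and j > i + 1:
                break  # line overflows; longer candidates overflow too
            # The last line is not penalized for ending short.
            cost = 0 if j == n else max(width - length, 0) ** 2
            if cost + best[j] < best[i]:
                best[i], nxt[i] = cost + best[j], j
    lines, i = [], 0
    while i < n:
        lines.append(words[i:nxt[i]])
        i = nxt[i]
    return lines


def raggedness(lines, width):
    """Sum of squared trailing slack over every line except the last."""
    return sum((width - len(" ".join(line))) ** 2 for line in lines[:-1])


if __name__ == "__main__":
    words = "aaa bb cc ddddd".split()
    for breaker in (first_fit, total_fit):
        result = breaker(words, 6)
        print(breaker.__name__, result, raggedness(result, 6))
```

With the sample input and a width of 6, the greedy breaker fills the first line completely and is then forced to leave "cc" alone on a line, for a cost of 16, while the paragraph-wide search accepts a shorter first line to obtain a more even result, for a cost of 10.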
U.S. Pat. No. 5,724,498 to Nussbaum provides an improved method for justifying text. Conventional methods justify text uniformly by squeezing or stretching the characters and word spaces, which maintains density within a line but is often undesirable because density varies noticeably between lines, especially adjacent lines. Nussbaum adds random variation in letter width to make these character-width modifications less noticeable, thereby improving the aesthetic appearance of the text.
U.S. Pat. No. 7,069,508 to Bever and Robbart provides a method for optimal spacing for readability and comprehension. Bever and Robbart use a library of key words and punctuation to train a neural network to recognize characteristics that indicate phrase boundaries in text, and they adjust the size of every between-word space according to the likelihood that the space marks the end of a phrase.