1. Field of the Invention
This invention relates to a method and system for reading and displaying a text in a way different from the original form of the text, providing a valuable way of understanding certain characteristics of the text.
2. Description of the Prior Art
Several attempts have been made in the past to create visual or graphical representations of textual works, so that readers can obtain information about the contents of a written text, and thereby gain insight into the text, certain of its features, and its structure, without having to read the words of the text from beginning to end.
U.S. Pat. No. 5,556,282 (Middlebrook) teaches the use of cartography to enable one to obtain some comprehension of a text without reading all of the text, by mapping a graphic language textscape with regard to: typography; graphic or phonetic attributes of selected graphic features; meaning or usage of selected graphic features; statistical analyses of the attributes, meaning, or usage of selected graphic features; or semantic, rhetorical, compositional, thematic, or conceptual configuration. The graphical representation of text within a document is prepared by producing an image of at least some of the text, wherein individual words are indecipherable in the image; identifying at least one common feature contained within the text, such as the physical appearance, phonetics, meaning, usage, definition, location and distribution of text; and segmenting the image into a number of visually distinguishable segments to create a map, wherein each of the visually distinguishable segments corresponds to at least one of the common features in the text. A person viewing the image can thereby comprehend where each common feature occurs within the text without having to read the text.
U.S. Pat. No. 5,713,740 (Middlebrook) also teaches a system for rapidly obtaining information about the contents of a written text without having to read the words of the text by mapping the graphic text image to illustrate, without words, the structure and content of the text with regard to one or more selected features to provide insight into the contents of the text. First, at least one feature within at least a portion of the text is identified, and then at least one representation of that portion of the text is created, wherein the representation of the text does not include any readable words but does include a graphical indication that indicates the presence of said at least one feature at at least one location.
U.S. Pat. No. 5,930,809 (Middlebrook) teaches a method for manipulating text retrieved by a computer so as to allow the user to rapidly obtain information about the text's contents without reading the text. If the text is too large to display on a single screen, a map box is generated on the computer screen and is displayed along with a portion of the retrieved text. Within the map box is displayed a representation of the entire body of text, and a user can use a screen icon to point to any place in the representation of the body of text, which portion is then displayed on the computer screen. The representation of the text in the map box can be mapped in different ways to help inform the user as to the contents of the text prior to it being read.
The Middlebrook patents describe a non-readable, graphical representation of the shape of text portions. This contrasts with Salton et al., Automatic Analysis, Theme Generation, and Summarization of Machine-Readable Texts, Science, New Series, Volume 264, Issue 5164 (Jun. 3, 1994), pp. 1421–26, which describes approaches for manipulating and accessing texts in arbitrary subject areas in accordance with the user's needs, such as by automatically determining text themes, traversing texts selectively and extracting summary statements that reflect text content. To show results, Salton et al. use an elliptical display that is merely an outer shell that makes links among the nodes easier for a user to see; the interior space of the ellipse is not used for content. Salton et al. could just as easily, and perhaps more readably, have displayed a vertical list of the texts involved, with curved links looping out away from the text and joining related vertices.
The scope of the display of Salton et al. is large text collections, and its purpose is to show how texts or portions of texts are similar. The technology relies on sophisticated statistical analysis, including complex mathematical or scaling procedures, and involves the creation of weighted term vectors used to express the similarity of all pairs of texts. It also sets an arbitrary lower limit on the similarity measure, below which a link between two texts is not displayed. The intent of Salton et al. is to simplify and screen out most of the text; as a result, the display shows barely any of the content of the text and relies on automatic techniques to decide what is important.
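By way of illustration only, the kind of pairwise comparison attributed above to Salton et al. can be sketched as follows. This is a minimal sketch and not the procedure of Salton et al.: the use of raw term counts as weights, the cosine measure, the function names and the threshold value of 0.5 are all assumptions made for the example.

```python
import math
from collections import Counter
from itertools import combinations


def term_vector(text):
    """Build a term vector; raw term counts stand in for weights (assumption)."""
    return Counter(text.lower().split())


def cosine(u, v):
    """Cosine similarity between two term vectors (assumed similarity measure)."""
    dot = sum(u[t] * v[t] for t in u.keys() & v.keys())
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def similarity_links(texts, threshold=0.5):
    """Return (i, j, similarity) for every pair of texts at or above the threshold.

    The threshold plays the role of the arbitrary lower limit on displayed links.
    """
    vectors = [term_vector(t) for t in texts]
    links = []
    for i, j in combinations(range(len(texts)), 2):
        s = cosine(vectors[i], vectors[j])
        if s >= threshold:
            links.append((i, j, s))
    return links


links = similarity_links(["red apple pie", "red apple tart", "blue sky"])
```

In this sketch only the first two texts share terms, so only one link survives the threshold. Pairs below the limit leave no trace in the result, which corresponds to the screening-out behavior noted above.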
U.S. Pat. No. 5,793,369 (Atkins et al.) teaches a method for displaying lines of computer source code in a reduced, reshaped or colored manner to determine information about the computer code's structure, use, age, authors or other details. This contrasts with the invention described herein in that the present invention uses a circular layout for the text around the page, and words in the present invention are displayed individually rather than in the text lines in which they originally occurred. It also contrasts in that the primary method of displaying the new information in Atkins et al. is by changing visual attributes in place (without moving lines), whereas the primary method of displaying new information in the present invention is by arranging the positions of the words.
There exist techniques that provide new information about texts by arranging the positions of the words. One such technique is called Multidimensional Scaling, and is described in the book “Modern Multidimensional Scaling, Theory and Applications”, by Ingwer Borg and Patrick Groenen (Springer, ISBN 0-387-94845-7, Library of Congress BF39.2.M85B67 1997). Another such technique is called Self-Organizing Maps, described in the book “Self-Organizing Maps”, by Teuvo Kohonen (Springer, ISBN 3-540-67921-9, Library of Congress). The present invention differs from the above techniques in its particular method of placing words by averaging. The averaging technique used in the present invention is considerably easier to calculate and apply, and has the important advantage of being much more easily understood by lay people (using a rubberband analogy, detailed below).
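By way of illustration only, placement by averaging can be sketched as follows. This is a minimal sketch under stated assumptions and is not the method as claimed: it assumes that the words of the text are laid out in reading order at equal spacing around a unit circle, and that each distinct word is then placed at the arithmetic mean of the positions of its occurrences. The function name place_words and the whitespace tokenization are hypothetical.

```python
import math
from collections import defaultdict


def place_words(text):
    """Place each distinct word at the mean position of its occurrences.

    Assumption: occurrences are spaced evenly, in reading order, around a
    unit circle centered at the origin.
    """
    tokens = text.lower().split()
    n = len(tokens)
    # For each word: running sum of x, running sum of y, occurrence count.
    sums = defaultdict(lambda: [0.0, 0.0, 0])
    for i, token in enumerate(tokens):
        angle = 2.0 * math.pi * i / n
        entry = sums[token]
        entry[0] += math.cos(angle)
        entry[1] += math.sin(angle)
        entry[2] += 1
    return {word: (sx / count, sy / count)
            for word, (sx, sy, count) in sums.items()}


positions = place_words("the cat the dog")
```

Under these assumptions, a word that occurs once remains on the circle at its place in the text, while a word that occurs at opposite points of the circle, such as "the" in the example, is drawn to the center. One way to picture this is as equal rubber bands pulling the word toward each of its occurrences, so that it comes to rest at their average position.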