The present invention relates to apparatus for acquiring and manipulating text and, more particularly, to apparatus for acquiring discrete text strings and automatically formatting those text strings, as they are received, into a preselected data format structure.
There are a number of situations in which a researcher or reader may desire to record and organize printed or displayed text included in an existing work for future reference or for some other future use. As used herein, the term "text" is meant to encompass information that is intended for presentation for human comprehension and may comprise symbols, phrases, sentences in natural or artificial language, pictures, diagrams, and tables. One such situation involves students who, when researching a topic during the preparation of a report or for satisfying a course requirement, encounter numerous reference books on the topic, each of which has an article or text passage that is relevant to the topic of research and each of which would be of value if it were included in the report. Rather than borrowing numerous reference books, or making numerous copies of pages of various reference books, many researchers would prefer to extract the relevant text passages and the bibliographic detail about the reference source in which the text passage was found and enter that information directly into a storage medium in an organized format for later use.
In some cases, because of time constraints, the researcher desires to organize the extracted text immediately and directly into a particular organization format structure, for example an outline format structure where certain text is located in certain positions on a page in relation to other text. For example, the researcher may desire to reproduce a printed phrase and position it as a main topic in an outline structure with a predetermined spacing from the left margin and a predetermined numerical or alphabetical designation preceding it. Additionally, the researcher may desire that the reproduced text be automatically formatted during entry into the format structure with a certain text attribute or attributes, such as underlining, bolding, italicizing, or others.
As used herein, the term "format structure" is meant to refer to an overall structure in which text is organized, including the placement of text in a particular location (for example, its indentation, margins, justification, line spacing, and spacing on a page or in a document), the addition of numerical or alphabetical designations or leading characters to lines, the particular text attributes assigned to text placed in particular locations, the page spacing, and other format characteristics. The term "text attribute" is meant to refer to a characteristic of text, such as its font, underlining, bolding, italicizing, capitalization, or size.
In cases where the amount of time available is limited, researchers do not have the luxury of performing interim steps on text they find before that text is put into an organized format structure. The ability to place the acquired text directly into the desired format structure is highly desired. No time may be available for transcribing, proofreading, correcting, or other similar steps. Additionally, researchers would benefit from having a degree of flexibility in creating the formatted, organized document with the chosen format structure. Researchers would often appreciate the ability to add text attributes to acquired text both automatically and manually, to interlineate comments, and to record details concerning the source material, such as the bibliographic details of the reference source. In many cases, researchers desire to add comments concerning the acquired text so that, at a later time, the researcher's thoughts or ideas concerning that text, its use, or its background are available. It is desirable that the system clearly and automatically identify such comments as comments by attaching some attribute to them, such as by automatically placing a border around them.
Rather than handwriting the desired text into the format structure, which can be a time-consuming and burdensome activity, especially when the desired text is quite lengthy, it would be preferable to acquire the text directly from the source material via optical or other means and enter it directly into the selected format structure and into a computer-compatible storage medium, such as a computer-readable magnetic disk. Even acquiring text by dictating it into a dictation machine for recording on magnetic tape will not allow its direct entry into the format structure. The tape must later be transcribed, at which time errors may be made, such as spelling errors, punctuation errors, and others. Poor articulation by the person dictating or poor recording quality can result in a poor transcription. The additional steps of transcribing and proofreading would be necessary to assure that the desired text was reproduced accurately, and these additional steps require time that may not be available. Furthermore, if the dictated text must be manipulated again later, it must at some time be entered into electronic form to be compatible with a computer, which also exposes the work to errors during keyboard entry. Although the text can be entered directly into a computer during transcription, these errors may still be made.
Yet another conventional method used to acquire desired text is to manually input the text directly into a computer via a keyboard. A particular program, such as a word processing program, may be loaded into the computer and the text "keyed" into the computer and manipulated by the program to yield the desired organization of the text. However, this method suffers the shortcoming that the rate at which text may be input into the computer is dictated, and thus limited, by the speed at which the person types. Typing errors may be made and must later be identified and corrected. Even though many computers are portable and can easily be carried into libraries and other research facilities, these facilities, and particularly libraries, may not permit the noise resulting from the use of a keyboard (or from dictation). Thus, the keyboard method of text acquisition is inefficient and time-consuming, especially when the desired portion of text is quite long, and may not even be allowed in some research facilities.
Word processing programs allow users to manipulate text into almost any conceivable format structure; however, the text must first be put into an electronic form before the computer can manipulate that text. Typing the text with a keyboard has certain disadvantages, as discussed above. Additionally, most word processing programs are general in nature and require significant effort to organize text into a particular format structure. For many researchers, having to spend time learning how to operate a complex word processing program in order to perform a research support task is an undesirable exercise. It would be much more desirable for the researcher to have a text acquisition and organizing system that provides a plurality of preprogrammed format structures, all of which are useful for research purposes, with additional flexibility for manipulating text, and all of which may be actuated with only one or, at most, a few keystrokes.
Although many word processing programs include an outline subroutine for organizing text into an outline format structure, other text format structures are not preprogrammed. In many cases, researchers are not satisfied with only an outline format structure and desire to organize text into other format structures. Furthermore, due to the desire of the creators of word processing programs to remain general and appeal to as wide a customer base as possible, the word processing programs discussed above do not include automatic prompting for the bibliographic details of the reference source, although they typically do include the ability to insert footnotes. Along the same lines, although many word processing programs include the ability to insert "text" boxes or other types of boxes that are set apart from surrounding text by some indicator, such as a partial border, creating such boxes takes significant effort requiring in some cases a selection of the type of border, the exact location of the box, and other box characteristics. For those who are pressed for time, a requirement to make multiple choices just to create a box is highly undesirable. It would be preferable to merely press a single key to create the box in which text is placed.
Obtaining and displaying text from other computers, such as text received from a remote computer via a modulator-demodulator ("MODEM"), can involve the acquisition of much extraneous text, which requires extra steps to sort out. In the case of a word processing program, displayed text must first be selected, such as by "blocking" it with a cursor, then copied to a temporary text buffer, sometimes known as a "clipboard", and then copied again from the clipboard into the desired format structure. The interim step of using the clipboard increases the time required and exposes the text to loss if the correct procedure is not followed exactly.
There exist scanner devices that include optical assemblies that may be scanned across written text of interest to create digital image patterns corresponding to the text. An optical character recognition ("OCR") program may be used to interpret the digital image patterns and convert such patterns into a machine-readable character code such as ASCII to create a file that may be used by a computer to manipulate the text. However, uploading such text into a computer and manipulating such text with a word processing program or the like in order to then organize the text into the desired format structure involves the additional steps described above.
Hence, those skilled in the art have recognized a need for a text acquisition and organizing system that can acquire text directly from a text source to place it into a digital, electrical format and directly enter the text into a selected format structure thus avoiding additional manipulation steps. The need also exists for such a device having a plurality of selectable text format structures into which the acquired text may be automatically formatted as the text is received. Such automatic formatting into a structure should also include the automatic attachment of text attributes as well as the ability of the user to manually apply attributes and enter additional text that is automatically formatted in a desired fashion. A need has also been recognized for such a text acquisition and organizing system to be easy to use with as few required actuation commands as possible. The present invention fulfills these needs and others.