1. Field of the Invention
This invention relates to the presentation of electronic documents and more particularly to the protection of electronic text.
2. Description of the Related Art
Documents are increasingly being represented as digital bits of data and stored in electronic databases as electronic documents. These documents often appear as electronic versions of articles, newspapers, magazines, journals, encyclopedias, books, and other printed materials. Such electronic documents are typically comprised of miscellaneous strings of characters, words, sentences, paragraphs, or documents of indeterminate or varied lengths and may include a wide variety of data classifications, such as alphanumerics, symbols, graphics, images, pictures, audio or bit sequences of any sort and combination.
Electronic documents are easily available and accessible by electronic devices and may further be republished by electronic devices with astonishing ease and expediency. Suitable electronic devices include, for example, computers, personal digital assistants, cell phones and other devices having processors, memory and display capability. These electronic devices may access the electronic documents over the Internet with a browser by downloading them onto a hard drive or other memory media. Alternatively, the electronic devices may access electronic documents that have been stored on memory media, such as CD-ROM, by downloading them from the memory media. Typically, a computer may be used to display the document on a monitor.
Modern word processing and text editing programs useful for displaying and editing such documents employ structured document architecture to provide greater control and flexibility in the displayed and printed appearance of documents prepared with the programs. Structured document architecture is described, for example, in U.S. Pat. No. 5,285,526, which is fully incorporated herein by reference. A structured document can be prepared in accordance with the Standard Generalized Markup Language (SGML), as described by the International Organization for Standardization in its Standard 8879-1986. A data stream of text marked up in accordance with SGML has its text divided into elements, each consisting of a begin tag, then its content, and then, when necessary, an end tag to terminate it. Within a WYSIWYG (what you see is what you get) editor, text is displayed to the user as it will appear when it is printed, even though its structure is defined by the begin tags and end tags for each element of text. Formatting of the elements within a structured document is done when the document is displayed to the user. Commonly used elements include, for example, paragraphs, simple lists, ordered lists, bulleted lists, and list items.
FIG. 1 is a schematic diagram of a structured document with element tags and associated text. The document elements 28-38 are organized in a formatted text stream having an ordered sequence, using structured document notation, where the ordered sequence is specified by a corresponding ordered sequence of the plurality of element tags. Each element shown in FIG. 1 is a structured document element having a begin tag and an end tag. For example, the first paragraph 28 includes a begin tag [p], the text of the paragraph, and then an end tag [/p]. The begin tag and the end tag serve to identify the element type, in this case indicating a paragraph. The position at which the first paragraph 28 appears in the WYSIWYG display is determined by the position in the formatted text stream of the structured document element representing the first paragraph 28, relative to the other structured document elements such as, for example, the list 30 and the second paragraph 38.
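The relationship between tag order and element order can be illustrated with a minimal sketch. It assumes a bracketed notation in which only [p] and [/p] are taken from the description above; the [ol] and [li] tag names, the sample wording, and the parse_stream function are hypothetical and chosen only for illustration.

```python
import re

# Hypothetical formatted text stream. Only [p] and [/p] appear in the
# description; the [ol]/[li] tag names and the sample text are assumptions.
STREAM = (
    "[p]First paragraph.[/p]"
    "[ol][li]One[/li][li]Two[/li][li]Three[/li][/ol]"
    "[p]Second paragraph.[/p]"
)

# A token is either a begin/end tag such as [p] or [/p], or a run of text.
TOKEN = re.compile(r"\[(/?)([a-z]+)\]|([^\[]+)")


def parse_stream(stream):
    """Return the top-level elements of the stream, in stream order."""
    root = {"tag": "doc", "text": "", "children": []}
    stack = [root]  # innermost open element is last
    for match in TOKEN.finditer(stream):
        closing, name, text = match.group(1), match.group(2), match.group(3)
        if text is not None:
            # Plain text belongs to the innermost open element.
            stack[-1]["text"] += text
        elif closing:
            # An end tag must match the innermost open begin tag.
            if len(stack) == 1 or stack[-1]["tag"] != name:
                raise ValueError(f"unexpected end tag [/{name}]")
            stack.pop()
        else:
            # A begin tag opens a new element nested in the current one.
            node = {"tag": name, "text": "", "children": []}
            stack[-1]["children"].append(node)
            stack.append(node)
    if len(stack) != 1:
        raise ValueError(f"missing end tag for [{stack[-1]['tag']}]")
    return root["children"]


elements = parse_stream(STREAM)
for element in elements:
    print(element["tag"], len(element["children"]), repr(element["text"]))
```

In this sketch the display order of the paragraph, the list, and the second paragraph follows directly from the order in which their begin tags occur in the stream.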
FIG. 2 is a schematic diagram of a memory organization of element tags and associated text. The order of occurrence of the structured document elements such as the first paragraph 28, the list 30 and the second paragraph 38, can be determined by their order of occurrence in the formatted text stream that is stored in a computer memory 22. FIG. 2 shows that the structured document text, with its tags of FIG. 1, has been stored in the memory 22 in a linear sequential order, which is the formatted text stream 25. Paragraph element 28 includes tags 28a and 28b and text 28c. List element 30 includes tags 30a and 30b. List item element 32 includes tags 32a and 32b and text 32c. List item element 34 includes tags 34a and 34b and text 34c. List item element 36 includes tags 36a and 36b and text 36c. Paragraph element 38 includes tags 38a and 38b and text 38c. 
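The linear storage described for FIG. 2 can be sketched as a flat byte buffer in which each element is located by the offsets of its begin tag and end tag. This is an illustrative sketch only: the buffer contents, the [ol] and [li] tag names, and the index_elements and content helpers are assumptions, not part of the described system.

```python
import re

# Hypothetical contents of the formatted text stream held in memory as one
# linear byte sequence. Tag names other than [p] are assumed.
BUFFER = b"[p]First[/p][ol][li]A[/li][li]B[/li][li]C[/li][/ol][p]Second[/p]"

TAG = re.compile(rb"\[(/?)([a-z]+)\]")


def index_elements(buf):
    """Return one record per element, in begin-tag order, with tag offsets."""
    records = []
    open_stack = []
    for match in TAG.finditer(buf):
        name = match.group(2).decode("ascii")
        if match.group(1):
            # End tag: close the most recently opened element.
            if not open_stack or open_stack[-1]["type"] != name:
                raise ValueError(f"unexpected end tag at offset {match.start()}")
            open_stack.pop()["end_tag"] = match.span()
        else:
            # Begin tag: record where the element starts in the buffer.
            record = {"type": name, "begin_tag": match.span(), "end_tag": None}
            records.append(record)
            open_stack.append(record)
    if open_stack:
        raise ValueError(f"missing end tag for [{open_stack[-1]['type']}]")
    return records


def content(buf, record):
    """Bytes stored between an element's begin tag and its end tag."""
    return buf[record["begin_tag"][1]:record["end_tag"][0]]


records = index_elements(BUFFER)
for record in records:
    print(record["type"], record["begin_tag"], record["end_tag"])
```

With this sample buffer the six records correspond, in order, to the two paragraphs, the list, and its three list items; for a list element, the bytes between its tags are the nested list item elements rather than text of its own.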
Authors and publishers place considerable proprietary value on the textual passages that they generate (e.g., newspaper and magazine articles). However, the ease with which textual passages can be duplicated in electronic storage media presents a problem in that such passages can be copied and/or incorporated into larger documents without proper attribution or remuneration to the original author. This duplication can occur either without modification to the original passage or with only minor revisions such that original authorship cannot reasonably be disputed.
Publishers have a compelling need to ensure that all manuscripts that have been submitted to them for publication are, in their entirety, original works of authorship. Alternatively, for those manuscripts that are not original works of authorship in their entirety, the publishers have a compelling need to ensure that all copied portions are properly cited as to their original source and, when necessary, that the owner of each copied portion has granted permission for the inclusion of the copied portion in a new publication.
Furthermore, authors and researchers often conduct research to obtain a large quantity of information gathered from other sources, such as documents, books and articles. Often the quantity of the gathered information becomes so large that the author-researcher loses track of the source attribution for some of it. The published work may then include portions that are not properly cited to an original work, resulting in an embarrassing accusation of plagiarism. Even though the plagiarism may have been inadvertent, such accusations may still cause extensive damage through embarrassment, damage to reputation, loss of scholarly credit and financial detriment.
Thus, there is a need for methods and systems to properly protect original works from being copied in a manner not acceptable to the owner and to protect publishers from accepting manuscripts that fail to provide proper attribution to the original authors or fail to provide the required consent to publish from the original owners. Additionally, there is a need to provide bibliographic systems and methods that help protect authors from committing inadvertent plagiarism.