Computer objects such as text or images are very often cut or copied from one document, e.g., a web page, and pasted into another document, e.g., a Lotus WordPro document (Lotus and WordPro are trademarks of International Business Machines Corporation). Different types of objects, such as text portions, images, or audio clips, can be copied by a user from multiple source documents and pasted into an object document. It is common practice today for many people to compose documents that include portions “imported”, i.e., copied and pasted, from other documents, e.g., from web pages accessed through the Internet.
Most modern word processing application programs allow a user to copy blocks of text from different documents and to transfer them to another document. Copying an item such as a block of text from a first document into a second document is generally referred to as a “copy and paste operation”. When an item is copied from a source document, it is generally stored in a temporary buffer called a clipboard. This allows the user to later paste the item into the desired object document at the desired location. The action of transferring the copied item to a chosen location in the object document is referred to as “paste”.
Authors and publishers place considerable proprietary value in their creations and, in particular, in the textual passages they generate, e.g., in newspaper and magazine articles. Unfortunately, the ease with which textual passages can be duplicated in electronic storage media means that such passages can be copied and/or incorporated into other electronic documents without proper attribution to, or remuneration of, the original author. Such copying may occur either without modification of the original passage or with revisions so minor that the original authorship cannot reasonably be disputed. Furthermore, authors and researchers often need to locate the sources of passages cited in documents, but frequently do not know the title, author, date of publication, or other identifying features of the original work. As a consequence, unless the user has an exact quotation, it can be very difficult to find the source of the passage in order to give proper recognition to the original author.
When objects such as text portions are copied from one or several source documents into an object document, source information is the information required to identify the source document from which each of said text portions has been copied. Source information may include, for example, the address where the document can be found, copyright information, authorship information, references to contractual terms and conditions, citations, and footnotes. When portions of documents are copied through networks such as the Internet, source information may include, for instance, the Uniform Resource Locator (URL) of the web page from which a text portion has been copied.
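By way of illustration only, the kinds of source information listed above could be grouped into a single record attached to each copied text portion. The following is a minimal sketch under stated assumptions: the names `SourceInfo`, `CopiedPortion`, and `sources_of`, the field names, and the example URLs are all hypothetical and are not taken from any existing system.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class SourceInfo:
    """Hypothetical record identifying where a copied portion came from."""
    address: str                            # e.g., the URL of the source web page
    author: Optional[str] = None            # authorship information
    copyright_notice: Optional[str] = None  # copyright information
    terms_reference: Optional[str] = None   # reference to terms and conditions
    citations: Tuple[str, ...] = ()         # citations or footnotes


@dataclass
class CopiedPortion:
    """A text portion together with the source it was copied from."""
    text: str
    source: SourceInfo


def sources_of(portions: List[CopiedPortion]) -> List[str]:
    """Return the distinct source addresses, in first-seen order."""
    seen: List[str] = []
    for portion in portions:
        if portion.source.address not in seen:
            seen.append(portion.source.address)
    return seen


# Illustrative data only; the URLs are placeholders.
page_a = SourceInfo(address="https://example.com/a", author="A. Author")
page_b = SourceInfo(address="https://example.com/b")
portions = [
    CopiedPortion("first passage", page_a),
    CopiedPortion("second passage", page_b),
    CopiedPortion("third passage", page_a),
]
print(sources_of(portions))
```

In this sketch the record is immutable, so that the same source information can safely be shared by several portions copied from the same source document.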
In the prior art, several systems and methods exist for providing source information for an object copied from a first document and inserted into a second document. For example, U.S. patent application Ser. No. 10/165,083, by Keohane et al., discloses a method, apparatus, and computer instructions for automatically generating source information for an object that is cut or copied from a document and inserted into another document. The source information can be stored, hidden, or pasted into the destination document, and can also automatically trigger the generation of a footnote for the destination document.
An important limitation not solved by Keohane et al., nor by the other known methods for providing source information for copied textual objects, lies in the lack of persistency of the source information. Lack of persistency means the following: if an object, e.g., a portion of text, copied by a user from a source document into an object document is itself edited by the user in the object document, e.g., a part of the copied text is modified, or a sub-portion is cut and pasted by the user into a different paragraph of the object document, then the source information associated with the copied portion, and with the sub-portions generated from it, is lost.
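The loss of source information described above can be illustrated with a deliberately naive toy model. This sketch does not reproduce the method of Keohane et al. or of any real word processor; it only assumes, for illustration, that source information is recorded once at paste time and keyed by the exact pasted string. The class `NaiveDocument`, its methods, and the example URL are hypothetical.

```python
class NaiveDocument:
    """Toy document that records source information only at paste time."""

    def __init__(self):
        self.text = ""
        # Assumption: annotations are keyed by the exact pasted string.
        self.annotations = {}

    def paste(self, position, item, source=None):
        """Insert item at position; remember its source if one is given."""
        self.text = self.text[:position] + item + self.text[position:]
        if source is not None:
            self.annotations[item] = source

    def cut(self, start, end):
        """Remove and return text[start:end]; no source information follows it."""
        fragment = self.text[start:end]
        self.text = self.text[:start] + self.text[end:]
        return fragment

    def source_of(self, fragment):
        """Return the recorded source of fragment, or None if unknown."""
        return self.annotations.get(fragment)


doc = NaiveDocument()
doc.paste(0, "My own introduction. ")

# Paste a passage copied from a (placeholder) web page.
quote = "The quick brown fox jumps."
doc.paste(len(doc.text), quote, source="https://example.com/fox")
print(doc.source_of(quote))        # source is known right after the paste

# Edit the object document: move a sub-portion of the copied passage.
start = doc.text.index("brown fox")
fragment = doc.cut(start, start + len("brown fox"))
doc.paste(0, fragment)

print(doc.source_of(fragment))     # None: the moved sub-portion has no source
print(doc.source_of("The quick  jumps."))  # None: neither has the remainder
```

After the edit, neither the moved sub-portion nor the remainder of the copied passage can be traced back to its source document, which is the limitation at issue.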
The traceability and persistency of copied objects are important issues for intellectual property protection and copyright enforcement. As is widely established by copyright laws in most countries, material paraphrased or summarized from other sources should be clearly indicated as such, clearly distinguished from the author's own statements, and credited to the original source.
Moreover, not merely to enforce copyright protection, but also for the purposes of authoring, documenting, and referencing edited materials, when the copy and paste process is used while editing a document, it would be very useful to automatically create a link, or hyperlink, from each textual portion copied into an object document to the source document from which said textual portion has been copied. Furthermore, it would be necessary to automatically associate links, or hyperlinks, to the source information not only with the copied textual portions, but also with all textual sub-portions or textual fragments that could be generated therefrom when editing the object document.
Therefore, there is a need to provide a method and systems for identifying imported textual objects that have been copied from source documents or generated by editing textual objects already copied from source documents. There is also a need to provide a method and systems for referencing and accessing, from imported textual objects that have been copied from different documents or that originate from editing text already copied from different documents, the source documents from which they have been copied.