When reviewing documents, especially long ones, readers tend to print the document and mark it up. In doing so, the only mechanism they have for adding comments is to write them on the paper. Often, they have views about the material that they do not write down because those views do not fit well on the document. These views could be attached to the document if paper were able to capture the spoken word, but it cannot. The handling of documents is also complicated by the multimedia and distributed nature of documents in our environment. Multimedia documents are documents that contain or are associated with different types of content or media. For example, a document may consist of word-processing content, video annotations, or other types of media. These media, although associated with the document, are not necessarily embedded in it and can be found in multiple locations throughout the environment.
In addition to the above, paper is ubiquitous. It is used, for example, in areas as diverse as offices, lecture halls, cars, and dining tables. Paper has been the subject of research in ubiquitous computing interface technologies, both in terms of enhancement and replacement. Some researchers have augmented paper with barcodes, radio frequency identification (RFID) tags, and other technologies to better identify a document or product and its metadata and to enable the transfer of that information from paper to digital systems. Others are developing paper-like devices, e.g., eInk, which is a paper-like digital display technology, and enhancing more traditional electronic devices with paper-like interaction mechanisms, e.g., pen-based tablets, in an attempt to diminish our reliance on paper, which does not intrinsically communicate with the surrounding computing infrastructure.
Various systems have been developed for handling and annotating documents, including multimedia documents. For example, U.S. Pat. No. 5,243,149 for METHOD AND APPARATUS FOR IMPROVING THE PAPER INTERFACE TO COMPUTING discloses a unified system comprising a digitized pen/paper interface and a voice recorder with which a single user can create annotations (including voice). An article entitled “The Audio Notebook” by Lisa Stifelman, Barry Arons and Chris Schmandt of MIT Media Laboratory, published in the Proceedings of the SIGCHI conference on Human factors in computing systems, p. 182-189, March 2001, Seattle, Wash., United States, describes a system comparable to that of U.S. Pat. No. 5,243,149, with emphasis on audio indexing of a lecture using hand-written notes. U.S. Pat. No. 6,027,026 for DIGITAL AUDIO RECORDING WITH COORDINATED HANDWRITTEN NOTES and U.S. Pat. No. 6,590,837 for APPARATUS AND METHOD FOR ANNOTATING AN OBJECT WITH AN AUDIO MESSAGE disclose apparatuses that associate a voice recording with a particular document. Each apparatus includes a recorder for starting and stopping recordings, files the recordings, and associates the recordings with a document via a barcode that uniquely identifies the document. Other systems, such as tablet software with pen-based input from various vendors, provide purely electronic annotation, as does the Adobe PDF PC application with voice annotation. The paper entitled “Smart-its Friends: A Technique for Users to Easily Establish Connections between Smart Artifacts,” published in the Proceedings of the Ubicomp Ubiquitous Computing Conference, p. 116-122, September 2001, Atlanta, Ga., United States, provides a hardware-assisted pairing mechanism that can be used to create a system out of otherwise unconnected devices.
U.S. Pat. No. 5,243,149 (assigned to International Business Machines Corp.) discloses an apparatus and system for associating annotations, both ink and voice, with document pages. A page can be scanned, using a detachable scanner, and stored in a content file. A control file is associated with the content file. Ink strokes are stored, and voice recordings can be made using an explicit start. The entire voice stream is also stored in a file. The control file is augmented with links to the annotation files. All files can be updated for post-processing, for example, to digitally display the document with its annotations, to process the annotations, or to add to the content.
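By way of illustration only, the relationship between the content file, the control file, and the annotation files described above might be modeled as in the following minimal sketch. The class names, field names, and file names are hypothetical and are not taken from U.S. Pat. No. 5,243,149; the sketch assumes only that a control file holds a reference to one content file and a growing list of links to ink and voice annotation files.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class AnnotationLink:
    """One link from the control file to an annotation file (hypothetical layout)."""
    kind: str  # "ink" or "voice"
    path: str  # location of the annotation file


@dataclass
class ControlFile:
    """Control record associated with one scanned page (the content file)."""
    content_path: str
    links: List[AnnotationLink] = field(default_factory=list)

    def add_annotation(self, kind: str, path: str) -> None:
        # Augment the control file with a link to a new annotation file.
        if kind not in ("ink", "voice"):
            raise ValueError(f"unsupported annotation kind: {kind}")
        self.links.append(AnnotationLink(kind, path))

    def annotations_of(self, kind: str) -> List[str]:
        # Collect the annotation files of one kind for post-processing.
        return [link.path for link in self.links if link.kind == kind]


# Hypothetical file names, used only to exercise the sketch.
control = ControlFile(content_path="page_001.img")
control.add_annotation("ink", "page_001_strokes.dat")
control.add_annotation("voice", "page_001_voice.dat")
```

In such an arrangement, a post-processing step could open the content file and then follow the links returned by `annotations_of` to display or process the ink and voice annotations for the page.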
Current systems support pen-based annotations using internal media sources. The device acts as a tape recorder, with the user needing to give an explicit spoken command, for example, “record.” Voice and pen strokes are associated in a post-processing fashion, by converting pen strokes into indexes for the audio stream. The files handled over a network include a scanned image of the document, the audio stream, and an index file. The various prior systems that have been developed for annotating documents, including multimedia documents, do not provide sufficient flexibility in how annotations are handled. These systems also do not facilitate handling a range of annotation media types and multimedia document types, including multiple versions of multimedia document types, both digital and physical, that may have been or will in the future be annotated.
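By way of illustration only, the post-processing association of pen strokes with an audio stream in such prior systems might resemble the following minimal sketch. It assumes, hypothetically, that each pen stroke carries a timestamp taken from the same clock as the recorder and that the index simply maps a stroke to its time offset within the recording; the names `Stroke` and `build_audio_index` are illustrative and do not come from any of the cited references.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Stroke:
    stroke_id: int
    timestamp: float  # seconds, on the same clock as the recorder (assumed)


def build_audio_index(strokes: List[Stroke], recording_start: float,
                      recording_length: float) -> Dict[int, float]:
    """Map each stroke id to an offset, in seconds, into the audio stream.

    Strokes made outside the recording window get no index entry.
    """
    index: Dict[int, float] = {}
    for stroke in strokes:
        offset = stroke.timestamp - recording_start
        if 0.0 <= offset <= recording_length:
            index[stroke.stroke_id] = offset
    return index


# Hypothetical data: a 60-second recording that started at t = 100 s.
strokes = [Stroke(1, 95.0), Stroke(2, 112.5), Stroke(3, 400.0)]
audio_index = build_audio_index(strokes, recording_start=100.0,
                                recording_length=60.0)
```

Here only the second stroke falls within the recording window, so it alone receives an index entry. Selecting that stroke during playback would seek 12.5 seconds into the audio stream.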