1. Field of the Invention
This invention relates to an integrated multimedia notetaking system. The invention is more particularly related to a notetaking system that utilizes digital video and ink as references and notes. The invention is further related to a notetaking system utilizing video feeds for provision of illustrative notes, bookmarking, and indexing material. The invention is still further related to the indexing of at least one of notes and a video feed via the use of thumbnails, timestamps, and background snaps. The invention is yet further related to a notetaking system having a slide detection process for automatic notetaking, and as a feed mechanism for frame rate compression for optimizing bandwidth when presenting material to the notetaking system.
2. Discussion of the Background
Multimedia notetaking systems typically capture audio and video during a meeting, and slides are created from the captured material (for example, Tivoli, a system running on LiveBoard; see Moran, T. P., Palen, L., Harrison, S., Chiu, P., Kimber, D., Minneman, S., van Melle, W., and Zellweger, P. "I'll get that off the audio": a case study of salvaging multimedia meeting records. Proceedings of CHI '97. ACM, New York, pp. 202-209). Tivoli is designed to support working meetings rather than presentation meetings. The ink strokes in Tivoli, which are indexed to the audio, along with any prepared material on the Tivoli slides, become the group notes to the meeting. A participant using a laptop may "beam" typed text comments onto a slide in Tivoli.
In a similar example, Classroom 2000, images of presentation slides and audio are captured, but video is not used (see Abowd, G. D., Atkeson, C. G., Brotherton, J., Enqvist, T., Gulley, P., and LeMon, J. Investigating the capture, integration and access problem of ubiquitous computing in an educational setting. Proceedings of the CHI '98 Conference. ACM, New York, pp. 440-447; and Abowd, G. D., Atkeson, C. G., Feinstein, A., Hmelo, C., Kooper, R., Long, S., Sawhney, N., and Tani, M. Teaching and learning as multimedia authoring: the Classroom 2000 project. Proceedings of the ACM Multimedia '96 Conference. ACM, New York, pp. 187-198). In addition, Classroom 2000 requires effort by the presenter to prepare the slides in a standard graphics format. The slides are displayed on a LiveBoard, and note-taking is done with PDA devices pre-loaded with the slides. These notes are later synchronized to the audio and to the slides, which have been annotated by the professor lecturing in front of the LiveBoard.
In yet another example, Forum (see Isaacs, E. A., Morris, T., and Rodriguez, T. K. A forum for supporting interactive presentations to distributed audiences. Proceedings of CSCW '94. ACM, New York, pp. 405-416) is a system that uses video as a means for distributed presentations. Everyone, including the speaker, sits in front of a workstation during a presentation. Slides have to be prepared in a specified format. The slides can be annotated with text and marks drawn with a mouse, but the video images cannot be annotated.
In another example, STREAMS (see Cruz, G., and Hill, R. Capturing and playing multimedia events with STREAMS. Proceedings of the ACM Multimedia '94 Conference. ACM, New York, pp. 193-200) is a system for presentation capture that uses video from room cameras. These cameras are also used to capture any presentation content on display. This method has problems when activity in the room obscures the display. Note-taking during the presentation is not supported, although the captured video streams can be annotated during review by adding text comments. None of these systems allows interactive integration of live images from cameras and presentation material into the notes.
In addition, there are several known stand-alone ink and audio note-taking systems. Examples include FXPAL Dynomite (see Wilcox, L. D., Schilit, B. N., and Sawhney, N. Dynomite: A Dynamically Organized Ink and Audio Notebook. Proceedings of CHI '97. ACM, New York, pp. 186-193) and Audio Notebook (see Stifelman, L. The Audio Notebook: Paper and Pen Interaction with Structured Speech. PhD Thesis. MIT, 1997), which uses paper with audio recording. Filochat (see Whittaker, S., Hyland, P., and Wiley, M. Filochat: handwritten notes provide access to recorded conversations. Proceedings of CHI '94. ACM, New York, pp. 271-276) is a PC with a pen tablet in which audio is indexed with handwritten notes, and NoTime (see Lamming, M., and Newman, W. Activity-based information technology in support of personal memory. Technical Report EPC-1991-103, Rank Xerox, EuroPARC, 1991) was designed to key the user's ink strokes to recorded audio or video.
Also known are video annotation systems. Marquee (see Weber, K., and Poon, A. Marquee: a tool for realtime video logging. Proceedings of CHI '94. ACM, New York, pp. 58-64) is a pen-based system for making annotations while watching a videotape. A later version of Marquee has modifications to take timestamps on digital video streams from the WhereWereWe multimedia system (see Minneman, S., Harrison, S., Janssen, B., Kurtenbach, G., Moran, T., Smith, I., and van Melle, B. A confederation of tools for capturing and accessing collaborative activity. Proceedings of the ACM Multimedia '95 Conference. ACM, New York, pp. 523-534).
Vanna (see Harrison, B., Baecker, R. M. Designing video annotation and analysis systems, Graphics Interface '92. Morgan-Kaufmann, pp. 157-166) and EVA (see MacKay, W. E. EVA: An experimental video annotator for symbolic analysis of video data, SIGCHI Bulletin, 21 (2), 68-71. 1989. ACM Press) are text-based systems. VideoNoter (see Trigg, R. Computer support for transcribing recorded activity, SIGCHI Bulletin, 21 (2), 68-71. 1989. ACM Press) displays and synchronizes different streams of activity (video, figures, whiteboard drawings, text), but requires post-production to transcribe text from the audio or extract drawings from a whiteboard. These systems are limited by a design based on videotapes rather than digital video. None of these systems allows interactive integration of video images into the notes. Sharp Zaurus (Zaurus Operation Manual. Sharp Corporation, 1996) is a commercial product consisting of a PDA with a digital camera attached. Digital photos can be taken and linked to handwritten notes.
The present inventors have realized that note-taking is a common activity that can be made more powerful with video. The present inventors have also realized the need to provide a fully integrated digital video and ink notetaking system.
Accordingly, it is an object of the present invention to provide a multimedia notetaking system.
It is another object of the present invention to provide a notetaking system that allows the user to annotate images captured from a media stream input to the notetaking system.
It is yet another object of the present invention to provide a notetaking system having a timeline that identifies significant events occurring during a notetaking session.
It is yet another object of the present invention to provide a notetaking system that captures live multimedia streams and utilizes frame rate compression to provide the multimedia streams to a notetaking device and to automatically summarize events such as slide changes into a timeline.
It is still another object of the present invention to allow a user to bookmark points in a captured multimedia stream in a notetaking device.
These and other objects are accomplished by a system for note-taking with digital video and ink (also referred to as NoteLook). The invention includes a notetaking device that includes a media input mechanism configured to retrieve at least one media stream, at least one user input mechanism configured to accept user inputs, and a control device configured to allow the user to at least one of manipulate, connote, and summarize the at least one media stream via the user inputs.
In one embodiment, NoteLook includes a client application that runs on a pen-based notebook computer. NoteLook has a display with a main area resembling a paper notebook page for writing, capturing, and annotating images. There is a small video window for viewing the active video. The user may change channels to view different video streams. The user can grab a frame that is showing in the video window as a small thumbnail in the margin of a note page or as a large background.
The thumbnails, background images, and ink strokes are timestamped and provide indexes into the video. The video source is handled by a NoteLook server, which runs on a computer that has the video input. The NoteLook server also transmits the video and audio, as well as metadata (times of slide changes or speaker changes, for example), to the NoteLook client application, typically via a wireless or wired network. These streams of multimedia data are archived by the NoteLook server and can be randomly accessed by the clients during playback. Multiple instances of NoteLook clients and servers can operate together. The video source can be captured in a variety of ways: from a room camera or document camera, from a tap into a rear projector, TV, VCR or any video stream, or from a small portable camera attached to a pen computer.
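By way of illustration only, the relationship between slide change detection and frame rate compression may be sketched as follows. The text above does not specify how slide changes are detected; the sketch assumes a simple mean absolute pixel difference against the last transmitted frame, and the names mean_abs_diff, compress_frames, and threshold are hypothetical. Frames are represented as nested lists of grayscale values so that the example has no external dependencies.

```python
def mean_abs_diff(frame_a, frame_b):
    """Average absolute difference between two equally sized grayscale frames."""
    total = 0
    count = 0
    for row_a, row_b in zip(frame_a, frame_b):
        for pixel_a, pixel_b in zip(row_a, row_b):
            total += abs(pixel_a - pixel_b)
            count += 1
    return total / count if count else 0.0


def compress_frames(frames, threshold=10.0):
    """Yield (index, frame) only when a frame differs enough from the last one sent.

    The threshold is an assumed tuning value, not one taken from the source.
    Each yielded index can double as a candidate slide-change event.
    """
    last_sent = None
    for index, frame in enumerate(frames):
        if last_sent is None or mean_abs_diff(frame, last_sent) > threshold:
            last_sent = frame
            yield index, frame


# Two distinct "slides" shown as 2x2 grayscale frames.
slide_a = [[0, 0], [0, 0]]
slide_b = [[255, 255], [255, 255]]
stream = [slide_a, slide_a, slide_b, slide_b, slide_a]
sent_indices = [index for index, _ in compress_frames(stream)]
```

In such a scheme, unchanged frames are never transmitted, which reduces bandwidth, and the indices of the transmitted frames could serve as the slide change times carried in the metadata.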
In meetings, presentations and classes, the NoteLook digital video and ink note-taking system can be used to snap still images of the speaker, room activity, and presentation material and integrate them into the notes. The snapped images and ink strokes can be timestamped and linked to the recorded video for easy browsing and retrieval. Video can capture gestures, nonverbal activity, and show context. Video provides a versatile means of capturing the presentation content in a variety of forms. PowerPoint slides, Web pages, overhead slides, whiteboard, and more dynamic media such as animation and video can all be captured with video.
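As a minimal sketch of how timestamped notes might index into recorded video, the following assumes timestamps are stored as seconds from the start of the recording. The names NoteItem, NoteIndex, seek_time, and lead_in are hypothetical and are not taken from the NoteLook implementation.

```python
from bisect import bisect_right
from dataclasses import dataclass


@dataclass
class NoteItem:
    timestamp: float  # seconds since the start of the recording (assumed unit)
    kind: str         # "thumbnail", "background", or "ink"
    page: int         # note page on which the item appears


class NoteIndex:
    """Keeps note items sorted by timestamp so they can index into the video."""

    def __init__(self):
        self._items = []

    def add(self, item):
        # Insert in timestamp order; items may arrive out of order.
        times = [existing.timestamp for existing in self._items]
        self._items.insert(bisect_right(times, item.timestamp), item)

    def seek_time(self, item, lead_in=0.0):
        # Playback position for a selected item, optionally starting a few
        # seconds early to restore context. Never seeks before time zero.
        return max(0.0, item.timestamp - lead_in)

    def items_up_to(self, playback_time):
        # Items created at or before the given playback position.
        times = [existing.timestamp for existing in self._items]
        return self._items[:bisect_right(times, playback_time)]


index = NoteIndex()
stroke = NoteItem(30.0, "ink", 1)
thumb = NoteItem(10.0, "thumbnail", 1)
snap = NoteItem(20.0, "background", 2)
for note in (stroke, thumb, snap):
    index.add(note)
```

Selecting the ink stroke with a lead-in of five seconds would begin playback at 25 seconds; the reverse lookup, items_up_to, could be used to highlight notes as the video plays.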
Demonstrations during presentations and training sessions are also effectively captured by video. In order for a digital ink and video note-taking system to be usable, it must be unobtrusive for the note-takers and other participants in the room and require minimal preparation from the speaker.