The communication of information through lectures is fundamental for learning and teaching in academic institutions. Until recently, universities have only been able to offer lectures to attending students, severely restricting a university's reach to the confines of its campus. However, with advancements in technologies for transmitting multi-media over the Internet, some universities now enable students to participate in live lectures or to view lecture recordings over the Internet. As well as on-line lectures, academic institutions have recognised the greater opportunities of the Internet for content delivery, and on-line video seminar and video conference proceedings are becoming popular. Universities have embraced technology in this way not only to broaden their reach but also to meet the growing demands of students and academics who wish for greater flexibility in learning.
The efforts of universities to provide students with on-line lecture content fit into the domain of eLearning. Within eLearning, the ways in which universities currently offer video content over the Internet fall into two categories: synchronous and asynchronous. In the synchronous case, some universities offer live video lectures to remote participants. In the asynchronous case, which applies to many of the lecture videos provided, students are given the opportunity to view content on demand.
Choosing to offer lectures online is a significant and costly undertaking for any academic institution. Not least of the difficulties associated with this task is the capturing and editing of video lectures into a suitable form for presentation over the Internet. The modern student has regular exposure through the Internet and television to professionally edited video content. This sets a high level of expectation among students in relation to video lectures.
There is a recent move away from traditional single-camera lecture videos towards more dynamic video presentations that include shots from multiple cameras. Such productions, which aim to capture all visually interesting aspects of lectures, are generally agreed to be much more engaging for viewers.
A key component of any lecture or seminar is the conversational interaction among participants, such as that which often occurs between a presenter and an audience. Capturing this information for inclusion in a video lecture production presently requires significant manual editing. Where the lecture is to be transmitted live, this editing must be performed at the time of capture, usually by large production teams. In the off-line case, such editing can be performed as a post-production step, but in most cases it still requires skilled manual editing.
Automatic systems for editing multi-camera lecture captures do exist, such as that proposed by Rui et al. (U.S. Pat. No. 7,349,005). This system incorporates expert video production rules for editing multi-view video data of a lecture and also enables the capture of conversational interactions. The limitation of this system is that active speakers are only tracked in a single view at any given time. Although the system uses multiple cameras, each camera is dedicated to a specific capture task, such as capturing the audience or the presenter. The problem with such a configuration is that the success of the system in capturing facial views of speakers requires audience members to face a designated camera. This means that speakers are restricted to a defined seating zone, which is undesirable. Furthermore, the system can only provide frontal facial views of speakers if they are orientated towards the camera assigned to track them.
It is an object of the invention to provide a system and method for the automated production of a single-view video presentation from a multi-camera capture of a lecture.