1. Field of Technology
The present application discloses the use of video analysis technology (such as that described in part in U.S. Pat. No. 6,940,998 (the “'998 patent”), the disclosure of which is incorporated herein by reference) to analyze video data streams from cameras in real time, determine the occurrence of a significant event, and send a notice that may include a selected segment of the video data associated with the significant event. The application also discloses the use of video analysis technology to determine whether video data includes content corresponding to a user preference for content and to make at least a portion of the video data including the content of interest accessible to the user.
2. Background of the Invention
While the means to capture, transport, store, retrieve and display video in large-scale networks have advanced significantly in recent years, technologies available and practical for characterizing video content, as one does for other data types, have not kept pace. The video equivalent of the search function in a word processor has not been offered. Video analytics technology, which is the adaptation of advanced computer vision techniques to characterize video content, has been limited to highly sophisticated, expensive implementations for industry and government. Furthermore, existing video analytics techniques require large amounts of processing capacity in order to analyze video at or near real time. Present technologies available to consumers and small business users do not provide sophisticated, adaptable and practical solutions. Existing systems cannot be made to operate effectively on off-the-shelf personal computers due to the limitations of processing capacity associated with such platforms. Additionally, existing closed circuit television systems are limited to dedicated network configurations. This is due to the high bandwidth requirements associated with streaming live video. The requirement for a dedicated network inhibits distribution of collected video beyond a location close to the video imager or camera. Existing technologies for video transport require too much bandwidth to be employed effectively across readily-available networks having low bandwidth capacity, such as most wireless networks.
Additionally, current systems for viewing live or recorded video require that the user know the location of, or the path to, the desired video stream on the network or within the closed circuit system and actively “pull” the video in order to view it. In the case of large, loosely organized libraries of live or recorded video, this task may be extraordinarily onerous, usually requiring viewing many scenes containing nothing of interest to the user. One recent advance has been to use the output of electronic sensors to trigger the transmission of video from a nearby camera. Some video systems even incorporate “video motion detection,” a technique that senses gross image changes, to initiate this action. These systems offer no way to determine the relevance of content or to distinguish between non-activity and events of interest. The distinction between what is of interest and what is not must be made by a human. This activity can be characterized by long periods of inactivity punctuated by rare but sudden episodes of highly significant activity requiring the application of focus, careful consideration and judgment. In the case of real-time observation systems, significant events will in all likelihood go unnoticed by the user. These situations are thought to contribute to the slow adoption of “nanny-cam” systems. They also limit the ability of online content providers to create convenient video distribution services for new classes of mobile phones and similar communication and display devices.
Because existing systems need to be installed in a dedicated network, they do not have the flexibility to accommodate the dynamics of a rapidly-developing or transient situation. In addition, existing systems typically send a video representation of an observed location to a single end-point. In some cases, a user will want to have the ability to change the recipient of video data from an observed location.
Traditional closed-circuit TV systems require that a person sit at a display screen connected to a network in order to observe a location. If a user wants to be able to see what happened in his or her absence, he or she must watch the video of the period of his or her absence. This can be inconvenient, time consuming, and boring. To mitigate these effects, a user may choose to view the recordings at an increased play speed, but doing so can increase the chances that something of significance will be missed. This situation limits the ability of the user, such as a homeowner or small business owner, to have peace of mind when the user must be away from the video display.
Existing video data storage and retrieval systems only characterize stored material by information or metadata provided along with the video data itself. This metadata is typically entered manually and provides only a single high level tag for an entire clip or recording; it does not actually describe the content of each scene or frame of video. Existing systems do not “look inside” a video to observe the characteristics of the video content in order to classify the videos. Classification therefore requires that a human discover what content a video contains, usually through watching the video or excerpts therefrom, and provide tags or other descriptive information to associate with the video data. This process can be time- and energy-intensive as well as extremely inefficient when dealing with large amounts of video data, such as can be encountered in cases of multiple, real-time streams of video data.
What is needed, then, is a video content description technology that enables distributed observation of user-defined video content across existing networks, such as the Internet and wireless communication infrastructure, and observation across multiple geographically-distributed sites. What is also needed is a system that automatically forwards video to interested personnel in response to the existence of noteworthy events and that allows flexibility to specify and change the recipient of video data. What is further needed is a system that can send notifications and information to users wherever they are.