1. Field of the Invention
The present invention relates in general to multimedia data processing, and specifically to a user mode multimedia processing layer for intelligently and transparently processing multimedia streams in real-time.
2. Background of the Invention
Over the past few years, electronic contact between people has increased tremendously. Various modes of electronic communication are in use, such as email and text messaging. In particular, real-time video and audio communication (e.g., IM chats including video and/or audio) has become widely prevalent.
For purposes of real-time video and audio chats, cameras (often called webcams) are commonly connected to a user's computer, and the video and/or audio data captured by the camera is transmitted to the computer. Several options exist for the user to transmit still image, video and/or audio data, such as Instant Messaging (IM), live video streaming, video capture for purposes of creating movies, video surveillance, internet surveillance, internet webcams, etc. Various client applications are available on the market for such uses. For instance, for Instant Messaging alone, a user can choose from among several applications, including MSN® Messenger from Microsoft Corporation (Redmond, Wash.), ICQ from ICQ, Inc., America OnLine Instant Messenger (AIM) from America Online, Inc. (Dulles, Va.), and Yahoo!® Instant Messenger from Yahoo! Inc. (Sunnyvale, Calif.).
Users often desire to alter the video and/or audio streams in certain ways. Such modifications may be desirable for various reasons. For instance, a user may want to look and/or sound like someone else (e.g., like some famous personality, some animated character, etc.). Another example is when a user simply wishes to be unrecognizable in order to maintain anonymity. Yet another example is when a user wants to look like a better version of himself (e.g., the user may not be dressed up for a business meeting, but he wants to project a professional persona). Still another example is when a user wants to create video/audio special effects. For these and various other reasons, users often wish to modify the video/audio stream actually captured by their webcam and/or microphone. In one example, a user chooses an avatar. Published US application 20030043153 describes a system for modifying avatars.
Conventional video and audio processing systems are not capable of automatically and transparently performing the processing functions that such modification may require. Existing systems are largely non-transparent, requiring downstream applications to be configured in order to take advantage of video/audio modification capabilities. Commonly, a processing component needs to be integrated into the client application in order to implement such modifications, and these processing components are application specific. Alternatively, a third-party component needs to be used to proactively add the processed output to the system stream. Yet another alternative is to introduce the video/audio modification capabilities in the driver for the multimedia data capturing device itself. However, the client application would still need to elect to have the effect applied, and the driver for each device would have to be customized to incorporate that functionality. Moreover, advanced processing is not possible in the driver because that environment lacks the services needed for such processing. Further, code in the driver is very static and requires extensive testing to guarantee system stability, making it nearly impossible to provide a flexible and expandable architecture in the driver. In addition, if the processing functionality is in the driver, backward compatibility with existing devices and drivers cannot be achieved unless the user downloads a new driver for the device.
What is needed is a system and method that can transparently modify still image, video and/or audio streams in real-time, independently of the specific client application that is used, and without needing to modify the device driver.