1. The Field of the Invention
The present invention relates to systems and methods for streaming video and more specifically to encoding and/or decoding a video stream. More particularly, the present invention relates to systems and methods for encoding and/or decoding a video stream using partial offline encoding, multiple video streams, and/or multiple reference frames.
2. Introduction
The Internet provides access to a large number of different services and resources that can be accessed through a growing number of websites. Search capabilities, customer service, sales, database services, etc. are examples of the services and resources that are provided and accessed over the Internet. In many situations, the websites that provide these services and resources need to interact or communicate with their users.
Effective communication is often vital to enhance the user experience. In fact, the success of a particular website may depend on the ability of the website to communicate effectively with its users. The way a website communicates with its users may have an impact on whether the user returns to the website in the future. Thus, websites are continually striving to communicate with their users in a more effective manner.
At first, websites primarily used text to communicate with users. As the capabilities of websites improved and bandwidth increased, websites began to include images in their communications with users. Today, some websites are beginning to interact and communicate with users using more advanced interfaces. For example, some websites interact with users using multiple images to express emotion. These images may include happy faces, sad faces, confused faces, and the like. These images are accompanied by text that conveys information to the user of the website.
The next step in natural language interfaces is to utilize animated talking faces to further enhance the communication between a website and its users. By improving the interface or communication with users, the services or resources provided by a website become more effective and efficient. Using animated talking faces or face animation, however, is a considerably more difficult and complex undertaking than using simple graphics and text, and there are several reasons why animated talking faces are not normally used. Face animation, for instance, currently requires the client to render the animated face on the client's screen. Unfortunately, operating systems and popular media players do not provide support for face animation. Furthermore, downloading and installing a plug-in module is a solution that is unattractive to many users.
Another, more problematic aspect of using animated faces is that high-quality face models for sample-based face animation tend to be rather large, often on the order of several hundred megabytes. The sheer size of these face models effectively prohibits their use on slower Internet connections and strains fast Internet connections as well.
One potential solution to this problem is to stream a video from a server to a client using an existing media player on the client. This approach does not require as much data as high-quality face models and can use software that already exists on the client. Even this solution, however, may strain the connection between the client and the server. A video stream that has decent subjective quality requires at least 80 kbits/second, and this data rate must be maintained continuously, even when the face model is not speaking or is waiting on input from a user. Streaming video with decent subjective quality from a server to a client therefore requires excessive bandwidth. In order to meet real-time considerations, motion compensation quality and macroblock mode selection are often sacrificed.
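By way of illustration only, the bandwidth trade-off described above can be sketched with simple arithmetic. The 80 kbits/second figure is taken from the discussion above; the 300 megabyte model size (one possible reading of "several hundred") and the 1000 kbits/second link speed are assumptions chosen for the example, and the function names are hypothetical.

```python
# Illustrative arithmetic only; decimal units (1 megabyte = 8000 kbits).

def stream_kilobytes(rate_kbits_per_s: float, seconds: float) -> float:
    """Data volume, in kilobytes, sent by a constant-rate stream."""
    return rate_kbits_per_s * seconds / 8.0


def download_seconds(size_megabytes: float, link_kbits_per_s: float) -> float:
    """Time, in seconds, to download a file over a link of the given speed."""
    return size_megabytes * 8000.0 / link_kbits_per_s


STREAM_RATE_KBITS = 80.0   # minimum rate for decent subjective quality (from the text)
MODEL_SIZE_MB = 300.0      # assumed size of a sample-based face model
LINK_KBITS = 1000.0        # assumed client connection speed

# Data sent per minute of streaming, including while the face is idle.
per_minute_kb = stream_kilobytes(STREAM_RATE_KBITS, 60.0)

# Time the client would wait to download the face model before rendering locally.
model_wait_s = download_seconds(MODEL_SIZE_MB, LINK_KBITS)

print(f"Streaming: {per_minute_kb:.0f} kilobytes per minute")
print(f"Face model download: {model_wait_s / 60.0:.0f} minutes")
```

Under these assumed figures, the stream consumes about 600 kilobytes for every minute the session stays open, whereas downloading the face model would delay the start of the session by roughly 40 minutes.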