This invention relates generally to the field of media processing systems and, in particular, to a method and system for integrating arbitrary isochronous processing algorithms into general purpose media processing systems.
Current continuous media systems include servers, which are designed to maximize the number of concurrent streams and to ensure quality of service to those streams which are being serviced, and clients, which are designed to receive media streams and render said streams as a multimedia presentation to the client system user. The problem is that as digital media becomes more commonplace, users require solutions with much greater interactivity. Examples of interactivity requirements in processing media streams include the following: the ability to customize a presentation as it is being presented, both by end-users and by the presenter (or transmitting station) according to the resources available; special security features, including the ability to encrypt or scramble presentations; the ability to watermark individual copies of an audio/video object as it is being transmitted; the loading of certain types of audio/video objects which may require specialized processing so that the object can later be streamed; the extracting of content information from encoded video such as Query by Image Content (QBIC) or speech to ‘script’ conversions which require preprocessing of video/audio data for later use; implementations of browsing support which may require real-time processing or may require processing by switching between multiple versions of a given audio/video object; and, the ability to adapt the audio/video stream to changing network conditions in ways that do not disturb the end-user.
There are numerous publications describing algorithms to perform some of the foregoing functions including the following: C.-Y. Lin and S.-F. Chang, “Issues and Solutions for Authenticating MPEG Video”, January 1999 http://www.ctr.columbia.edu/~sfchang; “Compressed Video Editing and Parsing System (CVEPS)”, (itnm.columbia.edu); Ketan Mayer-Patel, Lawrence Rowe (cs.berkeley.edu), “Exploiting Temporal Parallelism for Software Only Video Effects Processing”; and, Meng, J., Cheng, S. F., “Tools for Compressed Domain Video Indexing and Editing”, SPIE Conference on Storage and Retrieval for Image and Video Database, Vol. 2670, San Jose, Calif. 1996. There are also publications describing methods for extending programming languages such as Java* (all asterisks indicate that the terms may be trademarks of their respective owners) or C to ease the burden of processing video, such as Smith, Brian, “Dali, A High-Performance Multimedia Processing System”, http://www.cs.cornell.edu/dali; and A. Eleftheriadis, “Flavor: A Language for Media Representation”, Proceedings, ACM Multimedia '97 Conference, Seattle, Wash., November 1997, pp. 1-9. A problem with implementing prior art solutions is in integrating these algorithms and mechanisms into media processing systems in a generic manner while continuing to adhere to the quality of service provisions required by media processing systems. Existing media processing systems fall into two categories: closed systems which allow no user-written stream processing modules and constrained systems which provide limited interfaces for user-written stream processing modules.
Closed server systems provide the ability to store, manage and stream continuous media files to network connected clients, but do not allow user-written modules to manipulate the media as it is being streamed. Likewise, a closed client system does not allow user-written modules to manipulate the media as it is being received from the server and presented to the client system user. The IBM VideoCharger* Server is a generally available product which is an example of a media server that does not allow user-written stream processing modules. Like many other media servers in this category, the VideoCharger server provides quality of service guarantees. Because media must be streamed at a specific, continuous rate, the server must not attempt to service any client requests which might cause the server to exceed its capacity and thereby degrade the quality of streams which are already being serviced. Thus, in order to provide quality of service guarantees, servers must appropriately manage resources and control admission of new clients.
Constrained systems, by contrast, provide limited support for user-written stream processing modules which may process the media data as it is being streamed or presented. One example of a constrained server is the RealNetworks G2 Server, which supports plug-ins. However, these plug-ins are limited to specific functions such as an interface to a file system, an interface to the network, or file formatting for a specific media type. The server does not support an arbitrary series of processing modules. Examples of processing which would benefit from a less restricted environment for processing modules include trick modes, encrypting, watermarking or scrambling streams, and multiplexing or demultiplexing of live streams. This solution is further constrained by the lack of distributed stream control. Also, the server does not provide interfaces for ensuring quality of service for arbitrary processing modules.
For example, a plug-in is allowed to choose not to send portions of a media object if it receives feedback indicating the client is unable to process the stream at the current bit rate. However, the server does not provide the capability to limit the number of streams accepted for processing to an amount which can realistically be processed within an acceptable threshold of degradation or while maintaining server stability.
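By way of illustration only, the admission control behavior discussed above, in which a server refuses requests that would exceed its capacity, may be sketched as follows. The sketch assumes that capacity can be expressed as a single bit-rate figure; the class name AdmissionController and its methods are hypothetical and are not taken from any of the products named.

```python
class AdmissionController:
    """Hypothetical controller tracking one abstract resource (bit rate)."""

    def __init__(self, capacity_bps):
        self.capacity_bps = capacity_bps
        self.reserved = {}  # stream id -> reserved bit rate

    def admit(self, stream_id, rate_bps):
        """Reserve rate_bps for stream_id; refuse if capacity would be exceeded."""
        if stream_id in self.reserved:
            raise ValueError("stream already admitted: %r" % (stream_id,))
        if sum(self.reserved.values()) + rate_bps > self.capacity_bps:
            return False  # admitting would degrade streams already in service
        self.reserved[stream_id] = rate_bps
        return True

    def release(self, stream_id):
        """Return the reservation held by stream_id, if any."""
        self.reserved.pop(stream_id, None)
```

In this sketch a refused request leaves existing reservations untouched, so streams already being serviced keep the rate they were promised.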
As another example of a solution which is constrained from the quality of service and distributed stream control perspective, Microsoft* provides the ActiveMovie* (DirectShow*) programming interface and client software. DirectShow does provide a rich programming interface, including support for client processing modules, but does not provide for server-side processing modules. Also, because the interface is geared toward client software, the infrastructure does not address management of a very large number of disparate processing components while maintaining quality of service. Rather, the DirectShow system attempts to dedicate all resources at the client system to providing a single multimedia presentation. Furthermore, because the interface is geared toward client software, control is provided on a filter (a.k.a., module) basis, which would be inadequate for supporting server-side processing and for graphs which are distributed over multiple systems. For example, state control commands, such as pause, are realized on a per-filter basis and the communication interface between filters is limited to notification of changes in filter state. In the server environment, applications will often require a stream-level control interface, and cannot realistically operate on a per-filter basis.
Thus, what is needed is a direct solution to the problem of providing support for arbitrary isochronous processing algorithms in general purpose media processing systems while maintaining the ability for that system to provide quality of service guarantees.
What is further needed is a system and method for dynamically inserting arbitrary processing modules for streaming, loading and parsing of various media while adhering to the admission control and resource reservation commitments required to ensure quality of service for those streams.
What is further needed is a system and method for allowing these media processing modules to be distributed over multiple systems as required to provide optimal service to interactive multimedia presentations.
The invention comprises a flexible and efficient mechanism for integrating arbitrary isochronous processing algorithms into general purpose media processing systems by providing an infrastructure and programming model for the dynamic insertion of one or more isochronous processing modules (filters) into a media stream. The method and apparatus will substantially enhance the ability of a media processing system to store and stream various media formats under a variety of conditions. The disclosed system supports generic graphs of processing modules through novel and efficient mechanisms for buffer management, distributed stream control, and quality of service management. The inventive buffer management mechanism ensures that data is efficiently carried through pipelines of filters. The buffering structure and use of pipeline heuristics allow filters to modify data in place, reformat the data, add or remove data from buffers, and duplicate, split or join buffers with minimal copy requirements. The incremental method by which these heuristics are collected and supplied to relevant nodes ensures that information can be easily and accurately retrieved. The distributed stream control mechanism of the present invention allows a control command to be accepted for a given output channel and then effected at any one or a plurality of filters along the pipeline which is producing data for that output channel. The quality of service mechanism enables abstract representation of resource requirements for arbitrary media objects and their associated processing modules so that admission control, resource reservation and load balancing may be maintained by the server.
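As a non-limiting illustration of the distributed stream control described above, the following minimal sketch accepts a command at an output channel and effects it at every filter along the producing pipeline that handles that command. The names PipelineFilter, OutputChannel, handles and control are hypothetical and chosen for illustration only; the sketch further assumes a linear pipeline in which each filter holds a reference to its upstream neighbor.

```python
class PipelineFilter:
    """Hypothetical filter that declares which control commands it can effect."""

    def __init__(self, name, handles=(), upstream=None):
        self.name = name
        self.handles = frozenset(handles)
        self.upstream = upstream  # next filter toward the data source, or None
        self.applied = []         # commands effected at this filter, in order


class OutputChannel:
    """Hypothetical channel that accepts commands on behalf of its pipeline."""

    def __init__(self, last_filter):
        self.last_filter = last_filter

    def control(self, command):
        """Walk upstream and effect the command at each filter that handles it.

        Returns the names of the filters at which the command was effected.
        """
        effected = []
        node = self.last_filter
        while node is not None:
            if command in node.handles:
                node.applied.append(command)
                effected.append(node.name)
            node = node.upstream
        return effected
```

Under these assumptions the application addresses only the output channel; it need not know which filters, or how many, act on a given command.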
The generic graphs of processing components provided for in this invention are an essential factor in solving the problem of implementing arbitrary media processing algorithms in a general purpose media server as they provide a method for interconnection of disparate processing components, and they facilitate communication between those components in a way that does not require the components to have full understanding of complex graphs, or even of the directly connected components, and that does not conflict with the time-dependent nature of continuous media data. By distributing stream control and decoupling resource management from components responsible for processing the media stream, the disclosed system allows these generic graphs to be constructed over multiple, networked systems.
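The interconnection of disparate processing components into a generic graph may be sketched, purely by way of example, as follows. Each hypothetical Filter knows only how to hand a buffer to its directly connected downstream components and requires no knowledge of the overall graph. The names Filter, Sink, connect and push are assumptions made for illustration, and the sketch omits the buffer management, timing and resource management aspects of the disclosed system.

```python
class Filter:
    """Hypothetical processing module: transforms a buffer and forwards it."""

    def __init__(self, name, transform=None):
        self.name = name
        self.transform = transform or (lambda data: data)
        self.downstream = []  # directly connected components only

    def connect(self, other):
        """Attach a downstream component; returns it so calls can be chained."""
        self.downstream.append(other)
        return other

    def push(self, data):
        """Process one buffer and pass the result to every downstream component."""
        out = self.transform(data)
        for component in self.downstream:
            component.push(out)


class Sink(Filter):
    """Hypothetical graph endpoint that records the buffers it receives."""

    def __init__(self, name):
        super().__init__(name)
        self.received = []

    def push(self, data):
        self.received.append(data)


# Example graph: one source feeding a scrambled branch and a clear branch.
source = Filter("source")
scrambler = Filter("scrambler", lambda d: bytes(b ^ 0xFF for b in d))
scrambled_out = Sink("scrambled_out")
clear_out = Sink("clear_out")
source.connect(scrambler).connect(scrambled_out)
source.connect(clear_out)
source.push(b"\x00\x01")
```

In this example the scrambler is inserted into one branch without any change to the source or to either endpoint, which is the property that allows arbitrary modules to be combined.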