Recent advances in computing power and related technology have fostered the development of a new generation of powerful software applications. Gaming applications, communications applications, and multimedia applications have particularly benefited from increased processing power and clocking speeds. Indeed, once the province of dedicated, specialty workstations, many personal computing systems now have the capacity to receive, process and render multimedia objects (e.g., audio and video content). While the ability to display (receive, process and render) multimedia content has been around for a while, the ability for a standard computing system to support true multimedia editing applications is relatively new.
In an effort to satisfy this need, Microsoft Corporation introduced an innovative development system supporting advanced user-defined multimedia editing functions. An example of this architecture is described in U.S. Pat. No. 5,913,038, issued to Griffiths and commonly owned by the assignee of this document, the disclosure of which is expressly incorporated herein by reference.
In the '038 patent, Griffiths introduced an application program interface which, when exposed to higher-level development applications, enables a user to graphically construct a multimedia processing project by piecing together a collection of “filters” exposed by the interface. The interface described therein is referred to as a filter graph manager. The filter graph manager controls the data structure of the filter graph and the way that data moves through the filter graph. The filter graph manager provides a set of object model interfaces for communication between a filter graph and its application. Filters of a filter graph architecture implement one or more interfaces, each of which contains a predefined set of functions, called methods. Methods are called by an application program or other objects in order to communicate with the object exposing the interface. The application program can also call methods or interfaces exposed by the filter graph manager object.
Filter graphs work with data representing a variety of media (or non-media) data types, each type characterized by a data stream that is processed by the filter components comprising the filter graph. A filter positioned closer to the source of the data is referred to as an upstream filter, while a filter further down the processing chain is referred to as a downstream filter. For each data stream that a filter handles, it exposes at least one virtual pin (i.e., distinguished from a physical pin such as one might find on an integrated circuit). A virtual pin can be implemented as an object that represents a point of connection for a unidirectional data stream on a filter. Input pins represent inputs and accept data into the filter, while output pins represent outputs and provide data to other filters. Each of the filters includes at least one memory buffer, and communication of the media stream between filters is often accomplished by a series of “copy” operations from one filter to another.
A filter graph can have a number of different types of filters, examples of which include source filters, decoder filters, transform filters, and render filters. A source filter is used to load data from some source, a decoder filter is used to decode or decompress a compressed data stream, a transform filter processes and passes data, and a render filter renders data to a hardware device or other locations (e.g., to a file, etc.).
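The four filter roles described above can be illustrated with a minimal sketch. It is not the filter graph manager interface of the '038 patent; the class names, the `connect`, `receive` and `run` methods, and the use of zlib as a stand-in for media decompression are all assumptions made for illustration. A single downstream reference stands in for an output pin connected to the next filter's input pin.

```python
import zlib


class Filter:
    """Hypothetical base filter with one downstream connection (a stand-in for an output pin)."""

    def __init__(self):
        self.downstream = None

    def connect(self, downstream):
        # Connect this filter's output to the next filter's input.
        self.downstream = downstream
        return downstream  # returned so connections can be chained

    def process(self, data):
        # Default behavior: pass the data through unchanged.
        return data

    def receive(self, data):
        # Process the data, then hand the result to the downstream filter, if any.
        result = self.process(data)
        if self.downstream is not None:
            self.downstream.receive(result)


class SourceFilter(Filter):
    """Loads data from some source (here, an in-memory list of compressed chunks)."""

    def __init__(self, chunks):
        super().__init__()
        self.chunks = chunks

    def run(self):
        for chunk in self.chunks:
            self.receive(chunk)


class DecoderFilter(Filter):
    """Decompresses a compressed data stream (zlib is only a placeholder codec)."""

    def process(self, data):
        return zlib.decompress(data)


class TransformFilter(Filter):
    """Processes and passes data (upper-casing stands in for a real transform)."""

    def process(self, data):
        return data.upper()


class RenderFilter(Filter):
    """Renders data; a list stands in for a hardware device or file."""

    def __init__(self):
        super().__init__()
        self.rendered = []

    def process(self, data):
        self.rendered.append(data)
        return data


def build_and_run(chunks):
    # Assemble source -> decoder -> transform -> render and push all chunks through.
    source = SourceFilter(chunks)
    renderer = RenderFilter()
    source.connect(DecoderFilter()).connect(TransformFilter()).connect(renderer)
    source.run()
    return renderer.rendered
```

In this sketch the source filter is upstream of every other filter and the render filter is downstream of every other filter; each call to `receive` corresponds loosely to one of the “copy” operations between filters.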
FIG. 1 shows an exemplary filter graph 100 for rendering media content. Filter graph 100 comprises a number of different filters 104–110 and may or may not comprise a source 102. A typical filter graph for multimedia content can include, for example, a graph portion that is dedicated to processing video content and a graph portion that is dedicated to processing audio content. For example, in FIG. 1 a source 102 provides content that is typically in compressed form. A source filter 104 receives the content and then provides the content to one or more decoder filters for decompression. In this example, consider that filters 106–110 process video content, filters 106a–108a process sub-picture content (such as that used in Digital Versatile Disc (DVD) content), and filters 106b–110b process audio content. Accordingly, the decoder filters decompress the data and provide the data to a transform filter (e.g. filters 108–108b) that operates on the data in some way. The transform filters then provide the transformed data to a corresponding render filter (e.g. 110, 110b) that then renders the data.
Typically, an application program or application 112 provides a means by which a user can interact with the content that is processed by the filter graph. Responsive to a user interacting with the application, the application can issue commands to the source filter 104. Examples of commands can include Run, Stop, Fast Forward, Rewind, Jump and the like. The source filter receives the commands and then takes steps to ensure that the commands are executed at the right time. For example, the source filter 104 typically receives data and provides timestamps onto data samples that define when the data sample is to be rendered by the render filters. The source filter then hands the timestamped data sample off to the decoder for further processing. The render filters now know, because of the timestamp, when the data sample is to be rendered.
Now, when a user interacts with the various data streams via application 112, the user can typically alter the playback rate of the streams. For example, the user can fast forward the data streams and experience the streams at a faster playback rate. Altering the playback rate can typically take place in one of two ways. First, a global timing clock can be altered. This is referred to as a time compression or time expansion. Second, the application can instruct various filters to modify their output to make the data appear as if it were playing back at a different rate. This is referred to as a rate change. For example, if the user wishes to fast forward a data stream, the decoder filters can map the input timestamps of the individual data samples to different output timestamps so that the render filter renders the data streams at the requested playback rate.
As an example, consider FIG. 2. There, a graph 200 is provided. The x-axis is designated “Input Timestamp” and represents the input timestamp of a particular data sample. The y-axis is designated “Output Timestamp” and represents the output timestamp of a particular data sample. When a data sample is received for rendering, the source filter (such as source filter 104 in FIG. 1) provides the data sample with a timestamp that indicates when the data sample is to be rendered. The source filter then provides the data sample to the decoder filter (such as decoder filters 106–106b). Now assume that the user, through the application, indicates that the following should occur:
For the data samples with input timestamps of 1–10, they wish to have the samples rendered at a normal 1-1 play rate;
For the data samples with input timestamps of 11–20, they wish to have the samples rendered at 5 times the normal rate (i.e. fast forwarded at 5×).
As part of the process that takes place, the decoder filters can adjust the timestamps for the relevant samples so that the samples' output timestamps now comport with the desired playback speeds (i.e. play at 1-1 rate and fast forward at 5×). For example, in order to render the data samples that originally had timestamps of 11–20 (10 timestamps in total) at 5 times the playback rate, those samples will need to be rendered as if they had timestamps of 11 and 12 (i.e. 2 timestamps in total).
So, with this in mind, consider again FIG. 2. For input timestamps of 1–10 there is a one-to-one correspondence between input and output timestamps, meaning that the data samples will be rendered at a normal play rate. Input timestamps of 11–20 will, however, be mapped to output timestamps of 11 and 12 because of the 5× fast forward play rate. Thus, when the render filters receive the data samples with the re-mapped timestamps, the data samples will be rendered in accordance with the desired playback speeds.
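The mapping just described can be sketched as a piecewise function. The function name `remap_timestamp`, the segment tuple layout, and the choice to round fractional output times up to the next whole timestamp are assumptions made for illustration; the rounding choice was picked because it reproduces the mapping of input timestamps 11–20 onto output timestamps 11 and 12.

```python
import math
from fractions import Fraction

# Hypothetical description of the user's request:
# (first input timestamp, last input timestamp, playback rate)
SEGMENTS = [
    (1, 10, 1),   # input timestamps 1-10 at the normal play rate
    (11, 20, 5),  # input timestamps 11-20 fast forwarded at 5x
]


def remap_timestamp(t_in, segments=SEGMENTS):
    """Map an input timestamp to an output timestamp.

    Segments must be contiguous and ordered. Each input timestamp in a
    segment consumes 1/rate units of output time; exact fractions are used
    so that the only rounding is the final, explicit ceiling.
    """
    # Output time that has elapsed before the first segment begins.
    elapsed = Fraction(segments[0][0] - 1)
    for first, last, rate in segments:
        if t_in > last:
            # The whole segment lies before t_in: add its full output duration.
            elapsed += Fraction(last - first + 1, rate)
        else:
            # t_in falls inside this segment: add the partial duration and stop.
            elapsed += Fraction(t_in - first + 1, rate)
            return math.ceil(elapsed)
    raise ValueError("input timestamp lies outside all segments")
```

Under these assumptions, input timestamps 1–10 map to themselves, input timestamps 11–15 map to 11, and input timestamps 16–20 map to 12, so ten input timestamps occupy two output timestamps at the 5× rate.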
Now, in reality, the re-mapping of timestamps can lead to synchronization problems in the following way. Consider, for example, that the individual decoder filters can have different computational models. That is, the different decoder filters might be provided by different vendors. Accordingly, the different computational models may perform computations for purposes of re-mapping timestamps differently. Specifically, the computational models may perform rounding operations differently. Because of this, the re-mapped timestamps can vary between data samples that should for all practical purposes be rendered together. This can manifest itself in some different ways. For example, the audio that accompanies the video may lag just enough to be annoying. Additionally, sub-pictures such as video overlays may be overlaid at the wrong time. Thus, the user experience can be degraded.
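The rounding problem can be made concrete with a small sketch. The two decoder functions below are hypothetical and do not correspond to any particular vendor's implementation; one is assumed to truncate the exact re-mapped value and the other to round it to the nearest whole timestamp. Both apply the same 5× rate change to input timestamps 11–20.

```python
import math
from fractions import Fraction


def exact_output(t_in, rate_start=10, rate=5):
    # Exact (unrounded) output time for an input timestamp after rate_start,
    # where playback runs at `rate` times normal speed from rate_start onward.
    return Fraction(rate_start) + Fraction(t_in - rate_start, rate)


def video_decoder_remap(t_in):
    # Hypothetical vendor A: truncates toward the earlier timestamp.
    return math.floor(exact_output(t_in))


def audio_decoder_remap(t_in):
    # Hypothetical vendor B: rounds to the nearest timestamp (halves round up).
    return math.floor(exact_output(t_in) + Fraction(1, 2))


# Input timestamps for which the two decoders disagree, so that samples
# meant to be rendered together end up with different output timestamps.
mismatched = [
    t for t in range(11, 21)
    if video_decoder_remap(t) != audio_decoder_remap(t)
]
```

With these assumed rounding rules, the two decoders disagree for input timestamps 13, 14, 18 and 19, which is the kind of divergence that can cause audio to lag video or a sub-picture to be overlaid at the wrong time.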
Products utilizing the filter graph have been well received in the market, as they have opened the door to multimedia editing using otherwise standard computing systems. Yet, there continues to be a need to improve filter graph technology and further enhance the user experience, or at least not degrade it.
Accordingly, this invention arose out of concerns associated with providing improved methods and systems for synchronizing timestamped data streams and, in particular, timestamped data streams associated with filter graphs.