1. Field of the Invention
The invention relates to software that employs plugin components. More particularly, it relates to such software used to manage, modify, and reproduce media resource data from files, streaming sources, and other sources.
2. Background
A digital audio player software application manages audio data from various sources, such as serial, file, and streaming data via networks, disk storage, optical storage, etc., compressed (or uncompressed) according to various protocols. The application also outputs such data to various output devices, such as visual output software or hardware, hardware digital-to-analog converters (DACs), files, broadcast systems, etc. To provide flexibility, the application is designed to accept plugin software components to provide for feature upgrades. For example, a plugin might be added to permit audio data to be visualized during simultaneous playback in a way that is similar to a visual light show that might be employed at a dance hall or concert. The visual output might be applied to a projector, a computer screen, or an external light array. Plugins, in addition to permitting output to varying devices, may also allow modification of sound files, such as conversion to alternative file formats, resampling, sound effects and distortions, etc. Such modifications, of course, may be inherent in certain types of output conversions as well.
The invention may be used in the environment of digital audio players. Currently, media players employ plugin technology that allows features to be upgraded using modular components that can selectively replace existing components or add new features to an existing program. Plugin technology is usually associated with the Internet; for example, it is used to provide web browsers with new capabilities, such as reproducing media files, viewing 3-D interactive media, playing sound and video files, performing calculations, etc.
One example of a program that uses plugin technology is a streaming media player that runs on PCs. Features may be added by installing new plugin components. For example, the ability to view animations that respond to music could be added to a streaming media player application that reproduces sound from compressed audio files. Another way that such an application could be enhanced is by adding a plugin that is capable of reading a new source file format.
Referring to FIG. 1, an example architecture for a streaming or file media player is Winamp by Nullsoft, Inc. The system hosts plugins for audio decoding 120 which may be any of various different formats such as MP3 120. In this architecture, the system accepts a request from a user interface for a particular file and based upon the extension (e.g., WAV, MP3, etc.) it passes it to a plugin that has been registered to handle that format. The plugin then reads the file from the resource and decodes it to generate the output audio (e.g., pulse code modulated) stream to an output device 130. This arrangement requires that the system designer know in advance the capabilities of each plugin. The structure is monolithic and therefore does not allow reading, file format handling, and decoding functions to be upgraded without a new monolithic plugin.
Referring to FIG. 2, a more flexible architecture is exemplified by DirectShow by Microsoft. In this architecture, the host system selects a file reader plugin based on the requirements of the resource selected by the user. This is done dynamically through a file selection user-interface (UI) that permits the user to select a file to be played. The selection is based on the match between the selected resource and the available reader plugins. Two examples are shown, an Internet file reader 200 for http files and a local file reader 205 for local (e.g., disk) files. The output of the selected file reader (the local file reader 205, in this case) is connected to the input of an appropriate decoder. Here again, one of a variety of plugins may be available, such as for DIS audio format files 215, for MP3 files, and for WAV files. The output of the selected decoder is then applied to the output device.
The selection of a decoder in the above system occurs as follows. A convention is established whereby each decoder provides to the system a bitmask that the system then uses to test the data stream. A predefined amount of data is applied to the bitmask and if the result is a particular predefined sequence (e.g., all "1"s or all "0"s), the plugin is accepted to decode the data. As a result, if there are any errors in the file or substantial misalignment between the data stream and bitmask, the test will fail.
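The bitmask convention described above can be illustrated with a minimal sketch. The function name, the two-byte mask, and the expected pattern are hypothetical and chosen only for illustration; they are not taken from any actual host system. The sketch shows why a stream that is offset by even one byte fails the test.

```python
def bitmask_accepts(data: bytes, mask: bytes, expected: bytes) -> bool:
    """Apply the mask to the leading bytes of data and compare with expected.

    Hypothetical helper: the host ANDs a predefined amount of data with the
    decoder's bitmask and accepts the decoder only on an exact match.
    """
    if len(data) < len(mask):
        return False
    masked = bytes(d & m for d, m in zip(data, mask))
    return masked == expected


# Assumed example mask: every masked bit must be 1 (an "all 1s" result).
MASK = bytes([0xFF, 0xE0])
EXPECTED = bytes([0xFF, 0xE0])

aligned = bytes([0xFF, 0xFB, 0x90, 0x00])  # pattern at the expected location
shifted = bytes([0x00]) + aligned          # same data, misaligned by one byte

print(bitmask_accepts(aligned, MASK, EXPECTED))  # accepted
print(bitmask_accepts(shifted, MASK, EXPECTED))  # rejected: misalignment
```

Because the host performs the comparison at a fixed location, the decoder has no opportunity to look further into the stream.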
Note that in both of the above architectures, there may be more than one output in the resulting stream. For example, a decoder plugin could produce audio and video streams.
The architecture of FIG. 2 overcomes the inflexibility of that of FIG. 1, but is incapable of handling files packaged in any form other than one that can be decoded directly into an audio stream by a monolithic decoder plugin. Such file formats may even have been used to embed multiple files in a single wrapper such as a ZIP file, and such an architecture has no mechanism for dealing with such cases. Also, the host system must make decisions as to which inputs and outputs to connect. A plugin may offer different features from those the host system has been programmed to recognize, such as an ability to handle an unknown file format, the ability to correct errors in certain types of files, the ability to generate outputs that the host does not recognize, etc. This limits the ability of the host to take advantage of a plugin's capabilities without changes to the host system, and its ability to manage media files that are encrypted or embedded in non-media-file formatting.
Another issue that arises in connection with plugin architectures is security. For example, a user may be authorized to play a file, but not authorized to reproduce it. Or the user may be allowed to reproduce it, but only in a certain format. These rights may arise in connection with a prior payment or simply by virtue of the media type. In prior art systems, security is administered by the host system. This requires the host to recognize when a security situation exists with a plugin that is going to be used and to respond to it. This also limits the variations on the options available when new plugins offer new ranges of features.
In any plugin architecture, it is necessary to authenticate components. This kind of security is intended to ensure that unauthorized things, such as the introduction of viruses, do not happen. The technology for authenticating plugin components is mature and outside the scope of this document. In the area of media reproduction, modification, and playback, there is an entire realm of uses that media authors and vendors would permit if plugin components were restricted to interacting in only predefined ways. For example, a user could pay a very low price for an audio file if the seller were assured that the file could only be played once. A consumer could have a tremendous library of music instantly at his disposal, paying only for the use of the volumes in the library. This kind of scheme might be readily implementable in a native application. But in an application that admits all manner of plugin components, it presents a formidable problem. How does the native application ensure that plugin components will not behave in an unauthorized way? One way is to fall back on a gate-keeping function of the host system, which gave rise to the authentication systems that are in use today. The user or the application either permits a plugin to operate in the application or not. That is, the developer provides mechanics in the player that will either accept or reject a plugin based on the identity and authentication of the plugin. A common example of this is web browser plugins. Often users are queried as to whether the plugin should be accepted or not. The issue of whether the user has rights to distort, reproduce, play, transmit, etc. media files (images, sound, movies, pictures, etc.) is a function of what the developer has provided natively in the application and what plugins are permitted to connect with the application. Whether a plugin has a role in the system to distort sound, convert sound to images, modify images, reproduce images, add sound to video, etc. is a function of the native and plugin elements making up the application. What plugins are resident is a result of the gate-keeping function, and that is about the extent of how such issues are managed.
To allow a user to select various resources, the simple model of FIG. 1 must be augmented. The user might want resource files located on Internet sites, the user's computer drive, etc. The player should thus have some kind of selection device to allow the user to see the selections available and indicate the file to be played. To provide this flexibility, the system may provide different readers, each for a different type of source device (e.g., Internet file, disk file, etc.). In addition to audio, each file may contain text and other kinds of data. For example, a typical MP3 file will contain the title and other data associated with the audio that is packaged in it. The files may also contain multiple resources. All of these variations (multiple resource files, additional non-audio information, etc.) depend on the file format. While the source device may be independent of the file format, the compression protocol is usually tied to the file format. Most file formats are linked with a particular encoding format, for example Sound Interface Design (SID) files. In such cases, the special file handling required for extracting multiple resources from a combined file would, as a matter of normal design choice, be handled by the decoder specific to the format of the chosen file. The configuration of FIG. 2 addresses these issues to some extent. However, the invention involves some fundamental departures. These are described below.
A flexible plugin architecture provides some over-arching features that permit a system to upgrade its capabilities without being limited by the host system's familiarity with them. The first component is that plugins are programmed to provide a parameter in response to a role proposed for the plugin to play in the host system. The parameter reflects the plugin's competence in handling the role. For example, a file reader can return a parameter that reflects its ability to handle a particular file format.
In the preferred embodiment of the invention, the parameter has two prongs: (1) an accuracy rating that results from testing the plugin on the task and (2) a figure of merit that is permanently associated with the plugin and indicative of the plugin's quality of performance irrespective of the particular task. Metaphorically, this arrangement permits the host system to ask for volunteers from among the various plugins. The figure of merit may be assigned to each plugin and obtained simply by querying the plugin, or stored by the host when the plugin is registered with the host. The accuracy rating may be generated by testing the plugin. The host system, when it has an output to connect, such as a path to a file or an unknown data type emanating from a plugin, may apply the data to the plugin and observe the response. For example, several leading bytes of a data stream may be passed to the plugin, which the plugin may use to determine its ability to deal with the format. In a preferred embodiment, each candidate plugin is queried as to what it needs to evaluate the data source. For example, one plugin may request the first eight bytes and another, simply the first two bits of the data stream. The host system may supply the requested data and the plugin, in response, will generate the accuracy rating by testing the data stream in an appropriate fashion, such as by applying a digital mask or attempting to "play" the file until an error-free output is identified.
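The volunteer process might be sketched as follows. This is only an illustration under stated assumptions: the class and function names are hypothetical, ratings are assumed to lie between 0 and 1, and the rule for combining the two prongs (accuracy rating first, figure of merit as a tie-breaker) is an assumption, since the text does not prescribe how the host weighs them.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence


@dataclass
class Plugin:
    name: str
    figure_of_merit: float          # fixed quality score tied to the plugin
    bytes_needed: int               # leading data the plugin asks the host for
    rate: Callable[[bytes], float]  # accuracy rating from testing the sample


def select_volunteer(plugins: Sequence[Plugin], stream: bytes) -> Optional[Plugin]:
    """Ask every plugin to rate the stream and return the best volunteer."""
    best, best_key = None, None
    for plugin in plugins:
        sample = stream[:plugin.bytes_needed]  # host supplies what was requested
        accuracy = plugin.rate(sample)
        if accuracy <= 0:
            continue                           # plugin declines to volunteer
        key = (accuracy, plugin.figure_of_merit)  # assumed ordering of prongs
        if best_key is None or key > best_key:
            best, best_key = plugin, key
    return best


def looks_like_wav(sample: bytes) -> float:
    # WAV files begin with the ASCII tag "RIFF".
    return 1.0 if sample[:4] == b"RIFF" else 0.0


basic = Plugin("basic-wav", 0.5, 4, looks_like_wav)
premium = Plugin("premium-wav", 0.9, 4, looks_like_wav)
chosen = select_volunteer([basic, premium], b"RIFF\x24\x00\x00\x00WAVE")
print(chosen.name)
```

Both hypothetical plugins recognize the stream equally well, so under the assumed ordering the figure of merit decides between them.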
One of the reasons the above volunteer-system is superior to the embodiment of FIG. 2 is that a file format handler or decoder plugin can accept any amount of data its programming permits it to inspect. For example, an MP3 decoder could attempt to locate a start header that is far from the expected location at the beginning of the file. If the architecture of FIG. 2 were used, the file would be disaffirmed by the plugin because it failed to match the specified bitmask. Also, plugins may upgrade their own robustness in connection with handling files with unusual formatting such as included data, bad data, offsets, etc. Here the plugin tells the system what it needs to evaluate the file and then tells the system whether it can handle the file or not.
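As a hedged illustration of this point, a plugin might scan an arbitrary amount of data for a frame synchronization pattern instead of testing a fixed location. The function name is hypothetical. The sketch relies only on the fact that an MPEG audio frame header begins with eleven set bits; a real decoder would validate the remaining header fields before accepting a candidate.

```python
def find_frame_sync(data: bytes, start: int = 0) -> int:
    """Return the offset of the first candidate MPEG audio frame sync, or -1.

    A candidate is a 0xFF byte followed by a byte whose top three bits are set
    (eleven consecutive 1 bits). This is a sketch, not a full header check.
    """
    for i in range(start, len(data) - 1):
        if data[i] == 0xFF and (data[i + 1] & 0xE0) == 0xE0:
            return i
    return -1


# Seven bytes of leading junk, then a plausible frame header.
stream = b"ID3junk" + bytes([0xFF, 0xFB, 0x90, 0x00])
offset = find_frame_sync(stream)
print(offset)
```

A fixed bitmask applied at offset zero would reject this stream, whereas the plugin can report that it is able to handle it.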
Another aspect of the flexibility of the system is an inherent structure that separates the role of file format handler plugins from the decoders. This allows the unpacking of compressed, bundled, encrypted, or otherwise formatted files or groups of files. By disconnecting the function of handling a file format from the step of decoding, resources embedded in various file formats can be managed by the flexible system. This flexibility may be exploited by further building into the host system a recursive structure permitting multiple layers of formatting to be managed.
There are two levels of recursion that may be used. In the first, the format handler function is permitted to be performed iteratively. This makes it possible to unpack a file that has more than one layer of file formatting. In this kind of host system, a plugin volunteers to handle a raw data stream generated by a file reader. The system generates an instance of the volunteered file format handler and identifies any outputs existing in the instantiation of the format handler or generated by the instantiation (depending on whether the plugin was requested to generate an output or not). If an output is generated and required to be tied further to another format handler instantiation, that is, if the output data is raw data again rather than media data, another plugin is requested to volunteer, a further instance of a format handler plugin is generated, and the output is connected to it. The system then takes stock of all the outputs generated (one plugin can have more than one output) and, in response to the command, identifies an appropriate input for each of these outputs, either by finding a file handler plugin if it is raw data or a decoder if it is media data.
A second level of recursion places the decoders and file format handler plugins at the same point in the recursion loop. This permits an output of a decoder to be looped back and tied to a further instance of a format handler or a further instance of a decoder. For example, audio data might be sent to a downsampler before being written to a file, or to a visual display for a sound-activated light show, or to a signal analyzer, etc.
The recursive architecture may be generalized as follows. A plugin from a pool of plugin types is selected using a volunteer process whereby a plugin requests data from an output of an upstream process. The system supplies the requested data and a quality rating is generated by the plugin. The system uses the quality rating to select a plugin: the volunteer. An instance of the volunteer is generated. Any downstream outputs generated after applying the upstream process (i.e., the plugin possesses, or generates in response to the data and system parameters, a given number of outputs) are placed in a list of outputs to be connected. The generation of each output is an event that triggers the volunteer process for associating a plugin input with an upstream output. The system does this iteratively, or recursively, until all outputs have been tied. If an output is not to be used, it may be tied to a null plugin or host process.
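The generalized loop can be sketched with a work list of unconnected outputs. Everything here is hypothetical and deliberately simplified: the two toy plugins, the "WRAP" and "TONE" prefixes, and the rule that an unclaimed output is tied to a null sink are assumptions made only to show the control flow.

```python
from collections import deque


class Unwrapper:
    """Toy format handler: strips one layer of a made-up b'WRAP' wrapper."""
    name = "unwrapper"

    def rate(self, data: bytes) -> float:
        return 1.0 if data.startswith(b"WRAP") else 0.0

    def process(self, data: bytes) -> list:
        return [data[4:]]  # one downstream output: the unwrapped payload


class ToneDecoder:
    """Toy decoder: accepts made-up b'TONE' media data and ends the chain."""
    name = "tone-decoder"

    def rate(self, data: bytes) -> float:
        return 1.0 if data.startswith(b"TONE") else 0.0

    def process(self, data: bytes) -> list:
        return []  # terminal here; a real decoder would feed an output device


def connect_all(source: bytes, plugins: list) -> list:
    """Tie every pending output to a volunteer until none remain."""
    pending = deque([source])  # list of outputs still to be connected
    chain = []
    while pending:
        data = pending.popleft()
        rated = [(p.rate(data), p) for p in plugins]
        score, volunteer = max(rated, key=lambda r: r[0], default=(0.0, None))
        if score <= 0:
            chain.append("null")  # no volunteer: tie output to a null sink
            continue
        chain.append(volunteer.name)
        pending.extend(volunteer.process(data))  # new outputs trigger new rounds
    return chain


print(connect_all(b"WRAPWRAPTONE-data", [Unwrapper(), ToneDecoder()]))
```

Because format handlers and decoders sit in the same pool, the same loop covers both levels of recursion described above.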
Note that the flexible architecture approach, as will be recognized by persons skilled in the art, may be used in a variety of software systems aside from media management applications. For example, file conversion utilities, Internet browsers, signal analysis instruments, business application software, and any application or operating system program that uses plugins or interchangeable plugin-like components can make use of features of the invention.
In a plugin type architecture, as discussed above, it may be advantageous for the host system to query each handler/decoder to identify one that will "volunteer" to process the file. When a multiple-resource file is selected by the user in the first instance, e.g., a DIS file, the handler must invoke some operation to obtain a selection from the multiple-resource set before decoding the data in real time. This architecture has some shortcomings from both the user interface standpoint and from a software efficiency standpoint, especially in the context of an architecture that permits multiple plugins. First, the user cannot view all resources transparently. To address this need, a separate operation is provided that examines the contents of files and of data sources such as data ports or Internet connections, and generates a data file containing a list of resources. In the process of examining the content of each file or set of files, the system will employ the recursive architecture described above and can generate a map that indicates the path to each resource. This map may be stored along with the resource so that when the user selects a resource through a user-interface, the chain of plugins required to unpack and decode the resource can be generated (connected end-to-end) as quickly as possible. Alternatively, the system can search for the resource as it unpacks the file or set of files. In such a case it may correlate the resource only with a particular file from a set, or even simply store the title of the resource. Note that we have assumed the resource contains an identifier that can be displayed through a user interface. This is not always the case, and the system could generate a unique tag for a resource that does not contain an identifier or generate one from a portion of its contents (e.g., borrow the first line of a text file and use it as a name).
In the resource-selection UI, resources are correlated with appropriate files. The robust recursive architecture described above handles multiple-resource files that may be packed under layers of encryption, compression, etc. The architecture assumes that this is the norm rather than the exception and that any number of layers of file compression or packaging might exist. Prior art approaches may provide a solution to handle such bundled resources, but their structures are more monolithic and provide little flexibility for upgrade of features.
The user interface is built around a resource-based selection paradigm rather than a file-based selection paradigm. The user selects, not a file, but a specific resource. The resource does not have to be resident on the local system and can be merely a pointer to a resource. If the user selects a file to add to the user's resource list, the system examines the file and updates a single resource list from which the user can then select the desired resource.
The resource selection UI contains a list of all resources. The files in which they reside may be available to the user, but are not the primary basis for selecting a resource. The user selects a resource to be played and the system invokes the appropriate data source, channels it through the appropriate reader and into the appropriate decoder for playing. When a new file is identified to the system, the file is preferably examined by all resident file handlers to determine the type of file it contains, using the volunteer-system recursive architecture described above. The process is iterative and results in each final output being mapped to an identifier and a data structure that indicates how to get to the resource (i.e., through which plugins' inputs and outputs, which files at each level of unpacking, and in what order). A playlist generated by the resource selection UI also stores the secondary data relating to the resources, including the structure that indicates how to unpack and decode the resource.
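One possible shape for such a playlist entry is sketched below. The type names, field names, plugin names, and file names are all hypothetical; the sketch only shows an ordered unpacking path stored alongside a display title, together with the fallback of borrowing a name from the resource's first line when no identifier is present.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Step:
    plugin: str  # plugin that handles this layer (hypothetical name)
    output: str  # which output or embedded member of that plugin to follow


@dataclass
class ResourceEntry:
    title: str        # identifier shown in the resource selection UI
    source: str       # where the outermost file lives (path or URL)
    path: List[Step]  # ordered chain needed to unpack and decode the resource


def make_title(identifier: Optional[str], content: bytes, index: int) -> str:
    """Use the embedded identifier, else the first line, else a generated tag."""
    if identifier:
        return identifier
    lines = content.splitlines()
    first_line = lines[0].decode("ascii", "replace").strip() if lines else ""
    return first_line or f"resource-{index}"


entry = ResourceEntry(
    title=make_title(None, b"Opening theme\nmore data", 1),
    source="music/archive.arj",
    path=[
        Step("local-file-reader", "raw"),
        Step("arj-handler", "tunes.zip"),
        Step("zip-handler", "theme.mp3"),
        Step("mp3-decoder", "audio"),
    ],
)
print(entry.title, [s.plugin for s in entry.path])
```

Storing the path lets the host rebuild the plugin chain on selection without repeating the volunteer process for every layer.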
The architecture is much more friendly to plugin systems because the decoder function is handled separately from the file handling. By providing the iterative framework, multiple embedded formats can be processed. The Playlist may store all information to process and read the file without permanently expanding or separating the embedded file. So, a file found to contain multiple DIS files and ZIP files all embedded in a single ARJ file could be handled without permanently unpacking either the outer layer or any of the inner layers. (Of course, these may be temporarily unpacked for the purpose of unpacking the desired resource and this is a matter of various design considerations peculiar to the particular application.)
Such a plugin architecture inevitably raises security issues because of the risk of malicious software components that might damage users' systems by introducing viruses, or that might be used to make unauthorized conversions of data that violate the copyrights attaching to that audio data. In addition, a flexible plugin architecture that provides many layers in the conversion process, between source and output, in turn provides numerous taps for unauthorized use of the data.
In a highly flexible plugin architecture such as described above, because it permits so many possible relationships between plugin components, the issue of security is particularly complex. A solution to this is an alternative approach to plugin security. Suppose that plugins and the native application could create independent bilateral or multilateral contracts between themselves without relying on the host system to administer them. Add to that the feature of permitting outside media suppliers to negotiate terms in the same kinds of contracts. The host application may generate a list of terms that could be incorporated in such contracts, and the authentication system may be used to ensure that the right "parties" are interacting according to the terms of the contract. Or the host system could simply allow the plugins to indicate whether they will accept an input from a particular source or will permit output to a particular sink. The invention includes an implementation of this basic idea.
The following is an example of a plugin-administered security contract. A user of an audio player wants to install a plugin that generates a visual image that can play in synch with audio files. The application permits the plugin to be installed. The plugin must then negotiate with the application on a resource-by-resource basis to determine if a given resource stream may be applied to the plugin. In the instant case, suppose that the resource is a stream containing musical data. The musical data file itself may contain an identifier. Through a contract with the application developer, the audio file owner can mark (e.g., watermark) his file with authentication data that indicates to the application a unique identity for the source or class of the file. This authentication data tells the application that this file is to be accorded a certain class of treatment. The class of treatment may be specified by a list of allowed processes to which the file's owner has agreed to permit its contents to be subjected. The plugin, in turn, has authentication data and an indicator of the kind of data that it accepts. So, in this example, where the plugin generates a synchronous video display, the plugin might request as its input 8-bit lossy data, which can be crude music data or a downsampled version of a higher bandwidth music file. Such data would not necessarily sound appealing, but it can be used to drive a sophisticated synchronized video display to good effect. The above data now establishes the information necessary for the application to make a decision as to whether to permit the resource data to be applied to the plugin. The class of treatment may specify that there are no restrictions on the application of a downsampled version of the data file. (A reason for having such a loose provision with regard to lossy data is that the media owner's concern is with broadcast, reproduction, or other use of its material, and these concerns are not as significant when the owner can be assured that the full file will not be accessible to the plugin.) The application can then make the decision to go ahead and apply the lossy data to the plugin. An alternative implementation of the same idea is to permit the contract not between the resource owner (via the resource file) and the video plugin "consumer," but between the resource owner and a plugin channel that takes in full music file data and generates 8-bit lossy data for output. In this case, the latter plugin downsampler would have to authenticate itself to ensure that the media owner can regard it as trustworthy. Of course, the contract between the media owner and the downsampler plugin in this example is enforced by the application. It is also assumed that the application, aside from being the enforcement-level arbiter, would also provide the framework for these contracts because the application must understand the possible array of terms that it must enforce.
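The decision the application makes in this example might be sketched as a simple check of the plugin's declared input against the resource's class of treatment. The type names, field names, and data-kind labels are hypothetical, and authentication itself is reduced to a boolean flag because the mechanics of authentication are outside the scope of this description.

```python
from dataclasses import dataclass
from typing import FrozenSet


@dataclass(frozen=True)
class ResourceContract:
    owner_id: str                   # identity carried in the file's mark
    allowed_inputs: FrozenSet[str]  # data kinds the owner permits plugins to get


@dataclass(frozen=True)
class PluginProfile:
    name: str
    authenticated: bool   # outcome of a separate authentication step (assumed)
    requested_input: str  # the kind of data the plugin asks to receive


def may_connect(contract: ResourceContract, plugin: PluginProfile) -> bool:
    """Application-side enforcement: allow only authenticated, permitted uses."""
    return plugin.authenticated and plugin.requested_input in contract.allowed_inputs


song = ResourceContract("label-123", frozenset({"8bit-lossy"}))
visualizer = PluginProfile("light-show", True, "8bit-lossy")
file_writer = PluginProfile("file-writer", True, "full-quality")

print(may_connect(song, visualizer))   # lossy data is unrestricted here
print(may_connect(song, file_writer))  # full data is not permitted
```

The application remains the enforcer, but the terms it enforces come from the resource and the plugin rather than from logic built into the host in advance.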
From a technical standpoint, all the pieces of the above interaction puzzle are well established. The contract terms can be embedded in a digital watermark or explicitly written into a header of the file. Plugin components can contain contracts as well. One plugin can agree to accept data from, or apply data to, only certain other plugin components, and such connections can be further dependent on the type of data to be exchanged.
For another application example, consider an audio player application containing (1) a plugin that saves data in a particular compression format, (2) a plugin that acts as a front end to a DAC, and (3) a plugin that pitch bends the real-time audio stream. Also consider that the user of the player has contingent access to (1) a video file, (2) a professional music file, and (3) an amateur music file. The video file might contain restrictions that it will not allow itself to be applied to the data compression plugin unless certain rights are purchased. Those rights could be written into the video file by the server used to deliver it through a secure transaction. The detection of the presence of those purchased rights (reproduction rights) would enable the contract enforcer, the application, to apply the data to the compression plugin. The copying privilege in this case is not universal. Application to the compression plugin is dependent on the authentication of the plugin. In this case, the media owner might approve a particular authentic plugin because it knows that the copy permission is not reproduced by the compression operation. Thus, the reproduced file does not contain the copy permission and the user cannot thereby extend copy privileges to others. In fact, in this case, he can only make one compressed copy. In another variation, the amateur music file may contain no contract restrictions. The video file may contain no restrictions with regard to application to the authorized authenticated pitch bender because the pitch bender is known to filter out essential synchronization information, rendering the video file unusable as a video data stream.
The professional music file owner may permit its data to be applied to the authorized authenticated DAC because it knows that it will be reproduced only ephemerally. The privilege may have been extended, as in a previous example, through a payment and a secure transaction through a network server. Many different combinations are possible.
The invention will be described in connection with certain preferred embodiments, with reference to the following illustrative figures so that it may be more fully understood. With reference to the figures, it is stressed that the particulars shown are by way of example and, for purposes of illustrative discussion of the preferred embodiments of the present invention only, and are presented in the cause of providing what is believed to be the most useful and readily understood description of the principles and conceptual aspects of the invention. In this regard, no attempt is made to show structural details of the invention in more detail than is necessary for a fundamental understanding of the invention, the description taken with the drawings making apparent to those skilled in the art how the several forms of the invention may be embodied in practice.