Since the early 1990s, as the processing power of computers has increased, so has their ability to produce and present high-quality visual and audio-visual content in the form of games, applications, animations, presentations and the like. With the Internet explosion of the late 1990s, the ability of computers to share this content over a network and present it online has become increasingly important. Yet the tools for authoring, viewing, publishing and sharing of this content have evolved from disparate sources and specialized needs, and bring with them individual legacies and collective incompatibilities. Furthermore, the evolution of these tools has failed to keep pace with the growing mass market of computer users. In particular, many of the authoring tools were initially designed for professional use, and attempts to “dumb them down” for non-professional users have led to mixed results.
In general, multimedia tools typically provide certain basic authoring and control features. These include capabilities for creating and/or importing media objects that contain media data (which can include text data, image data, video data, animated graphics data, sound data and other data representative of visual and/or audio information), editing media data, editing display settings that determine how media data is displayed, programming playback behavior that controls how media data is presented, organizing and structuring objects, and outputting finished pieces of multimedia content that can then be stored on Internet-enabled servers and linked together via the World Wide Web. All of these features can vary from system to system.
In general, the features of an authoring system are implemented according to the expected skill level of the user of the system. For example, an author may want to specify the circumstances under which a given media object starts and stops playing, its on-screen size and location, time-based or event-driven changes to its size and location, and various other playback behaviors appropriate to the object. The means by which this is done needs to be tailored to the expected skill level of the author. Therefore, the features of an authoring system tend to vary greatly depending on how sophisticated the author is assumed to be. Typically, some of the most significant variations among system features tend to occur in features which allow playback behavior to be specified by authors.
In systems where the author is assumed to be sophisticated, a great deal of power and control is generally provided for specifying playback behavior, but at a price in terms of difficulty and complexity. For example, a professional Web developer may need to create a cutting-edge, science-fiction-style user interface for a major entertainment website, in which extremely detailed control over the playback behavior of media objects is needed to provide the expected degree of activity and responsiveness. The prevailing method for achieving this level of control involves the use of scripting, exemplified by JavaScript used in conjunction with HTML in Web applications and by proprietary scripting languages built into many high-end authoring systems (such as Macromedia Flash® MX).
Scripting methods operate at their most basic level by associating a list of instructions, called a script, with a triggering event. The possible triggering events for scripts vary, and can include real-time input from the user interface, timers and clocks, trigger points within media object timelines, other scripts, etc. When the triggering event for a script occurs, a script playback engine, called an interpreter, reads the instructions in the script and performs the actions and procedures specified therein. In a multimedia system, these actions and procedures typically include not only the features customarily found in any programming environment, such as subroutines, conditional branches, general purpose variables, math operations, etc., but they also include facilities for controlling and manipulating the playback of media objects.
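By way of illustration only, the association between a triggering event and a script may be sketched as follows. The sketch is not taken from any particular authoring system; the names MediaObject, Interpreter, bind, fire, start and resize are hypothetical and are chosen solely for this example, and each instruction is modelled as a simple callable.

```python
from typing import Callable, Dict, List


class MediaObject:
    """Hypothetical media object with a few playback properties."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.playing = False
        self.size = (0, 0)


# A script is modelled as a list of instructions; each instruction is a callable.
Script = List[Callable[[], None]]


class Interpreter:
    """Hypothetical script playback engine: runs the script bound to an event."""

    def __init__(self) -> None:
        self._scripts: Dict[str, Script] = {}

    def bind(self, event: str, script: Script) -> None:
        # Associate a list of instructions with a triggering event.
        self._scripts[event] = script

    def fire(self, event: str) -> int:
        # Read the instructions in order and perform each one.
        script = self._scripts.get(event, [])
        for instruction in script:
            instruction()
        return len(script)


def start(obj: MediaObject) -> Callable[[], None]:
    def instruction() -> None:
        obj.playing = True
    return instruction


def resize(obj: MediaObject, width: int, height: int) -> Callable[[], None]:
    def instruction() -> None:
        obj.size = (width, height)
    return instruction


clip = MediaObject("intro_clip")
interpreter = Interpreter()
interpreter.bind("button_click", [start(clip), resize(clip, 320, 240)])

executed = interpreter.fire("button_click")  # triggering event occurs
ignored = interpreter.fire("timer_expired")  # no script bound to this event
```

Even in this small sketch, the resulting behavior of the media object can only be understood by reading the instructions step by step, which is the procedural character discussed in the paragraphs that follow.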
Regardless of the specific instructions supported by a particular scripting method, current methods of scripting are often unsatisfactory due to the procedural nature of scripting. Especially when dealing with highly interactive multimedia presentations with numerous possible states and paths, the step-by-step nature of the script tends to obscure the true structure of the content, and requires substantial mental visualization, i.e. sophistication, on the part of the author to be used effectively. As a result, current methods of scripting have significant drawbacks in multimedia authoring systems where ease of use for unsophisticated authors is needed.
One drawback is that the procedures used to create the scripts are abstract and fragmented, and do not correspond either directly or intuitively to the shape and structure of the resulting multimedia presentation. This problem becomes worse as the size and complexity of the multimedia presentation increases. To create a set of scripts that represents a highly interactive or expansive presentation, a large number of non-intuitive, interrelated procedures must be performed, often presenting a complex or frustrating burden to authors, particularly those who are inexperienced.
Another drawback is that current methods of scripting frequently result in rigid or too simplistic algorithms for determining the activity in a presentation. This tendency often leads to mechanical-looking results and misses the objective of creating appealing and entertaining multimedia content. For authors attempting to create multimedia content that is attractive or interesting to a viewing audience, this presents a burden in delivering appropriate results.
Therefore, current methods which use scripting to enable the author to specify the playback behavior of media objects are generally unable to provide the clarity, intuitiveness, and ease of use required to enable unsophisticated authors to produce high quality multimedia content.
In some authoring systems, a primary goal is ease of use for unsophisticated authors. For example, in a system which allows consumers to present a slide-show of family photographs on a personal website, only minimal control over the playback behavior of media objects is needed (i.e., selection of the order of the photographs), but the process of specifying that control must be simple enough to be easily understood by the average consumer. One method for enabling the author to specify the playback behavior of media objects where the manner of specifying is simple and easy to understand involves the use of templates.
Template methods in this field operate at their most basic level by associating individual media objects or sets of media objects with specified roles or events within a template. A template is a piece of multimedia content in which the playback behavior of the media objects has been programmed in advance, but in which some or all of the media objects have been left unspecified. Once the author has assigned one or more media objects to their desired roles or events within the template, the system incorporates the specified media objects into the template and outputs the combination as a finished piece of multimedia content.
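By way of illustration only, a template of this kind may be sketched as a pre-programmed playback order whose roles are left unfilled until the author assigns media objects to them. The class name Template, the role names and the file names below are hypothetical and serve only as an example under these assumptions.

```python
from typing import Dict, List


class Template:
    """Hypothetical template: playback order is fixed, media objects are not."""

    def __init__(self, roles_in_order: List[str]) -> None:
        # The playback behavior (here, just an order) is programmed in advance.
        self.roles_in_order = roles_in_order

    def render(self, assignments: Dict[str, str]) -> List[str]:
        # Every role must be filled before a finished piece can be output.
        missing = [role for role in self.roles_in_order if role not in assignments]
        if missing:
            raise ValueError(f"unassigned roles: {missing}")
        # Incorporate the author's media objects into the fixed structure.
        return [assignments[role] for role in self.roles_in_order]


slideshow = Template(["opening", "middle", "closing"])

# The author only chooses which photograph fills which role.
finished = slideshow.render(
    {"closing": "beach.jpg", "opening": "family.jpg", "middle": "dog.jpg"}
)
```

The author in this sketch can choose the photographs but cannot alter the order or add a role, which reflects the limitation discussed in the paragraphs that follow.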
Regardless of the specific methods used to associate media objects with roles or events within a template, current methods that use templates are frequently unsatisfactory due to the pre-programmed nature of the template, and because the author generally has no input into the structure of the template or the functionality it provides. Particularly with regard to multimedia presentations where individuality or uniqueness is desirable, an author's creative intentions are frequently unable to be realized using templates. As a result, current methods using templates have significant drawbacks where flexible or detailed control over the playback of media objects is desired.
One drawback with current methods using templates is that the procedures undertaken to develop templates cannot take into account each individual author's needs and tastes, but must rather make broad assumptions about authors in general. This limits the ability of templates to produce results which satisfy a variety of authors' intentions and preferences.
Another drawback is that current methods using templates typically result in media object behavior that is generic or overly simplistic. This tendency often leads to multimedia content which is “canned” looking or otherwise unsophisticated. For authors that wish to create unique or sophisticated content, this presents a burden in delivering appropriate results.
Therefore, current systems and methods using templates to enable the author to specify the playback behavior of media objects are generally unable to provide the flexibility and author-specific control needed for satisfactory authoring of custom multimedia content.
In some authoring systems, the passage of time is used as an organizing principle for controlling the playback behavior of media objects. This would include systems designed to allow a consumer to shoot several minutes of home video and edit it into a one minute “movie” (an example being Apple Computer's® iMovie®); this would also include systems designed to allow a professional graphic artist to create frame-based animations and animated graphics (an example being Macromedia Flash® MX). In general, such systems provide the ability to take a number of separate media elements, such as text items, images, video segments, sound effects, musical clips, etc., and organize them in time. One way to provide this ability is through the use of timelines.
Timeline methods in this field operate at their most basic level by providing means for the author to arrange media objects along a line which is divided into frames, each frame representing a discrete, constant unit of time passing. The “frame rate” of a timeline specifies the length of this constant unit; for example, a frame rate of ten frames per second means that each unit is a tenth of a second long. The timeline may further be divided into independent data paths or “layers” which allow the author to develop the behavior of several media objects in parallel. For example, one layer could be used for a background object, and separate layers might be used for each of several foreground objects. Additional features may be provided for organizing timelines into interconnected “scenes”, creating looping timelines, nesting timelines within timelines, and exporting the results as completed “movies.” Typically, finished movies are played back using a player application or Web browser plug-in. The application or plug-in interprets the movie data on a frame-by-frame basis and displays it to the end user.
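By way of illustration only, the relationship between frame rate, frames and layers may be sketched as follows. The names Span, Timeline, frame_at and visible are hypothetical, and the sketch assumes that each layer shows at most one media object in any given frame.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Span:
    """A media object placed on a layer for a range of frames (inclusive)."""
    media: str
    first_frame: int
    last_frame: int


class Timeline:
    """Hypothetical frame-based timeline with independent layers."""

    def __init__(self, frame_rate: int) -> None:
        self.frame_rate = frame_rate  # frames per second
        self.layers: Dict[str, List[Span]] = {}

    def frame_duration(self) -> float:
        # Each frame is a constant unit of time: 1 / frame rate seconds.
        return 1.0 / self.frame_rate

    def add(self, layer: str, span: Span) -> None:
        self.layers.setdefault(layer, []).append(span)

    def frame_at(self, seconds: float) -> int:
        # Map elapsed playback time to a frame index.
        return int(seconds * self.frame_rate)

    def visible(self, frame: int) -> Dict[str, str]:
        # A player would evaluate this once per frame, layer by layer.
        shown: Dict[str, str] = {}
        for layer, spans in self.layers.items():
            for span in spans:
                if span.first_frame <= frame <= span.last_frame:
                    shown[layer] = span.media
        return shown


timeline = Timeline(frame_rate=10)
timeline.add("background", Span("sky.png", 0, 29))
timeline.add("foreground", Span("title.png", 10, 19))

frame = timeline.frame_at(1.5)         # 1.5 seconds into playback
at_1_5s = timeline.visible(frame)      # both layers are active
at_frame_25 = timeline.visible(25)     # only the background remains
```

At ten frames per second, each frame in this sketch lasts a tenth of a second, consistent with the frame rate example given above.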
Regardless of the specific methods used for defining playback behavior in time, current methods that use timelines are frequently unsatisfactory due to the inherently linear nature of representing object behavior along a timeline. In situations where the content being produced needs to be interactive for the end user, the behavior may involve many branching and converging paths, as well as conditions which may last for indefinite periods of time. Structures such as these are not easy to represent within the linear format of timelines. As a result, current methods using timelines have significant drawbacks when interactive or non-linear behavior for media objects is desired.
One drawback with current methods using timelines is that in order to achieve interactive playback, the timeline must be broken into segments, each of which represents part of the presentation. These segments, while still presented to the author as sequential and linear, may in fact be controlled by various jump, loop and branch commands, so that in fact they do not correspond either directly or intuitively to the shape and structure of the resulting multimedia presentation. This problem becomes worse as the level of interactivity of the presentation increases, often presenting a complex or frustrating burden to authors, particularly those who are inexperienced.
Another drawback is that timelines, even those which have been broken into individually controllable segments, typically produce multimedia content that has a repetitive or “looping” feel. This is because with timelines it is difficult to define several independently occurring processes, each with its own sense of time, that overlap and interact in unpredictable and varied ways. This limitation presents a substantial burden in creating multimedia content with enough variety and interest to merit extended or repeated viewings.
Some tools providers have attempted to address the problems of timeline methods by combining them with other methods, such as scripting and templates. In systems that combine timelines and scripting methods (such as Macromedia Flash® MX), the scripting has the same drawbacks as discussed above. In systems that combine timelines and template methods, the templates are typically canned sequences produced by professionals that can be laid out, interconnected and adjusted somewhat by the author. For example, the author may be able to specify some color schemes, fonts and custom text that appears in the animations. Examples of this approach are found at the website Moonfruit.com. But timeline templates present similar problems to the templates already discussed. They are inflexible, limit the creative input of the user and often result in a canned look.
Therefore, current systems and methods using timelines to enable the author to specify the playback behavior of media objects are generally unable to provide the flexibility and variety needed for satisfactory authoring of non-linear multimedia content.
The techniques described in the preceding discussion for enabling authors to specify playback behavior represent the most common techniques currently in use in the field. There is a broad variety of existing systems available to professional authors and consumers which employ these techniques, and these systems also include various capabilities for handling other aspects of content creation as mentioned above (such as creating and importing media objects, editing media data, editing display settings that determine how media data is displayed, organizing and structuring objects, outputting finished pieces of multimedia content, etc.). A representative overview of these systems is now provided.
The category of tools designed for use by professional authors includes the following systems. Macromedia Flash® MX and Adobe LiveMotion™ are professional authoring and animation tools based on timelines and used for creating animations and interactive objects for Web pages and other presentations. Macromedia Dreamweaver® is a professional Web authoring tool used for creating and updating websites with a variety of text, picture and animation objects. Adobe Photoshop® is a professional art and digital photo editing tool which allows users to create and edit digital images. Tribeworks iShell™ and Quark Systems mTropolis® are professional multimedia authoring tools used to make presentations for CD-ROMs, kiosks and other location-based applications.
The category of tools designed for use by consumers includes the following systems. Roxio (formerly MGI) Photosuite® is a digital photo editing tool used to acquire and touch up digital photos, add effects and text and save them as standard picture files. Logitech QuickCam® is a consumer tool (most recent Windows version is ImageStudio™) for capturing and editing digital photos and videos. Roxio (formerly MGI) Videowave™ is a consumer video editing tool used to acquire digital videos, edit them and add effects, and save the results as video files. Yahoo PageBuilder™ is a web-based consumer Web authoring tool which allows users to create Web pages online and add various animated effects. (Macromedia) Shockwave PhotoJam® allows users to create animated slide presentations using digital pictures and a number of built-in effects. Microsoft PowerPoint® is a slide presentation tool that allows the user to create sequential slide shows of text, pictures, movies and colored backgrounds. Moonfruit Sitemaker™ is a web-based consumer authoring tool (available online at www.moonfruit.com) used for creating and updating websites with a variety of text, picture and animation objects. SWiSHzone.com SWiSH™ is a consumer authoring tool (available online at www.swishzone.com) used for creating Flash animations.
In addition, the following systems, while not strictly focused on authoring, include features which relate to aspects of the present invention. Groove Networks Groove® is a peer-to-peer collaboration tool with basic visual authoring features allowing users to create and edit different types of documents collaboratively online. Jazz™ is a zooming user interface (ZUI) system under development at the University of Maryland, based on an earlier system, Pad++, developed at NYU and the University of New Mexico, which presents data objects as existing in a virtual space with infinite resolution that provides new ways to organize and view data. An application in the Jazz system is the CounterPoint™ plug-in for Microsoft PowerPoint 2000® which allows slides in a slide show presentation to be arranged on a two dimensional surface.
Regardless of which systems and methods are used during the authoring stage, the final stage of multimedia content creation typically involves outputting a finished piece of multimedia content and distributing it. This frequently involves exporting the content to a computer file which can then be stored on an Internet-enabled server and linked with other files via the World Wide Web. For example, someone using a web authoring tool to create a personal home page will typically output the page as a browser-readable HTML file and store it on a Web server, at which point the author's friends and family can view the page in their browsers and can put links to it in their own home pages.
This linking method provided by the World Wide Web allows separate pieces of multimedia content located on different computers and authored by different people to be connected together by including within one piece of multimedia content a reference to the Web address of another. A broad variety of authoring systems used to create multimedia content depend on this linking method of the World Wide Web to interconnect pieces of multimedia content stored on different computers.
The linking method provided by the World Wide Web operates at its most basic level by defining a browser-readable format for Web content called HTML (which may be extended by JavaScript and downloadable browser “plug-ins” that read and play custom formats). Authors can associate a triggering event in one piece of Web content, such as the user clicking on a particular line of text displayed in the browser, with the Web address of a second piece of Web content. When the triggering event occurs, the browser discards the current piece of Web content and loads and displays the second piece of Web content, effecting a page change. The address specified in the link identifies the desired Web content in terms of the Web server on which it is stored, using either the Internet Protocol (IP) address of the server or a registered domain name (such as www.amazon.com) that is converted into the IP address by a Domain Name System (DNS) server.
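By way of illustration only, the page change effected by a link may be sketched as follows. The dictionary WEB stands in for content stored on Web servers, and the names Page, Browser, trigger and server_of, as well as the addresses shown, are hypothetical. No network access or DNS lookup is performed; the sketch merely extracts the server portion of the address using the standard urlsplit function.

```python
from typing import Dict, NamedTuple, Optional
from urllib.parse import urlsplit


class Page(NamedTuple):
    content: str
    links: Dict[str, str]  # triggering event -> Web address of the destination


# Stand-in for content stored on Web servers (hypothetical data).
WEB: Dict[str, Page] = {
    "http://www.example.com/home.html": Page(
        "home page", {"click:photos": "http://www.example.com/photos.html"}
    ),
    "http://www.example.com/photos.html": Page("photo page", {}),
}


def server_of(address: str) -> Optional[str]:
    # The server is identified by a domain name or an IP address.
    return urlsplit(address).hostname


class Browser:
    def __init__(self, address: str) -> None:
        self.address = address
        self.page: Optional[Page] = WEB[address]

    def trigger(self, event: str) -> bool:
        target = self.page.links.get(event) if self.page else None
        if target is None:
            return False          # no link is associated with this event
        self.page = None          # discard the current piece of Web content
        self.page = WEB[target]   # loading begins only after the trigger
        self.address = target
        return True


browser = Browser("http://www.example.com/home.html")
changed = browser.trigger("click:photos")
host = server_of(browser.address)
ip_host = server_of("http://192.0.2.7/index.html")
```

In this sketch, nothing about the destination is fetched until the triggering event occurs, which corresponds to the discard-then-load sequence described above.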
Linked pages created and posted to Web servers are viewed in a browser, which is a playback program that allows people to traverse, or “navigate,” through the distributed content page by page. It should be noted that this user “navigation” through distributed content is different from user “exploration” of files on the hard drive, such as that provided by Microsoft® Windows®, which involves traversing through a hierarchical file system (HFS) typically presented using a desktop metaphor. With exploration, the current location is defined within the context of the hierarchy, and means are provided to explore up or down in the hierarchy by opening nested containers called folders and presenting their contents. With navigation, on the other hand, location is not defined within the context of a hierarchy. Rather, the content of a file or object being examined is displayed directly, and information is maintained about the file or object that was previously examined. A “back” function is provided to return to the previous file or object, and a “forward” function may be provided to return to the content from which the most recent “back” function was executed.
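By way of illustration only, the “back” and “forward” functions may be sketched using two stacks of previously examined content. The class name History and the file names are hypothetical, and the clearing of the forward stack on a new visit is an assumption reflecting conventional browser behavior rather than a requirement stated above.

```python
from typing import List


class History:
    """Hypothetical navigation history: no hierarchy, only prior locations."""

    def __init__(self, start: str) -> None:
        self.current = start
        self._back: List[str] = []
        self._forward: List[str] = []

    def visit(self, address: str) -> None:
        # Remember what was previously examined, then display the new content.
        self._back.append(self.current)
        self.current = address
        self._forward.clear()  # assumption: a new visit discards forward entries

    def back(self) -> str:
        if self._back:
            self._forward.append(self.current)
            self.current = self._back.pop()
        return self.current

    def forward(self) -> str:
        # Return to the content from which the most recent "back" was executed.
        if self._forward:
            self._back.append(self.current)
            self.current = self._forward.pop()
        return self.current


history = History("a.html")
history.visit("b.html")
history.visit("c.html")

after_back = history.back()
after_forward = history.forward()
```

The sketch keeps no record of where each item sits in any hierarchy; location is defined only by the sequence of content examined, in contrast to the exploration model described above.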
The method used by Web browsers to effect page changes is conceptually simplistic, involving first discarding the current page and then loading and displaying the next. This approach often results in problems with the quality of the “Web surfing” experience provided to users of the browser. Particularly when it comes to effective presentation of link destinations, and the smoothness of transitions between Web pages, the linking method of the World Wide Web and its implementation by Web browsers can have significant drawbacks.
The main problem with the linking method of the World Wide Web is that the procedures a Web browser uses to access data for a new page do not begin until the triggering event for the link has occurred. This means that after clicking on a link, the user must wait for the data of the new page to load before seeing the new page, which often takes a substantial amount of time. Furthermore, when using a browser, the user cannot preview the actual destination of the link prior to committing to the page transition, but can only evaluate the destination based on information provided in the current page, which is frequently inadequate. As a result, users frequently become frustrated after triggering links which are slow to load, irrelevant to the purposes at hand, or both. This presents a substantial burden in providing an efficient and pleasing “Web surfing” experience for the user.
Therefore, current authoring and playback systems which depend on the linking method of the World Wide Web to provide linking of content across different computers inherit the drawbacks of that method, which include the inability to effectively represent link destinations to users and the inability to implement page changes smoothly.
As discussed above, methods currently in use for authoring multimedia content and for linking and viewing distributed multimedia content present a variety of drawbacks to the user. In authoring, what is needed is a level of control like that provided by scripting, without the complexity, loss of perspective and mechanical results inherent in scripting methods. What is further needed is a simplicity and ease of use like that provided by templates, without the restrictions imposed by template methods. What is also needed is the ability to control time-based processes without the inflexibility imposed by timelines. In linking and viewing distributed content, what is needed is a manner of connecting and presenting distributed content that provides for effective display of navigation destinations to the user and allows for a smooth and pleasing navigation experience for the user.
Finally, when taken as a whole, current systems for authoring, distributing and viewing multimedia content present a collective burden. Each only addresses a specific part of a much larger picture, resulting in incompatibilities and inefficiencies as content is passed from one tool to the next along the authoring, distribution and viewing chain. Therefore, what is further needed is a unified approach to authoring, distribution and viewing that eliminates these inefficiencies and incompatibilities.