Hypermedia, a term derived from hypertext, extends the notion of the hypertext link to include links among any set of multimedia objects, including sound, motion video, and virtual reality. It also connotes a higher level of user/network interactivity than the interactivity already implicit in hypertext.
Hypermedia is currently realized as a set of computer-addressable files that contain pointers for linking to multimedia information, such as text, graphics, video, or audio. The use of hypertext links is known as navigating. An emerging and important class of media types is temporal multimedia objects, i.e. continuous or transient media such as motion pictures, audio, and morphing objects, which add a further information dimension: time.
These media types require certain computational, networking, and storage resources. MPEG (pronounced M-peg), which stands for Moving Picture Experts Group, is a family of standards used for coding audio-visual information in a digital compressed format.
MPEG-4 addresses coding of digital hybrids of natural and synthetic, aural and visual (A/V) information. The objective of this hybrid coding (SNHC) is to facilitate content-based manipulation, interoperability, and wider user access in the delivery of animated mixed media.
Trends in networking, in decentralization of media production and consumption, and in computer graphics point toward changes in distributing passive and interactive mixed media. Audio/video and 2D/3D synthetic graphics are merging into hybrid compositions in a variety of formats and platforms that extend the role of television and the PC. This evolution widely spans lower-bit-rate applications like video cellular telephony, and higher-bandwidth, networked, interactive, real-time media experiences like distance learning, gaming, and training.
There are two multimedia object categories: the temporal, transient ones with timing constraints and the atemporal ones, i.e. the persistent objects. When these objects refer to each other, i.e. are in relation, they are called hypermedia objects. The relations, as well as the objects and their properties, are denoted using a hypermedia description (language).
Emerging silicon and software systems are moving toward delivery of hybrid content for real-time experiences with a high level of integration of computing resources, algorithms, and data primitives to decode, animate, render, and composite scenes. A/V objects can exist as transient or stored data in channels and media such as the Internet, ATM/BISDN communications, CD-ROM, on-line modifiable disks that page active data, archival digital libraries, and the memories of servers, decoders, PCs, graphics accelerators, and newer media processors.
Various modeling schemas for spatial and temporal media content are embodied in current work such as VRML 2.0 (Virtual Reality Modeling Language), Java Media 2D/3D, and ActiveX Animation. MPEG-4 is concerned with coding of animated data, and thus with spatial-temporal relationships among A/V objects as represented in bit-streams. The requirements of MPEG-4 are so complex that bit-streams and the higher-level representations they encode are designed in isolation from the application environment.
Several other cross platform video and audio standards have been established e.g. JPEG (Joint Photographic Experts Group), and a number of different MPEG standards.
MHEG (Multimedia and Hypermedia Experts Group), on the other hand, is a multimedia presentation standard. It aims to provide a framework for multimedia applications, to define a digital final form for presentations that can be exchanged between different machines or platforms, and to provide extensibility.
MHEG defines the abstract syntax through which presentations can be structured. This is the definition of data structures and the fields in those data structures, through which two computers may communicate.
The MHEG model is object oriented, and defines a number of classes from which object instances are created when a presentation is designed. There are several classes, and these are used to describe the way video is displayed, audio is reproduced, and how the user can interact with the ongoing presentation. The relationships created between instances of these classes form the structure of the presentation. There are several different types of class in the MHEG model, e.g. content classes, behavior classes, action classes, link classes, user input classes etc.
The separation of underlying techniques (due to their complexity) leads to an unfortunate separation of media description in a multimedia hypermedia document.
As in the case of coding, several other cross-platform multimedia standards have been established, e.g. the well-known Hypertext Markup Language (HTML) or meta descriptions like Standard Generalized Markup Language (SGML) or Extensible Markup Language (XML).
Linked content can be in different formats: text, HTML, images, video or audio, slides and many others. Content standards depend mostly on the plug-ins running on the user's browser, ranging from images to complex media formats (mp3, wave, midi, Real Player).
Technically, the enrichment process does not affect the temporal media like video, since the link structure is described independently. At a conceptual level a hyper-video is the aggregation of a digital video and the linked informative structure. Technically it is realized by the original video decorated with (synchronized) links in a separate (enveloping) description.
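The separation between the unchanged video and its enveloping link description can be pictured with a small data structure. The following is only a minimal sketch: the class names `TimedLink` and `HyperVideo`, the half-open time interval, and the example URIs are illustrative assumptions and are not taken from any standard or from the description above.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class TimedLink:
    """A link that is active during [start, end) seconds of the video (assumed model)."""
    start: float
    end: float
    target: str  # URI of the referred hypermedia object (illustrative)


@dataclass(frozen=True)
class HyperVideo:
    """The untouched video plus a separate, enveloping link description."""
    video_uri: str
    links: Tuple[TimedLink, ...]

    def active_links(self, t: float) -> List[TimedLink]:
        """Return the links synchronized with playback position t."""
        return [link for link in self.links if link.start <= t < link.end]


# Hypothetical example: two overlapping links decorating one video.
hv = HyperVideo(
    video_uri="video/tour.mpg",
    links=(
        TimedLink(0.0, 10.0, "info/intro.html"),
        TimedLink(5.0, 20.0, "info/map.html"),
    ),
)
print([link.target for link in hv.active_links(7.0)])
```

Because the links live outside the video data, the video file itself never has to be re-encoded when links are added or removed.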
Two specifications are vying to be the baseline protocol for multimedia exchange. The first is commonly known as MHEG (mentioned above), the second is DVB-MHP (digital video broadcast multimedia home platform).
ISO defines a family of MHEG standards, from MHEG-1 to MHEG-7, that allow multimedia objects to be distributed in a client-server architecture across a variety of platforms. MHEG-5 is a streamlined, application-specific version of MHEG-1 that embeds an MHEG boot application in the MPEG-2 stream. The boot application is a self-contained interpreting media object.
The DVB-MHP spec inserts an abstraction layer between applications and digital TV terminals. This allows applications to be carried over any compliant network, be it cable, terrestrial, or satellite, to a wide range of terminal types.
A typical DVB-MHP software architecture comprises MHP applications, called Xlets, which are typically written in Java and compiled against the extensive range of Java classes defined in the MHP specification. The heart of the MHP is the application manager, which controls the full life cycle of Xlets, several of which can run concurrently.
A hypermedia communication system comprising a client computer, server computers for holding contents files, and a directory server computer for intensively managing information about the contents files is e.g. known from U.S. Pat. No. 5,884,301. These computers are connected via a network.
Current visual telecommunication applications provide streamed file exchange on demand, i.e. a server provides a set of more or less unlinked temporal media objects, e.g. using uniform resource identifiers. A client can request and retrieve, e.g., a streamed motion picture embedded in an environment that might decorate the motion picture with further uniform resource identifiers (URIs).
Such a realization of a visual telecommunication application is described in European Patent Application No. 0 828 368 A1.
The problem to be solved is that, for continuous temporal media objects like video within a hypermedia description, it is not possible to refer to, link to, embed, or relate to other hypermedia resources using the known techniques. This results in a morphological break, and temporal media cannot be treated as hypermedia.
This problem is targeted using a hypermedia description comprising expression means for a relation from a temporal hypermedia object to a referred hypermedia object.
The problem is solved by a method for a hypermedia communication system comprising the steps of
generating a hypermedia by presenting the hypermedia in a hypermedia description at a hypermedia server (e.g. based on a file or dynamically from external resources),
requesting the hypermedia at a hypermedia client,
deploying the hypermedia description from the server to said client, and
presenting the hypermedia by translating the hypermedia description,
where said hypermedia description comprises expression means for a reference from an atemporal hypermedia object to another hypermedia object, the hypermedia description further comprising expression means for a reference from a temporal hypermedia object to a hypermedia object.
This problem is solved, inter alia, by a hypermedia communication system comprising a hypermedia server and a hypermedia client,
the hypermedia client comprises transmission means for requesting and receiving a hypermedia object from the hypermedia server,
the hypermedia server comprises transmission means for providing on request a hypermedia object to the hypermedia client, and
the hypermedia client comprises presentation means for presenting said multimedia object,
the hypermedia object comprises a temporal hypermedia object in relation to a referred hypermedia object, the relation being a reference from a temporal hypermedia object to a referred hypermedia object, and
the hypermedia client comprises interpretation means and interaction means for interpreting the relation for controlling the presentation and the transmission means.
And the problem is solved by a hypermedia server comprising transmission means for providing on request a hypermedia object to a hypermedia client, the hypermedia object comprises a temporal hypermedia object in relation to a referred hypermedia object, the relation is a reference from a temporal hypermedia object to the referred hypermedia object, the hypermedia server comprising interpretation means for interpreting and resolving requests for the referred hypermedia object, retrieval means for retrieving the referred hypermedia object from a hypermedia server, and composition means for integrating or aggregating the referred hypermedia object into the hypermedia object.
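The interplay of the server-side retrieval, composition, and transmission means might be pictured as follows. This is a minimal sketch under stated assumptions: the in-memory `STORE`, the dictionary layout of an object, and the function names `retrieve`, `compose`, and `handle_request` are all hypothetical stand-ins, and a real server would retrieve referred objects over a network.

```python
from typing import Dict

# Hypothetical in-memory store standing in for one or more hypermedia servers.
STORE: Dict[str, dict] = {
    "clip": {"kind": "temporal", "refs": [{"at": 12.0, "target": "caption"}]},
    "caption": {"kind": "atemporal", "text": "Old town hall", "refs": []},
}


def retrieve(object_id: str) -> dict:
    """Retrieval means: fetch a referred object (here from the local store)."""
    return STORE[object_id]


def compose(object_id: str) -> dict:
    """Composition means: resolve each reference and aggregate the result.

    Note: this sketch assumes the references contain no cycles.
    """
    obj = retrieve(object_id)
    embedded = [
        {"at": ref["at"], "object": compose(ref["target"])}
        for ref in obj["refs"]
    ]
    composed = {key: value for key, value in obj.items() if key != "refs"}
    composed["id"] = object_id
    composed["embedded"] = embedded
    return composed


def handle_request(object_id: str) -> dict:
    """Transmission means: provide the composed object on request."""
    return compose(object_id)


result = handle_request("clip")
print(result["embedded"][0]["object"]["text"])
```

In this sketch the reference carried by the temporal object `clip` is resolved on the server, so the client receives one aggregated object instead of a bare pointer.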
The problem is solved correspondingly by a hypermedia client comprising transmission means for requesting and receiving a hypermedia object from a hypermedia server, and presentation means for presenting said multimedia object, the hypermedia object comprises a temporal hypermedia object in relation to a referred hypermedia object, the relation is a reference from the temporal hypermedia object to the referred hypermedia object, and the hypermedia client comprising interpretation means and interaction means for interpreting the relation for controlling the presentation and the transmission means.
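On the client side, interpreting a relation that originates in a temporal object could look like the sketch below. It is an illustration only: the class `HypermediaClient`, the `(time, target)` tuple layout, and the policy of following a relation automatically once playback reaches its time are assumptions. A user action could equally well trigger the request through the interaction means.

```python
from typing import Callable, Dict, List, Tuple

Relation = Tuple[float, str]  # (time in seconds, id of referred object); assumed layout


class HypermediaClient:
    def __init__(self, fetch: Callable[[str], str]) -> None:
        self.fetch = fetch              # transmission means (stand-in for a network request)
        self.presented: List[str] = []  # presentation means (a log stands in for rendering)

    def play(self, relations: List[Relation], ticks: List[float]) -> None:
        """Interpretation means: follow each relation once playback reaches its time."""
        pending = sorted(relations)
        for now in ticks:
            while pending and pending[0][0] <= now:
                _, target = pending.pop(0)
                self.presented.append(self.fetch(target))


# Hypothetical server content, reduced to a dictionary lookup.
server: Dict[str, str] = {"map": "<map of the old town>", "bio": "<speaker biography>"}
client = HypermediaClient(fetch=lambda object_id: server[object_id])
client.play(relations=[(8.0, "bio"), (3.0, "map")], ticks=[0.0, 5.0, 10.0])
print(client.presented)
```

The relation thus controls both what is requested (transmission) and when it appears (presentation).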
And the problem is solved by computer software products for authoring, realizing a hypermedia server, and realizing a hypermedia client.
In other words, in the hypermedia deployment process, server-side interaction initiated by a user action requires a back channel, i.e. a transfer protocol. It further requires a composite multimedia object for temporal and atemporal media and a transfer protocol therefor. The multimedia object relations provide enhanced inter-linking and networking interactivity.
The underlying idea is to reuse the hypertext media techniques, namely the document object model, hypertext markup language, hypertext transfer protocol, web servers, and web browsers, consistently for continuous transient temporal media like audio or video, in a transient continuous mode.
The underlying idea of the invention is an algebraic concept for describing (temporal) hypermedia. Algebraic hypermedia uses a set of basic operations on which to create a desired hypermedia (stream). The algebra consists of operations for temporally and spatially combining parts, and for attaching attributes to these parts. Parts of interest can be discovered with queries that describe desired attributes. Algebraic hypermedia permits hypermedia expressions to be nested in arbitrarily deep hierarchies. It also permits hypermedia parts to inherit attributes by context.
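Nesting, attribute inheritance by context, and attribute queries can be illustrated with a very small model. The sketch below is not the algebra of the invention: the class `Expr`, the helpers `walk` and `query`, and the attribute names are hypothetical, and inheritance is modeled simply as "inner attributes override outer ones".

```python
from dataclasses import dataclass, field
from typing import Dict, Iterator, List, Optional, Tuple


@dataclass
class Expr:
    """A hypermedia expression: a named part with attributes and nested parts."""
    name: str
    attrs: Dict[str, str] = field(default_factory=dict)
    parts: List["Expr"] = field(default_factory=list)


def walk(
    expr: Expr, context: Optional[Dict[str, str]] = None
) -> Iterator[Tuple[Expr, Dict[str, str]]]:
    """Yield every nested part with its own attributes merged over the context."""
    merged = {**(context or {}), **expr.attrs}
    yield expr, merged
    for part in expr.parts:
        yield from walk(part, merged)


def query(root: Expr, **wanted: str) -> List[str]:
    """Return names of parts whose effective attributes match all of wanted."""
    return [
        expr.name
        for expr, attrs in walk(root)
        if all(attrs.get(key) == value for key, value in wanted.items())
    ]


# Hypothetical nested expression: a programme containing scenes and shots.
news = Expr("news", {"language": "en"}, [
    Expr("scene1", {"topic": "weather"}, [
        Expr("shot1"),
        Expr("shot2", {"language": "de"}),
    ]),
    Expr("scene2", {"topic": "sports"}),
])
print(query(news, topic="weather"))
```

Here `shot1` carries no attributes of its own, yet it is found by the query because it inherits `topic` from the enclosing scene.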
As digital video becomes ubiquitous and as more video sources become available, applications will need to deal with digital video as a new data type. However, the nature of video information, or in general of temporal media, is different from existing media types such as text, since video has both temporal and spatial dimensions. Moreover, the volume and unstructured format of digital video data make it difficult to manage, access and compose video segments into hypermedia documents.
Many existing digital video abstractions rely on the traditional view of video as a linear temporal medium. They do not take full advantage of either the logical structure of the video or of hierarchical relationships between video segments. Moreover, access based on the structure and its hierarchy is not supported.
An algebraic hypermedia data model makes it possible to
introduce nested hypermedia structures such as shot or scene,
express temporal and spatial compositions of parts,
define output characteristics,
associate content information with logical parts,
provide multiple coexisting views and annotations of the same information,
provide associative access based on the content, structure and temporal information,
specify coordinated multi-stream viewing, and
specify referential relations like hyper links or embeddings.
The algebraic hypermedia model consists of (hierarchical) compositions of hypermedia expressions with semantic descriptions. The hypermedia expressions are constructed using algebra operations. The hypermedia algebra is a means for combining and expressing temporal or spatial relations, for defining the output characteristics of video expressions, and for associating descriptive information with these expressions. The algebraic abstraction provides an efficient means for organizing, accessing, and manipulating video data by assigning logical representations to the underlying video streams and their contents. The model also defines operations for access to the video information. The output characteristics of video expressions are media-independent, and thus the rendering can adjust to the available resources.
Users can search or navigate through video collections either with queries that describe desired attributes of hypermedia expressions or by exploring the hypermedia model via following relations (navigating). The result of such a query or exploration might be a set of video expressions that can be played back, reused, or even manipulated by a user or a presentation client.
In addition to content-based access, algebraic video allows browsing. The user can explore the structure of the video expressions to understand the surrounding organization and context. The algebraic hypermedia model allows users and presentation clients to compose concurrent video presentations by structuring parts and then describing the (temporal) relations between these segments. Hierarchical relations between the hypermedia expressions allow nested stratification. Overlapping segments can be used to provide multiple coexisting views and annotations of the same data and enable the user to assign multiple meanings to the same footage. Parts can be organized so that their relationships are preserved and can be exploited by the user. In addition to simple stratification, the algebraic hypermedia model preserves nested relationships between strata and allows the user to explore the context of a stratum.
The algebraic video data model might provide the fundamental functions required to deal with digital video: composition e.g. bundling (a sheaf) in the topological sense, reuse, organization, searching, and browsing. It models complex, nested logical structure of hypermedia using hypermedia algebra. The hypermedia algebra is a useful metaphor for expressing temporal inter-dependencies between video segments, as well as associating descriptions and output characteristics with video segments. The model allows associative access based on the content of the video, its logical structure and temporal composition.
The fundamental entity of the algebraic hypermedia model is a presentation. A presentation is a multi-window spatial, temporal, and content combination of hypermedia parts. Presentations are described by hypermedia expressions.
The hypermedia algebra operations might be classified into the following categories:
Creation: defines the construction of hypermedia expressions.
Composition: defines temporal and spatial relationships between component part expressions.
Output: defines layout and audio output for hypermedia expressions.
Description: associates content attributes with a hypermedia expression.
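One operation per category can be sketched to show how such expressions might be built up. The operation names `create`, `concat`, `parallel`, `window`, and `describe`, the duration rules (sum for sequential play, maximum for concurrent play), and the file names are assumptions made for illustration and are not defined by the text above.

```python
from dataclasses import dataclass, field, replace
from typing import Dict, List, Optional, Tuple


@dataclass
class Expr:
    op: str                     # "create", "concat", or "parallel"
    duration: float             # seconds
    parts: List["Expr"] = field(default_factory=list)
    window: Optional[Tuple[int, int, int, int]] = None  # x, y, width, height
    attrs: Dict[str, str] = field(default_factory=dict)


def create(source: str, start: float, end: float) -> Expr:
    """Creation: build an expression from a segment of raw footage."""
    return Expr("create", end - start, attrs={"source": source})


def concat(*parts: Expr) -> Expr:
    """Composition: play the parts one after another."""
    return Expr("concat", sum(p.duration for p in parts), list(parts))


def parallel(*parts: Expr) -> Expr:
    """Composition: play the parts at the same time."""
    return Expr("parallel", max(p.duration for p in parts), list(parts))


def window(expr: Expr, x: int, y: int, width: int, height: int) -> Expr:
    """Output: assign a screen region to an expression."""
    return replace(expr, window=(x, y, width, height))


def describe(expr: Expr, **attrs: str) -> Expr:
    """Description: attach content attributes to an expression."""
    return replace(expr, attrs={**expr.attrs, **attrs})


# Hypothetical presentation: two segments in sequence, with a ticker alongside.
intro = create("tape1.mpg", 0.0, 10.0)
interview = create("tape2.mpg", 30.0, 50.0)
ticker = window(create("ticker.mpg", 0.0, 25.0), 0, 400, 640, 80)
show = describe(parallel(concat(intro, interview), ticker), title="Evening news")
print(show.duration, show.attrs)
```

Because every operation returns an expression, the results can be nested freely, which is what allows arbitrarily deep hierarchies.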
The algebraic approach further allows hyper references to be expressed, enhancing normal media into hypermedia. The hypermedia algebra defines a document architecture with a consistent interface for different media types and a transition model (behavior) between multimedia objects. This model is founded on content-based links (hyper references) for atemporal media and on content- and time-based dynamic links for temporal media, with intrinsic support for content-based access.