Multimedia involves communicating information over a combination of different media, where the term media refers to the means by which the information is conveyed. Different types of media include, for example, audio, video, still images, animation, and text.
Computer-based multimedia applications are now commonplace. In the not too distant past, however, multimedia applications were relatively uncommon due to the quantity of data involved, the speed and storage capacity limitations of computers and computer-based telecommunication devices, and the bandwidth limitations associated with the network connections linking these devices. Today, nearly all personal computers have the capability to handle multimedia applications.
Recent advancements in computer and telecommunication technologies have led to the development and widespread use of new classes of computers and computer-based telecommunication devices, and in particular, smaller, mobile (i.e., wireless) devices. These include, but are not limited to, laptop computers, hand-held computers, personal digital assistants (PDAs), and smart, web-enabled mobile telephones. Consequently, there is an increasing demand to design these new classes of computers and computer-based telecommunication devices so that they, like most personal computers, are capable of handling multimedia applications.
Conversational multimedia is a type of multimedia service that allows two or more network devices to simultaneously execute a multimedia application, such as a video conferencing application or a still image sharing application, where the two or more network devices may include personal and/or portable computers, servers, telecommunication devices, or other like devices, and where the two or more network devices are connected to one another by one or more computer networks (e.g., wide area and/or local area networks). Generally speaking, the two or more network devices engaged in a conversational multimedia session must simultaneously access, manipulate, and exchange data stored in a multimedia database. Despite many recent technological advancements in the computer and telecommunication industry, there are many problems associated with providing effective conversational multimedia services.
A first problem associated with conversational multimedia is that each of the two or more network devices engaged in a multimedia session may have different terminal capabilities. For purposes of the present invention, “terminal capabilities” refers to the performance limitations associated with each of the two or more network devices that are engaged in the conversational multimedia session. These performance limitations might include, for example, bandwidth limitations, bit error rate limitations, display screen size and resolution limitations, storage capacity limitations, and of course, processing power limitations. This is problematic because one device may be able to effectively access and manipulate a certain multimedia object stored in the database, while one or more of the other devices may not be able to effectively access and manipulate the same multimedia object, due to performance limitations. For instance, user A, who is associated with a first network device, may want to manipulate a multimedia object (e.g., a still image) and, thereafter, transmit the manipulated object to user B, who is associated with a second network device. More specifically, user A may want to zoom to a particular region of interest (ROI) in the image, and then exchange the zoomed version of the image with user B. Alternatively, user A may want to crop a portion of the image, and exchange the cropped portion of the image with user B. User A, however, may be employing a personal computer that is capable of displaying an image that is 1280×1024 pixels, while user B is employing a hand-held computer that is only capable of displaying an image that is 88×104 pixels.
If user A does not know in advance the terminal capabilities of user B and, as a result, fails to properly adapt the manipulated version of the image so that it is as compatible as possible with the terminal capabilities of user B, user A may successfully transmit the manipulated image to user B, but it is unlikely user B will be able to effectively access the manipulated image.
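By way of illustration only, the kind of adaptation that user A would need to perform for the display sizes mentioned above can be sketched as follows. The function name `fit_to_display` and the policy of preserving the aspect ratio without up-scaling are assumptions made for this sketch; the text does not prescribe any particular adaptation rule.

```python
def fit_to_display(src_w, src_h, max_w, max_h):
    """Return the largest (width, height) that fits within max_w x max_h
    while keeping the source aspect ratio. Never up-scales.

    Hypothetical helper: integer arithmetic is used so that results are
    not affected by floating-point rounding.
    """
    # Image already fits on the receiving display: leave it unchanged.
    if src_w <= max_w and src_h <= max_h:
        return src_w, src_h

    # Compare max_w/src_w with max_h/src_h by cross-multiplication.
    if max_w * src_h <= max_h * src_w:
        # Width is the limiting dimension.
        new_w = max_w
        new_h = max(1, src_h * max_w // src_w)
    else:
        # Height is the limiting dimension.
        new_h = max_h
        new_w = max(1, src_w * max_h // src_h)
    return new_w, new_h


# User A's 1280x1024 image adapted to user B's 88x104 display.
adapted = fit_to_display(1280, 1024, 88, 104)
print(adapted)
```

Such a calculation is only possible if the sending device knows the display limits of the receiving device in advance, which is precisely the information that user A lacks in the scenario described above.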
A second problem is that each of the two or more network devices may be subject to different network capabilities. It will be understood that each of the two or more network devices may receive and transmit multimedia data over a wide variety of different network connections, for example, computer network connections, telephone connections, integrated services digital network (ISDN) connections, asynchronous transfer mode (ATM) connections, and mobile network connections, where each is capable of supporting a different load capacity. Thus, if the network device employed by user A has a high-speed network connection while the network device employed by user B has a significantly lower-speed network connection, transferring multimedia information from the device associated with user A to the device associated with user B without properly adapting the information (e.g., applying an appropriate data compression scheme) may result in user B being unable to effectively access the information.
One possible solution for the above-identified problems is to store and maintain multiple versions of a given multimedia object in a multimedia database, where each version more suitably corresponds to a different combination of terminal and/or network capabilities. Unfortunately, this solution requires an excessive amount of storage capacity in order to store and maintain different versions of a multimedia object for each and every conceivable combination of terminal and/or network capabilities. Furthermore, the amount of time and processing power that would be required to individually manipulate each version makes this solution less than ideal.
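The growth in the number of stored versions can be illustrated with a short counting sketch. The capability axes and their values below are hypothetical and chosen only for illustration; the point is that the number of versions is the product of the number of options along each axis.

```python
from itertools import product

# Hypothetical capability axes; none of these values come from the text.
RESOLUTIONS = [(1280, 1024), (640, 480), (256, 256), (88, 104)]
BIT_RATES_KBPS = [28, 64, 384, 2000]
COLOR_DEPTHS_BITS = [1, 8, 24]


def count_versions(*axes):
    """Count the distinct versions needed if one version is stored for
    every combination of options along the given capability axes."""
    return sum(1 for _ in product(*axes))


n_versions = count_versions(RESOLUTIONS, BIT_RATES_KBPS, COLOR_DEPTHS_BITS)
print(n_versions)
```

Under these assumed values, a single multimedia object would already require dozens of stored versions, and adding one more axis, or one more option on any axis, multiplies that number again.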
Another possible solution is to store and maintain a single, adaptable version of a multimedia object. For example, JPEG2000 provides a standard coding scheme that permits images to be stored in a single, multi-resolution format. Therefore, a single version of an image can be down-scaled or up-scaled to satisfy the resolution requirement for each of several network devices. Accordingly, a network device that has a relatively high resolution capability can access a high resolution version of the image, whereas a network device that has a relatively low resolution capability can access a low resolution version of the same image. While this solution alleviates the need to store a different version of the multimedia object for each and every conceivable level of resolution, it does not directly address the fact that the various network devices engaged in a conversational multimedia session are likely to exhibit other terminal and/or network capability differences. Consequently, this solution also fails to guarantee that each network device will be able to effectively access a multimedia object.
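As a minimal sketch of how a multi-resolution format might be used, assume a dyadic scheme in which each discarded resolution level halves the width and height (rounding up), as is the case for the wavelet decomposition used by JPEG2000. The function names and the assumed number of available levels are hypothetical and are not taken from the text.

```python
def reduced_size(width, height, level):
    """Image size after discarding `level` resolution levels, where each
    level halves both dimensions, rounding up."""
    factor = 1 << level
    return (width + factor - 1) // factor, (height + factor - 1) // factor


def pick_level(width, height, max_w, max_h, levels):
    """Return (level, size) for the highest resolution that fits within
    max_w x max_h. `levels` is the assumed number of decomposition levels
    in the stored image; if nothing fits, the lowest resolution is used."""
    for level in range(levels + 1):
        w, h = reduced_size(width, height, level)
        if w <= max_w and h <= max_h:
            return level, (w, h)
    return levels, reduced_size(width, height, levels)


# A 1280x1024 image, assumed to be stored with 5 decomposition levels,
# accessed by a device limited to an 88x104 display.
choice = pick_level(1280, 1024, 88, 104, levels=5)
print(choice)
```

Note that such a selection accounts only for display resolution. It says nothing about bandwidth, bit error rate, storage, or processing power, which is the shortcoming identified above.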
Yet another possible solution involves the use of transcoders. A transcoder accepts a data stream that is encoded in accordance with a first format and outputs the data stream encoded in accordance with a second format. In this solution, one version of a multimedia object, or a limited number of versions, is stored and maintained in a server. The data associated with the one, or the most appropriate version, is then converted by a transcoder located in the server, or located in a corresponding gateway, such that the converted version of the multimedia object is compatible with a particular combination of terminal and/or network capabilities and/or user preferences.
In general, the use of transcoders is well known to those of skill in the art. For example, it is known that a transcoder may be employed to convert an image from a first size to a second size. Thus, an image that is 4K×4K pixels may be stored in a server, though the network device that is to receive and/or gain access to the image is only capable of displaying an image that is 256×256 pixels. A transcoder may then be employed to convert, or transcode, the 4K×4K version of the image prior to making the image available to the receiving network device. This scenario is described in International Patent Application PCT/SE98/00448.
In another example, it is known that a transcoder may be employed to convert a video object from a first format (e.g., CIF) to a second format (e.g., QCIF), prior to making the video object available to the receiving device. This scenario is described in International Patent Application PCT/SE97/01766. It is also described in Christopoulos et al., “Transcoder Architectures for Video Coding”, IEEE Transactions on Consumer Electronics, Vol. 44, pp. 88-98, February 1998.
In each of the solutions involving transcoders, there is an assumption that the transcoder is capable of deciding how the conversion of the multimedia object is to be implemented. However, this is not a correct assumption. In fact, there is simply no guarantee that a multimedia object which has been transcoded from one format to another will be delivered to or accessed by a given network device in an effective and meaningful manner.
Given the foregoing discussion, it is evident that there is a tremendous need to provide a conversational multimedia service that permits each of the two or more computers or computer-based telecommunication devices to effectively manipulate, share, and exchange multimedia objects stored in a multimedia database, despite the existence of different user preferences and the fact that these devices may exhibit significantly different network and/or terminal capabilities.