1. Technical Field
The present invention relates generally to the authentication of multimedia data streams. More particularly, the present invention relates to the insertion of authentication data across media channels of a multimedia data stream.
2. Discussion
For as long as humans have communicated with one another, there has been significant concern over maintaining confidentiality. As a result, verbal, written, and electronic messages have historically been the subject of substantial technological efforts to maintain security. As the complexity of electronic messages (i.e., files/data streams) increases, so does the complexity of the techniques for authenticating these messages. For example, electronic messages can have visual content (such as images or streams of images), audio content (such as .wav sound file data), textual content (such as word processing data), or any combination thereof. In recent years, various authentication techniques for each of these “single” media have been explored. With the advent of multimedia, however, authentication concerns have persisted and, in some instances, increased.
Multimedia data streams contain two or more media channels such as visual media channels, audio media channels, and text media channels. While certain data hiding techniques have been developed for media authentication when at least two forms of media are presented in a digital data stream, these techniques fail to fully address the type of data being authenticated.
Generally speaking, conventional digital data hiding schemes can be classified into two categories—robust data hiding and fragile data hiding. Robust data hiding provides a mechanism for fighting against common signal processing (or unintentional) attacks, as well as intentional attacks. This is done by making the hidden data immune to variations caused by signal processing or transmission errors.
Fragile data hiding, on the other hand, provides a mechanism for detecting variations made in the host medium such that the hidden data can manifest the originality of the host medium. In some applications, it can also be beneficial to identify the location of the variation in the host medium. In short, fragile hidden data does not survive common signal processing attacks or intentional attacks, so any such modification of the host medium becomes detectable. For this reason, fragile data hiding is more commonly used for authentication purposes.
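The fragile behavior described above, including identification of the location of a variation, can be illustrated with a minimal sketch. The sketch assumes the host medium is a sequence of 8-bit samples and uses least-significant-bit (LSB) embedding of a per-block SHA-256 digest. The block size, the choice of hash, and the names `embed` and `tampered_blocks` are hypothetical and are not taken from the present description.

```python
import hashlib

BLOCK = 64  # samples per block (assumed); each block carries up to 64 hidden bits


def _digest_bits(block: bytes) -> list[int]:
    """Return len(block) bits of a SHA-256 digest taken over the block
    with every least-significant bit cleared."""
    cleared = bytes(b & 0xFE for b in block)
    digest = hashlib.sha256(cleared).digest()
    return [(digest[i // 8] >> (7 - i % 8)) & 1 for i in range(len(block))]


def embed(host: bytes) -> bytes:
    """Hide each block's digest bits in that block's own LSBs."""
    out = bytearray()
    for start in range(0, len(host), BLOCK):
        block = host[start:start + BLOCK]
        bits = _digest_bits(block)
        out.extend((b & 0xFE) | bit for b, bit in zip(block, bits))
    return bytes(out)


def tampered_blocks(marked: bytes) -> list[int]:
    """Return the indices of blocks whose hidden bits no longer match."""
    bad = []
    for idx, start in enumerate(range(0, len(marked), BLOCK)):
        block = marked[start:start + BLOCK]
        if [b & 1 for b in block] != _digest_bits(block):
            bad.append(idx)
    return bad


if __name__ == "__main__":
    marked = bytearray(embed(bytes(range(256))))
    marked[130] ^= 0x10  # alter one sample inside the third block
    print(tampered_blocks(bytes(marked)))
```

Because the digest is computed with the LSBs cleared, embedding does not disturb the value being protected, and each sample changes by at most one quantization level. A robust scheme would instead be designed so that the hidden bits survive such a modification.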
As already mentioned, media specific techniques for digital media such as digital color/gray scale images, plain text, and video have been studied by a number of researchers in recent years. Specifically, a number of approaches to fragile image watermarking have been developed. Furthermore, authentication of other media types such as video and text, with fragile data hiding, has often been considered to be a special case of image data hiding and studied accordingly. It is well known that one of the most important formats of media in e-distribution is video, which itself is a multimedia data stream containing visual, audio, and text channels. When video authentication is limited to a particular domain of image authentication, however, the strength and capability may be greatly limited or insufficient for certain applications. In light of the above, it is desirable to provide a fragile data hiding system for the purposes of multimedia authentication.
The above and other objectives are provided by a method for hiding authentication data within a multimedia data stream in accordance with the present invention. The multimedia data stream has at least two media channels, including a first media channel and a second media channel. The method includes the step of obtaining a first set of authentication data, where the first set of authentication data is based on data contained in the first media channel. The method further includes the step of hiding the first set of authentication data in the second media channel. As already mentioned, one of the most important formats of media in e-distribution is video, which is itself multimedia data containing visual, audio, and text data. Data hiding in all possible channels, as opposed to single medium/single channel hiding, yields higher data hiding capacity. Greater data hiding capacity also provides a higher level of controllability.
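The two steps of the method (obtaining authentication data from the first media channel and hiding it in the second media channel) can be sketched as follows. This is an illustration only: it assumes both channels are available as raw byte sequences, uses a truncated SHA-256 digest as the authentication data, and uses LSB substitution as the hiding operation. None of these choices, nor the function names, are specified by the present description.

```python
import hashlib


def authentication_bits(first_channel: bytes, n_bits: int = 128) -> list[int]:
    """Obtain authentication data based on the first channel
    (assumed here: the leading n_bits of a SHA-256 digest, n_bits <= 256)."""
    digest = hashlib.sha256(first_channel).digest()
    return [(digest[i // 8] >> (7 - i % 8)) & 1 for i in range(n_bits)]


def hide_in_second_channel(second_channel: bytes, bits: list[int]) -> bytes:
    """Hide the bits in the least-significant bits of the second channel."""
    if len(bits) > len(second_channel):
        raise ValueError("second channel too small for the authentication data")
    out = bytearray(second_channel)
    for i, bit in enumerate(bits):
        out[i] = (out[i] & 0xFE) | bit
    return bytes(out)


def extract_bits(second_channel: bytes, n_bits: int = 128) -> list[int]:
    """Read the hidden bits back out of the second channel."""
    return [b & 1 for b in second_channel[:n_bits]]


if __name__ == "__main__":
    audio = b"stand-in bytes for an audio channel"
    video = bytes(range(256))  # stand-in for visual samples
    marked_video = hide_in_second_channel(video, authentication_bits(audio))
    print(extract_bits(marked_video) == authentication_bits(audio))
```

In this sketch the audio channel is left untouched, and the visual channel, which is described as the higher capacity channel, carries the hidden data.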
The benefits associated with using all possible media channels have two particularly important applications. The first application relates to the ability to optimize data hiding capacity based on data structures. For example, it is commonly known that visual data has a much larger (by several orders of magnitude) data hiding capacity than audio data. This is due to the human auditory system's high sensitivity to (i.e., inability to tolerate) additive random noise. On the other hand, plain text, which is often viewed as binary image data (i.e., a visual medium) as well, is the most difficult type of media in which to embed hidden data. The low capacity for perceptually invisible noise imposed by its binary nature makes it particularly difficult to insert any hidden data within a text channel. To reach a better tradeoff between transparency and capacity, the unbalanced structure of multimedia data can be exploited by inserting part or all of the authentication value obtained from the low capacity media channel, along with other necessary control data, into the high capacity media channel, such as the visual data channel. For ease of discussion, video image frames will be referred to as visual data, and plain text data will be referred to as text data, unless otherwise specified.
A second important application relates to the ability to synchronize between channels. Specifically, hiding data in the audio, visual, and text channels in synchronization can provide additional authentication capabilities. For example, the present invention allows for the determination of whether the audio track accompanying a video is a “fake”. The reverse determination, whether the video accompanying an audio track is a “fake”, is also possible. The present invention, therefore, provides a solution to the cross verification problem. In other words, it enables the verification of whether a channel originated together with the channels it is presumed to accompany (i.e., that it is an authentic channel of the original data).
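Cross verification in both directions can be illustrated with a short sketch in which each channel carries a digest of the other. The sketch assumes raw byte channels, a 128-bit truncated SHA-256 digest, and LSB substitution. These choices and the names `bind` and `channels_match` are hypothetical and serve only to make the idea concrete.

```python
import hashlib

N_BITS = 128  # hidden bits per channel (assumed)


def _content_bits(channel: bytes) -> list[int]:
    """Digest bits of a channel, computed with its LSBs cleared so that
    hiding data in those LSBs does not change the digest."""
    cleared = bytes(b & 0xFE for b in channel)
    digest = hashlib.sha256(cleared).digest()
    return [(digest[i // 8] >> (7 - i % 8)) & 1 for i in range(N_BITS)]


def _write_lsbs(channel: bytes, bits: list[int]) -> bytes:
    out = bytearray(channel)
    for i, bit in enumerate(bits):
        out[i] = (out[i] & 0xFE) | bit
    return bytes(out)


def bind(audio: bytes, video: bytes) -> tuple[bytes, bytes]:
    """Hide the video digest in the audio and the audio digest in the video."""
    if min(len(audio), len(video)) < N_BITS:
        raise ValueError("both channels need at least N_BITS samples")
    return (_write_lsbs(audio, _content_bits(video)),
            _write_lsbs(video, _content_bits(audio)))


def channels_match(audio: bytes, video: bytes) -> bool:
    """Check that each channel carries the digest of the other."""
    if min(len(audio), len(video)) < N_BITS:
        return False
    audio_ok = [b & 1 for b in video[:N_BITS]] == _content_bits(audio)
    video_ok = [b & 1 for b in audio[:N_BITS]] == _content_bits(video)
    return audio_ok and video_ok


if __name__ == "__main__":
    audio = bytes((i * 7) % 256 for i in range(400))
    video = bytes((i * 13) % 256 for i in range(600))
    marked_audio, marked_video = bind(audio, video)
    substitute = bytes((i * 11) % 256 for i in range(400))
    print(channels_match(marked_audio, marked_video))
    print(channels_match(substitute, marked_video))
```

A substituted audio track fails the check because the digest hidden in the video no longer describes it, and a substituted video fails for the symmetric reason.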
It is important to note that such efficient utilization of the multimedia data structure for data hiding, together with the capability of cross verification, is particularly useful when dealing with active data streams. For example, the present invention provides a method for hiding an active data stream within a multimedia stream having an audio channel and a visual channel. The method includes the step of hiding a first subset of the active data stream in the visual channel. The method further includes the step of hiding a second subset of the active data stream in the audio channel. In one embodiment, the first subset includes executable content, and the second subset includes a controlled data stream.
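The division of an active data stream between the visual and audio channels can be sketched as follows. The sketch assumes raw byte channels, LSB substitution, and a 16-bit length header in front of each hidden subset. The header, the function names, and the sample payloads are all hypothetical; the present description does not prescribe an encoding.

```python
def hide_bytes(host: bytes, payload: bytes) -> bytes:
    """Hide a 16-bit length header followed by the payload in the host LSBs.
    The payload must be shorter than 65536 bytes."""
    data = len(payload).to_bytes(2, "big") + payload
    bits = [(byte >> (7 - k)) & 1 for byte in data for k in range(8)]
    if len(bits) > len(host):
        raise ValueError("host channel too small for this payload")
    out = bytearray(host)
    for i, bit in enumerate(bits):
        out[i] = (out[i] & 0xFE) | bit
    return bytes(out)


def recover_bytes(host: bytes) -> bytes:
    """Read the length header, then the payload, back from the host LSBs."""
    def read(start_bit: int, n_bytes: int) -> bytes:
        values = bytearray()
        for j in range(n_bytes):
            value = 0
            for k in range(8):
                value = (value << 1) | (host[start_bit + 8 * j + k] & 1)
            values.append(value)
        return bytes(values)

    length = int.from_bytes(read(0, 2), "big")
    return read(16, length)


def hide_active_stream(visual: bytes, audio: bytes,
                       executable: bytes, control: bytes) -> tuple[bytes, bytes]:
    """Hide the first subset (executable content) in the visual channel and
    the second subset (control data) in the audio channel."""
    return hide_bytes(visual, executable), hide_bytes(audio, control)


if __name__ == "__main__":
    visual = bytes(2000)  # stand-in for visual samples
    audio = bytes(200)    # stand-in for audio samples
    marked_visual, marked_audio = hide_active_stream(
        visual, audio, b"print('hi')", b"\x01\x02")
    print(recover_bytes(marked_visual), recover_bytes(marked_audio))
```

Placing the larger subset in the visual channel follows the capacity difference between visual and audio data noted earlier in this section.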
It is to be understood that both the foregoing general description and the following detailed description are merely exemplary of the invention, and are intended to provide an overview or framework for understanding the nature and character of the invention as it is claimed. The accompanying drawings are included to provide a further understanding of the invention, and are incorporated in and constitute part of this specification. The drawings illustrate various features and embodiments of the invention, and together with the description serve to explain the principles and operation of the invention.