The present invention relates to the steganographic embedding of data in a series of digital signals, datastreams or measurements (hereinafter all often generically referred to as "measurements"); being more specifically, though not exclusively, concerned with such "measurements" taken directly from an analog data stream, such as, for example, an audio waveform, or from subsampled and/or transformed digital data, as described in said parent U.S. applications Ser. No. 09/389,941 (Process, System, And Apparatus For Embedding Data In Compressed Audio, Image, Video And Other Media Files And The Like), and Ser. No. 09/389,942 (Process Of And System For Seamlessly Embedding Executable Program Code Into Media File Formats Such As MP3 And The Like For Execution By Digital Media Players And Viewing Systems) filed Sep. 3, 1999; the present application containing modified and supplementary material illustrating the generic concepts underlying the basic techniques of said applications.
In some aspects, this application may also usefully incorporate techniques described in U.S. application Ser. No. 09/518,875 filed Mar. 6, 2000, for Method, Apparatus and System For Data Embedding In Digital Telephone Signals And The Like, And in Particular Cellular Phone Systems, Without Affecting The Backwards Compatibility Of The Digital Phone Signal.
As explained in said parent applications, data has heretofore often been embedded in analog representations of media information and formats. This has been extensively used, for example, in television and radio applications as for the transmission of supplemental data, such as text; but the techniques used are not generally capable of transmitting high bit rates of digital data.
Watermarking data has also been embedded so as to be robust to degradation and manipulation of the media. Typical watermarking techniques rely on gross characteristics of the signal being preserved through common types of transformations applied to a media file. These techniques are again limited to fairly low bit rates. Good bit rates on audio watermarking techniques are, indeed, only around a few dozen bits of data encoded per second.
While data has been embedded in the low bit of the signal domain of digital media, enabling use of high bit rates, such data is either uncompressed, or capable of only relatively low compression rates. Many modern compressed file formats, moreover, do not use such signal-domain representations and are thus unsuited to the use of this technique. Additionally, this technique tends to introduce audible noise when used to encode data in sound files.
Among prior patents illustrative of such and related techniques and uses are U.S. Pat. No. 4,379,947 (dealing with the transmitting of data simultaneously with audio); U.S. Pat. No. 5,185,800 (using bit allocation for transformed digital audio broadcasting signals with adaptive quantization based on psychoauditive criteria); U.S. Pat. No. 5,687,236 (steganographic techniques); U.S. Pat. No. 5,710,834 (code signals conveyed through graphic images); U.S. Pat. No. 5,832,119 (controlling systems by control signals embedded in empirical data); U.S. Pat. No. 5,850,481 (embedded documents, but not for arbitrary data or computer code); U.S. Pat. No. 5,889,868 (digital watermarks in digital data); and U.S. Pat. No. 5,893,067 (echo data hiding in audio signals).
Prior publications relating to such techniques include
Bender, W., D. Gruhl, M. Morimoto, and A. Lu, "Techniques for data hiding", IBM Systems Journal, Vol. 35, Nos. 3 and 4, 1996, pp. 313-336.
A survey of techniques for multimedia data labeling, and particularly for copyright labeling using watermarks in the encoding of low bit-rate information, is presented by Langelaar, G. C. et al. in "Copy Protection For Multimedia Data based on Labeling Techniques" (http://www-it.et.tudelft.nl/html/research/smash/public/benlx96/benelux_cr.html).
In specific connection with the above-cited "MPEG Spec" and "ID3v2 Spec" reference applications, we have disclosed in the above-mentioned parent application Ser. No. 09/389,942, techniques applying novel embedding concepts directed specifically to imbuing one or more of pre-prepared audio, video, still image, 3-D or other generally uncompressed media formats with an extended capability to supplement their pre-prepared presentations with added graphic, interactive and/or e-commerce content presentations at the digital media playback apparatus.
The before-mentioned other parent application Ser. No. 09/389,941 is more broadly concerned with data embedding in compressed formats, and with encoding a frequency representation of the data, typically through a Fourier Transform, Discrete Cosine Transform, Wavelet Transform or other well-known function. The invention embeds high-rate data in compressed digital representations of the media, including through modifying the low-bits of the coefficients of the frequency representation of the compressed data, thereby enabling additional benefits of fast encoding and decoding, because the coefficients of the compressed media can be directly transformed without a lengthy additional decompression/compression process. Such technique also can be used in combination with watermarking, but with the watermark applied before the data encoding process.
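The low-bit modification of frequency coefficients described above can be pictured with a minimal sketch. It assumes the compressed media has already been parsed into a list of integer coefficients; the function names `embed_lsb` and `extract_lsb` are hypothetical and chosen only for illustration, and the sketch is not presented as the implementation of the parent application.

```python
def embed_lsb(coeffs, bits):
    """Return a copy of integer coefficients with one payload bit
    written into the low bit of each of the first len(bits) entries."""
    if len(bits) > len(coeffs):
        raise ValueError("not enough coefficients for the payload")
    out = list(coeffs)
    for i, bit in enumerate(bits):
        # Clear the low bit, then set it to the payload bit.
        out[i] = (out[i] & ~1) | (bit & 1)
    return out


def extract_lsb(coeffs, count):
    """Read back the low bit of the first `count` coefficients."""
    return [c & 1 for c in coeffs[:count]]


# Hypothetical quantized coefficients and payload bits.
coeffs = [12, -7, 3, 0, 25, -2]
bits = [1, 0, 1, 1]
marked = embed_lsb(coeffs, bits)
recovered = extract_lsb(marked, len(bits))
```

Because only the low bit of each coefficient is touched, no coefficient moves by more than one quantization step, and extraction needs nothing but the marked coefficients themselves.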
The earlier cited Langelaar et al publication, in turn, references and discusses the following additional prior art publications:
J. Zhao, E. Koch: "Embedding Robust Labels into Images for Copyright Protection", Proceedings of the International Congress on Intellectual Property Rights for Specialized Information, Knowledge and New Technologies, Vienna, Austria, August 1995;
E. Koch, J. Zhao: "Towards Robust and Hidden Image Copyright Labeling", Proceedings IEEE Workshop on Nonlinear Signal and Image Processing, Neos Marmaras, June, 1995; and
F. M. Boland, J. J. K. Ó Ruanaidh, C. Dautzenberg: "Watermarking Digital Images for Copyright Protection", Proceedings of the 5th International Conference on Image Processing and its Applications, No. 410, Edinburgh, July, 1995.
An additional article by Langelaar also discloses earlier labeling of MPEG compressed video formats:
G. C. Langelaar, R. L. Lagendijk, J. Biemond: "Real-time Labeling Methods for MPEG Compressed Video," 18th Symposium on Information Theory in the Benelux, 15-16 May 1997, Veldhoven, The Netherlands.
These Zhao and Koch, Boland et al and Langelaar et al disclosures, while teaching encoding technique approaches having partial similitude to components of the techniques employed by the present invention, as will now be more fully explained, are not, however, either anticipatory of, or actually adapted for solving the total problems with the desired advantages that are addressed and sought by the present invention.
Considering, first, the approach of Zhao and Koch, above-referenced, they embed a signal in an image by using JPEG-based techniques. (Digital Compression and Coding of Continuous-tone Still Images, Part 1: Requirements and guidelines, ISO/IEC DIS 10918-1.) They first encode a signal in the ordering of the size of three coefficients, chosen from the middle frequency range of the coefficients in an 8-block or octet DCT. They divide eight permutations of the ordering relationship among these three coefficients into three groups: one encoding a '1' bit (HML, MHL, and HHL), one encoding a '0' bit (MLH, LMH, and LLH), and a third group encoding "no data" (HLM, LHM, and MMM). They have also extended this technique to the watermarking of video data. While their technique is robust and resilient to modifications, they cannot, however, encode large quantities of data, since they can only modify blocks where the data is already close to the data being encoded; otherwise, they must modify the coefficients to encode "no data". They must also severely modify the data since they must change large-scale ordering relationships of coefficients. As will later more fully be explained, these are disadvantages overcome by the present invention through its technique of encoding data by changing only a single bit in a coefficient.
As for Boland, Ó Ruanaidh, and Dautzenberg, they use a technique of generating the DCT, Walsh Transform, or Wavelet Transform of an image, and then adding one to a selected coefficient to encode a "1" bit, or subtracting one from a selected coefficient to encode a "0" bit. This technique, although at first blush somewhat superficially similar to one aspect of one component of the present invention, has the very significant limitation, obviated by the present invention, that information can only be extracted by comparing the encoded image with the original image. This means that a watermarked and a non-watermarked copy of any media file must be sent simultaneously for the watermarking to work. This is a rather severe limitation, overcome in the present invention by the novel incorporation of the least-significant bit encoding technique.
Such least-significant bit encoding broadly has, however, been earlier proposed; but not as implemented in the present invention. The Langelaar, Lagendijk, and Biemond publication, for example, teaches a technique which encodes data in MPEG video streams by modifying the least significant bit of a variable-length code (VLC) representing DCT coefficients. Langelaar et al.'s encoding keeps the length of the file constant by allowing the replacement of only those VLC values which can be replaced by another value of the same length and which have a magnitude difference of one. The encoding simply traverses the file and modifies all suitable VLC values. A drawback of their technique, however, is that suitable VLC values are relatively rare (167 per second in a 1.4 Mbit/sec video file, thus allowing only 167 bits to be encoded in 1.4 million bits of information).
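The dependence on the original image noted above can be seen in a deliberately simplified sketch of an add-one/subtract-one scheme. It follows only the description given in this text, not the cited publication itself; the coefficient values, the choice of positions, and the function names are all hypothetical.

```python
def embed_plus_minus(coeffs, positions, bits):
    """Add one to a selected coefficient for a 1 bit, subtract one for a 0 bit."""
    out = list(coeffs)
    for pos, bit in zip(positions, bits):
        out[pos] += 1 if bit else -1
    return out


def extract_with_original(marked, original, positions):
    """Recover bits by comparing marked coefficients against the originals.

    The sign of the difference is the only carrier of information, so the
    unmarked coefficients must be available to the decoder.
    """
    return [1 if marked[p] > original[p] else 0 for p in positions]


# Hypothetical transform coefficients and selected positions.
original = [40, 17, -5, 8]
positions = [1, 3]
marked = embed_plus_minus(original, positions, [1, 0])
recovered = extract_with_original(marked, original, positions)
```

Nothing in `marked` alone distinguishes 18 as "17 plus one" from "19 minus one", which is why a decoder of this kind cannot work blind, whereas a low-bit scheme reads the bit directly from the marked value.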
In comparison, the technique of the present invention as applied for video, removes such limitation and can achieve much higher bit-rates while keeping file-length constant, by allowing a group or set of nearby coefficients to be modified together. This also allows for much higher quantities of information to be stored without perceptual impact because it allows for psycho-perceptual models to determine the choice of coefficients to be modified.
The improved techniques of the present invention, indeed, unlike the prior art, allow for the encoding of digital information into an audio, image, or video file at rates several orders of magnitude higher than those previously described in the literature (order of 300 bits per second and much higher, above 800 bits per second). As will later be disclosed, the present invention, indeed, has easily embedded a 10,000 bit/second data stream in a 128,000 bit/second audio file.
In the prior art, only relatively short sequences of data have been embedded into the media file, typically encoding simple copyright or ownership information. Our techniques allow for media files to contain entirely new classes of content, such as: entire computer programs, multimedia annotations, or lengthy supplemental communications. As described in said copending application, computer programs embedded in media files allow for expanded integrated transactional media of all kinds, including merchandising, interactive content, interactive and traditional advertising, polls, e-commerce solicitations such as CD or concert ticket purchases, and fully reactive content such as games and interactive music videos which react to the user's mouse motions and are synced to the beat of the music. This enables point of purchase sales integrated with the music on such software and hardware platforms as the television, portable devices like the Sony Walkman, the Nintendo Game Boy, and portable MP3 players such as the Rio and Nomad and the like. This invention even creates new business models. For example, instead of a record company trying to stop the copying of its songs, it might instead encourage the free and open distribution of the music, so that the embedded advertising and e-commerce messages are spread to the largest possible audience and potential customers.
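The rates quoted in this discussion can be put side by side as fractions of the carrier bit rate. The short calculation below uses only figures stated in the text (167 bits in a 1.4 Mbit/sec video stream, and 10,000 bits/second in a 128,000 bit/second audio file); the helper name `payload_fraction` is hypothetical, and the comparison is indicative only, since the two figures concern different media.

```python
def payload_fraction(payload_bits_per_s, carrier_bits_per_s):
    """Fraction of the carrier bit rate occupied by embedded payload."""
    return payload_bits_per_s / carrier_bits_per_s


# Same-length VLC replacement in MPEG video, as quoted in the text.
vlc_fraction = payload_fraction(167, 1_400_000)

# Embedded data stream in a compressed audio file, as quoted in the text.
audio_fraction = payload_fraction(10_000, 128_000)

# How many times larger the second fraction is than the first.
ratio = audio_fraction / vlc_fraction

print(f"VLC labeling:    {vlc_fraction:.6%} of the carrier")
print(f"Audio embedding: {audio_fraction:.4%} of the carrier")
print(f"Ratio:           about {ratio:.0f}x")
```

On these figures the embedded stream occupies roughly 0.012% of the carrier in the VLC case and about 7.8% in the audio case, a difference of between two and three orders of magnitude.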
The present application, moreover, is specifically concerned with the high-bandwidth steganography feature described in our parent applications, as above discussed, for embedding (and recovering) data in a series of digital signals or measurements. These measurements, as earlier stated, may be taken directly from an analog data stream, such as an audio waveform, or they may be taken from subsampled and/or transformed digital data and the like. The key requirement of these techniques is that there be aliasing and/or quantization present in the conversion process, wherein the introduced aliasing and/or quantization is modulated or modified so as to embed substantial data without drastically affecting the quality of the digital signals or measurements.
Where, as also earlier mentioned, previous techniques, such as embedding data directly in the least-significant bit of a digital measurement, were capable of high bandwidth steganography, they did this at the cost of introducing large amounts of high-frequency noise into the data. Where the data is embedded at a rate of 1 data bit per N samples, they introduce noise into the data of the order of 1/N. With the techniques of said parent applications and herein, the amount of noise introduced into the data is greatly reduced for the same amount of data embedded; in particular, the introduced noise is reduced from an order of 1/N (with the previous least-significant bit techniques) to an order of 1/N², for typical data distributions. This means that a much higher density of data can be introduced without perceptible change than was possible by using previous techniques.
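One way such a reduction might arise is sketched below: a single bit is carried by the parity of a whole group of N quantized samples, and when the parity must change, only the sample whose rounding was the closest call is rounded the other way. The selection rule, the rounding convention and the function names are assumptions made for illustration; this passage does not itself specify them, and the sketch makes no claim to reproduce the stated noise orders.

```python
import math


def quantize(x):
    """Round to the nearest integer, with halves rounded up (an assumed convention)."""
    return math.floor(x + 0.5)


def embed_parity(samples, bits, group_size):
    """Quantize `samples` and carry one bit in the parity of each group's sum."""
    if len(bits) * group_size > len(samples):
        raise ValueError("not enough samples for the payload")
    quantized = [quantize(x) for x in samples]
    for g, bit in enumerate(bits):
        idx = range(g * group_size, (g + 1) * group_size)
        if sum(quantized[i] for i in idx) % 2 != bit:
            # Pick the sample that lay farthest from its rounded value,
            # i.e. the one that was nearest to rounding the other way.
            j = max(idx, key=lambda i: abs(samples[i] - quantized[i]))
            # Re-round that one sample in the opposite direction.
            quantized[j] += 1 if samples[j] >= quantized[j] else -1
    return quantized


def extract_parity(quantized, count, group_size):
    """Read one bit per group from the parity of the group's sum."""
    return [sum(quantized[g * group_size:(g + 1) * group_size]) % 2
            for g in range(count)]


# Hypothetical pre-quantization measurements, two groups of four samples.
samples = [3.2, 7.9, 1.45, 4.0, 2.6, 5.1, 8.3, 6.55]
marked = embed_parity(samples, [1, 0], group_size=4)
recovered = extract_parity(marked, 2, group_size=4)
```

In this example each group needed a parity change, and in each case only one sample moved (1.45 became 2 instead of 1, and 6.55 became 6 instead of 7), both of which were already close to the rounding boundary. Extraction uses the marked values alone.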
Examples of applications where the invention is particularly advantageous for embedding data in signals or measurements, include:
Audio waveform measurements (such as the PCM algorithm used in CDs, or compressed audio files such as in the earlier discussed MP3);
Image value measurements (such as a scanned image, a fax, or a compressed image file such as jpeg);
Time-varying image value measurements (such as a digitized movie, or compressed video files such as mpeg); and
Any other type of data consisting of a series of physical measurements (temperature or pressure readings, machine control data, process monitoring, etc.)
The ability afforded by this invention to add a high-bandwidth digital channel inside an existing media format, without changing the media format, is, indeed, of widespread utility and applicability. For example, we have used it in such applications as adding advertising, interactive content, games, and software downloads to existing media content, as described in our said parent patent applications.
It is accordingly a primary object of the present invention to provide new and improved high-bandwidth steganographic techniques as disclosed in our said parent applications, for embedding supplemental data in the series of digital signals or measurements taken, for example, directly from an analog data stream such as an audio waveform and the like, or from subsampled and/or transformed digital data and the like, wherein the input data is quantized and/or aliased and the supplemental data bits are successively embedded therein, preferably by novel least-significant-bit parity encoding techniques, for attaining the above-described advantages afforded by these novel techniques.
Other and further objects will be explained hereinafter and are more particularly delineated in the appended claims.
In summary, however, from one of its broader or generic aspects, the invention embraces the method of steganographically embedding substantial supplemental digital data in a series of digital measurements derived from one of an analog data stream and subsampled and/or transformed digital data, that comprises, deriving such series of digital measurements through functional transformation from a set of input data converted into a set of output data of successive quantized and/or aliased components, transforming the supplemental digital data into a series of successive bits; and introducing the successive bits into the quantized and/or aliased components to modulate successive components through but slight adjustments of the same, thereby to embed the supplemental data in the series of digital measurements without substantially affecting the quality thereof. Best mode and preferred embodiments, techniques and designs for implementing the invention are hereinafter explained in detail.
The invention will now be described in connection with the accompanying drawings, FIGS. 1-10 of which are identical with those presented in our said parent application Ser. No. 09/389,941 and illustrate the following:
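The step of transforming the supplemental digital data into a series of successive bits can be illustrated with a small sketch. The most-significant-bit-first ordering and the function names are assumptions chosen for illustration; the summary above does not fix a bit order.

```python
def payload_to_bits(payload):
    """Expand a bytes payload into a list of bits, most significant bit first."""
    return [(byte >> shift) & 1
            for byte in payload
            for shift in range(7, -1, -1)]


def bits_to_payload(bits):
    """Pack a list of bits (most significant bit first) back into bytes."""
    if len(bits) % 8:
        raise ValueError("bit count must be a multiple of 8")
    out = bytearray()
    for i in range(0, len(bits), 8):
        value = 0
        for bit in bits[i:i + 8]:
            value = (value << 1) | bit
        out.append(value)
    return bytes(out)


# Hypothetical two-byte payload.
bits = payload_to_bits(b"Hi")
restored = bits_to_payload(bits)
```

The resulting bit series is what an embedding stage would consume, one bit per modulated component or group of components, and the inverse function is what a decoder would apply after extraction.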
FIG. 1 is a block and flow diagram illustrating an overview of the data encoding process and system, operating in accordance with a preferred embodiment of the invention;
FIG. 2 is a similar diagram presenting an overview of the decoding of the media file embedded with the data of FIG. 1 as playback by the media player or viewer;
FIG. 3 is a view similar to FIG. 1 showing the use of the previously (and later) discussed steganographic techniques in the encoding process;
FIG. 4 illustrates the use of the before-mentioned digital watermarking processes with the encoding process of the invention;
FIG. 5 is an exemplary signal waveform and Fourier transformation-based compressed coefficient-based representation of the signal for use in the coefficient-domain parity encoding process useful with the invention;
FIG. 6 is a more detailed block and flow diagram specifically directed to a steganographic encoding of audio data, compressed through transformation into a coefficient domain and embedded with data and digitally watermarked in accordance with the process of the invention;
FIGS. 7 and 8 are similar to FIG. 6 but are directed respectively to encoding data in an image and in a video file, again compressed by transformation of the respective image and video data into coefficient domain;
FIG. 9 is a similar diagram applied to the encoding of data in a 2-D or 3-D spline of data points; and
FIG. 10 is directed to the encoding of the data in volumetric data files.
FIG. 11 illustrates the same system overview as FIG. 3 (and FIG. 1), using the more generic terms "measurements" instead of "media file" and "transformed measurements with embedded data", instead of "media file with embedded executable code";
FIGS. 12 and 13 similarly more generically illustrate the system of FIG. 2 for "Data Extraction", rather than "Data Encoding Playback";
FIG. 14 is identical to FIG. 4 showing the use of watermarking with the encoding technique invention, but again using more generic labeling of "measurements" rather than "media file";
FIGS. 15 and 16 track FIG. 3 (and FIGS. 6-8) using the more generic labels "measurements" and "quantization" and "aliasing" of the "transformation", respectively, in the data encoding process, and the generic title term "modulation" for "modifying" in FIG. 3;
FIG. 17 is a flow diagram of the parity encoding discussed in said parent application and further detailed herein;
FIG. 18 illustrates the use of variable rate data encoding in accordance with the principles of the invention;
FIG. 19 is a system and operational diagram of a black box encoding modification;
FIGS. 20 and 21 illustrate the techniques of the invention applied to pre-computation for dynamic embedding; and
FIG. 22 illustrates the parity decoding.