1. Field of the Invention
The present invention relates to the field of recorded digital media. More specifically, the present invention pertains to a method and apparatus for embedding hidden data or authenticity codes within digital data.
2. Background Art
Digital still cameras are growing in popularity as they become easier to use and as it becomes easier to archive the captured images on a computer system or memory device. However, the archived images are difficult to search and/or browse, especially when the database of captured images is large. In addition, the user may want information to be associated with each image, such as when and where the image was taken and the like. Currently, digital cameras either do not provide this type of information or have only limited capability to do so.
Digital media (such as digital videocassette recorders, digital video (or versatile) disc players, digital camcorders, digital still cameras and the like) make it possible to efficiently access, manipulate and reproduce digital data such as image data (including text) and video data as well as audio data (including speech). Digital media also make it possible to embed supplementary information (e.g., hidden data or authenticity codes) into the digital data, where it can be extracted automatically or in response to a user command. For example, hidden data can be used to embed multiple speech streams into a video, each speech stream in a different language, so that the video can be distributed to a wide range of users, each of whom can then view the video in the language of his or her choosing.
The hidden data are embedded directly into a selected subset of the pixels in a digital image frame; that is, the selected pixels are modified in order to store the hidden data. However, it is desirable that the embedded hidden data be undetectable by the human visual system. It is also desirable that the hidden data be easy to embed and retrieve, in particular in those cases where the amount of hidden data is relatively large. In addition, it is desirable that the hidden data be robust; that is, the hidden data should remain intact when the host data are compressed, stored, transmitted, manipulated, etc.
In some prior art data hiding schemes, a random sequence representing the data to be hidden is inserted into the digital data ("host data"); however, these schemes introduce a number of difficulties. First, because the inserted sequence is random, the hidden data are difficult to detect. For example, the detection metrics have to be computed for each of the random sequences in use in order to determine a statistically satisfactory match between any sequence detected in the host data and valid hidden data. For a reasonably large sequence, this computation can take a very long time to complete.
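The detection cost described above can be illustrated with a minimal sketch. It assumes a simple additive embedding and a mean-removed correlation metric; neither is taken from any particular prior art scheme, and the helper names (make_sequence, embed, correlation, detect), the sequence length, the number of sequences and the embedding strength are all hypothetical choices made for illustration only.

```python
import random

def make_sequence(seed, length):
    """Hypothetical helper: pseudo-random +/-1 sequence derived from a seed."""
    rng = random.Random(seed)
    return [rng.choice((-1, 1)) for _ in range(length)]

def embed(host, sequence, strength):
    """Assumed additive embedding: add the scaled sequence to the host values."""
    return [h + strength * s for h, s in zip(host, sequence)]

def correlation(signal, sequence):
    """Assumed detection metric: mean-removed correlation with one sequence."""
    mean = sum(signal) / len(signal)
    return sum((x - mean) * s for x, s in zip(signal, sequence)) / len(sequence)

def detect(signal, codebook):
    """Exhaustive search: one full correlation per candidate sequence."""
    scores = [correlation(signal, seq) for seq in codebook]
    best = max(range(len(scores)), key=scores.__getitem__)
    return best, scores[best]

LENGTH = 1024
rng = random.Random(1234)
# Synthetic host data: values near mid-gray with mild variation (an assumption).
host = [128 + rng.gauss(0, 5) for _ in range(LENGTH)]
codebook = [make_sequence(seed, LENGTH) for seed in range(64)]

marked = embed(host, codebook[17], strength=2.0)
index, score = detect(marked, codebook)
print(index, round(score, 2))
```

In this sketch every detection performs one multiply-add per sample for every candidate sequence (64 x 1024 here), so the work grows with both the number of valid sequences and their length.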
Second, in order for the random sequence to be robust against noise, it must be different enough from other valid hidden data. Accordingly, the distance between each new sequence and other sequences must be checked to make sure that the sequences are not too close to each other. If they are too close, a new sequence must be generated and tested again. Again, these computations can take a very long time to complete.
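The generate-and-test procedure described above might look like the following sketch. It assumes binary sequences compared by Hamming distance; the function names, the distance threshold and the attempt limit are hypothetical and are not drawn from any specific prior art scheme.

```python
import random

def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))

def build_codebook(count, length, min_distance, seed=0, max_attempts=10000):
    """Draw random sequences, keeping one only if it is far enough from all kept so far."""
    rng = random.Random(seed)
    codebook = []
    attempts = 0
    while len(codebook) < count:
        if attempts >= max_attempts:
            raise RuntimeError("could not satisfy the distance constraint")
        attempts += 1
        candidate = [rng.randint(0, 1) for _ in range(length)]
        # Each candidate is compared against every previously accepted sequence.
        if all(hamming(candidate, kept) >= min_distance for kept in codebook):
            codebook.append(candidate)
    return codebook, attempts

codebook, attempts = build_codebook(count=8, length=32, min_distance=10)
print(len(codebook), attempts)
```

Each accepted sequence adds one more comparison to every later candidate, so the total number of distance computations grows roughly with the square of the codebook size, in addition to any rejected candidates that must be regenerated.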
Another common prior art data hiding technique is to repeat the same sequence at several different locations in the host data. However, the resulting hidden data are not robust against noise. This is especially true when the number of repetitions is small.
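The weakness of a small repetition count can be shown with a short sketch. It assumes the copies are laid out back to back, that noise appears as isolated bit flips, and that the copies are combined by majority vote; these are illustrative assumptions, and the function names are hypothetical.

```python
def repeat_encode(bits, repetitions):
    """Lay out `repetitions` full copies of the bit sequence back to back."""
    return list(bits) * repetitions

def flip(received, positions):
    """Model noise as bit flips at the given positions (an assumption)."""
    out = list(received)
    for p in positions:
        out[p] ^= 1
    return out

def majority_decode(received, length, repetitions):
    """Recover each bit by majority vote over its copies."""
    decoded = []
    for i in range(length):
        votes = sum(received[i + k * length] for k in range(repetitions))
        decoded.append(1 if 2 * votes > repetitions else 0)
    return decoded

bits = [1, 0, 1, 1]

# Three copies survive one corrupted copy of the first bit...
one_error = flip(repeat_encode(bits, 3), [0])
print(majority_decode(one_error, len(bits), 3) == bits)

# ...but not two corrupted copies of the same bit.
two_errors = flip(repeat_encode(bits, 3), [0, 4])
print(majority_decode(two_errors, len(bits), 3) == bits)

# Five copies tolerate the same two errors.
more_copies = flip(repeat_encode(bits, 5), [0, 4])
print(majority_decode(more_copies, len(bits), 5) == bits)
```

With r copies, majority voting of this kind recovers a bit only while fewer than half of its copies are corrupted, which is why a small number of repetitions offers little protection.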
Another significant disadvantage of the data hiding techniques described above, and of other prior art data hiding techniques, is that the host data are often significantly altered in order to facilitate retrieval of the hidden data. For example, the value of the host data at each location (e.g., a pixel) where hidden data are embedded needs to be increased or decreased by a significant percentage so that the hidden data will stand out. Consequently, surrounding pixels may also need to be changed so that the image is properly blended. This in turn limits the number and position of locations within the host data that can be altered without the change being detectable by the human visual system or compromising the accuracy of the host data.
In summary, prior art techniques for embedding hidden data into digital data suffer from a number of disadvantages. The hidden data generated using the prior art techniques are not robust against noise and are difficult to embed and retrieve. The prior art techniques can require significant alteration of the host digital image data, and so the number and position of possible locations within the digital data for placing the hidden data are limited.