With the increasing popularity of computers, and with the advent of mass networks such as the Internet, electronic distribution of content, such as documents, images, and sound, has become much more common. However, electronic content distribution has raised concerns for the creators of such content. For example, content creators wish to ensure that copyright and other authorship, ownership, and usage information is attached to their electronic content, such that subsequent attempts to determine the author, owner, and user of the content are successful. Furthermore, the creators wish to verify that their content has not been altered since first being distributed electronically.
More specifically, determining the identity of a copyright infringer or the identity of a violator of confidence is also increasingly important to protecting intellectual property. When the identity of such a perpetrator is determined, further damage to an organization's intellectual property can be stopped and restitution obtained by the victim. After infringement of copyright in electronic content, in conventional systems the identity of the perpetrators and conspirators has remained difficult, if not impossible, to determine.
Watermarking allows questions of ownership and use of a given piece of content (which may be widely distributed by virtue of the Internet, for example) to be resolved by attempting to decode an embedded secret from the content. That is, by watermarking content data, the data owner can determine whether a suspect piece of content is his or hers by determining whether the watermark is present in the suspect data. Watermarking is a technique used to label digital content by hiding copyright or other information in the underlying data. Unlike encryption, which is used to restrict access to data, watermarking can be employed to provide solid evidence of authorship and usage. As with data hiding generally, the watermark remains with the media through typical content-preserving manipulations, e.g., cropping, compression, and so forth. However, unlike data hiding generally, with watermarking an unauthorized user cannot access the embedded information (i.e., the watermark). In addition, the power of an effective watermarking system lies in its degree of robustness. Robustness ensures that the embedded watermark cannot be removed or tampered with without destroying or at least degrading the quality of the information.
Watermarks have conventionally been either visible, thereby potentially distracting from the content itself, or invisible, thereby being necessarily diminished in strength so as to avoid perceptible artifacts.
Visible watermarks in electronic documents have been employed as a means of preventing misuse of intellectual property by providing notice to all viewers that intellectual property rights are claimed in the document. In conventional systems, visible watermarks have been injected only at the original document creation to indicate ownership, and the perpetrators of later illicit use, such as authorized users who copy the document and/or distribute the document to unauthorized users, remain undetected. Conventional systems have therefore not been effective in identifying violations of copyright and trade secret after the misuse.
Existing invisible watermarking techniques can embed digital data bit strings indicating ownership identification. A bit string is an ordered sequence of bits. However, information embedded with watermarks is often lost or irretrievable following operations such as printing, scanning, photocopying, and other such signal conversions. More particularly, such techniques are intended only for bitmapped images such as natural scenes, and are not applicable to document images, where the degradation of quality is often excessive.
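By way of illustration only, the embedding of a bit string and its loss under signal conversion can be sketched with least-significant-bit (LSB) substitution, a common textbook technique that is not attributed here to any particular conventional system. The function names, the pixel values, the owner bit string, and the coarse requantization used as a stand-in for printing and scanning are all assumptions made for this sketch.

```python
def embed_bits(pixels, bits):
    """Write each bit into the least significant bit of successive pixel values."""
    if len(bits) > len(pixels):
        raise ValueError("bit string is longer than the pixel sequence")
    marked = list(pixels)
    for i, bit in enumerate(bits):
        marked[i] = (marked[i] & ~1) | bit
    return marked


def extract_bits(pixels, count):
    """Read back the least significant bit of the first `count` pixel values."""
    return [p & 1 for p in pixels[:count]]


def simulate_conversion(pixels, step=8):
    """Assumed stand-in for print/scan or photocopy: requantize to coarser levels."""
    return [min(255, (p // step) * step + step // 2) for p in pixels]


# Hypothetical 8-bit grayscale pixel values and a hypothetical ownership bit string.
pixels = [120, 121, 119, 200, 201, 55, 54, 56]
owner_bits = [1, 0, 1, 1, 0, 0, 1, 0]

marked = embed_bits(pixels, owner_bits)

# Extraction from the untouched marked data returns the embedded bit string.
recovered = extract_bits(marked, len(owner_bits))

# Extraction after the simulated conversion no longer matches the bit string.
degraded = extract_bits(simulate_conversion(marked), len(owner_bits))
```

In this sketch each pixel value changes by at most one gray level, which is why such a mark is imperceptible, and for the same reason even a mild requantization overwrites the bits that carry it.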
Ownership identification information is any information that describes a user, such as ownership and usage information, identification information, a process identification number (PID), the type of authorized use, or the number of copies permitted. User identification information includes the name, the phone number, the computer system logon identification name, or the social security number of the user.
For the reasons stated above, and for other reasons stated below which will be appreciated by those skilled in the art upon reading and understanding the present specification, there is a need in the art for tracing the users of documents. There is also a need for improved marking and extraction of marks from documents.