The present invention describes a method for estimating and recovering from general affine geometrical transformations, and is extensible to any other defined class of geometrical transforms. The main applications of the invention are robust digital still image/video watermarking, document authentication, and detection of periodical or hidden patterns. In the case of periodical watermarks, the watermark can also be predistorted before embedding, based on a key, to defeat block-by-block removal attacks. These applications of the invention are based on a common methodology that assumes the existence of a periodical or known regular structure (either visible or perceptually invisible) in the body of the visual document. Once these structures have been estimated, the transformations undergone by the image can be determined from this estimate.
In watermarking applications, a perceptually invisible periodical pattern of possibly encrypted and encoded data is embedded in the structure of the visual document for copyright protection, document authentication or tamperproofing. In the other applications, the invisible pattern contains the necessary information about user/owner, index, ID number, relative coordinates and so on. The invisible pattern is further used for the detection of undergone geometrical transformations, for document indexing, or generally for recognition using estimated parameters of the embedded patterns. Therefore, the main practical challenge is the robust detection and estimation of the parameters of the hidden patterns, which is the subject of the proposed approach.
State-of-the-art methods capable of estimating and recovering from undergone geometrical transformations can be divided into several groups depending on the reference structure used and on the method applied to estimate the parameters of the affine or other class of geometrical transform. We will mostly concentrate our review on the still image watermarking application of the proposed approach. Obviously, the approach is easily applicable to the rest of the above-mentioned tasks with minimal modifications in the basic method structure.
Digital image watermarking has emerged as an important tool for author copyright protection and document authentication. A number of methods (see [2] for a detailed review) have been proposed since the first publications on this subject in 1994. In [3], for example, important issues of watermarking system robustness were pointed out. One of the most important questions in the practical application of digital image watermarking is robustness against geometrical attacks such as rotation, scaling, cropping, translation, change of aspect ratio and shearing. All these attacks can be uniquely described using the apparatus of affine transformations [4,5].
An affine transformation can be represented by the 4 coefficients $a, b, c, d$ composing the matrix $A$ for the linear part, plus the coefficients $v_x, v_y$ of the translation vector $\vec{v}$:

$$A = \begin{pmatrix} a & b \\ c & d \end{pmatrix}, \quad \vec{v} = \begin{pmatrix} v_x \\ v_y \end{pmatrix} \tag{G1}$$

Therefore, an affine transformation maps each point of Cartesian coordinates $(x, y)^T$ to $(x', y')^T$, according to the formula:

$$\begin{pmatrix} x' \\ y' \end{pmatrix} = A \cdot \begin{pmatrix} x \\ y \end{pmatrix} + \vec{v} \tag{G2}$$

where "$\cdot$" is the matrix product, and "+" the vector sum. However, since translations can be easily and independently determined, for example based on cross-correlation with some embedded reference information, in the following we will only consider the linear part $A$.
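As a minimal illustration of (G2), the following Python sketch applies an affine map to a single point. The helper name `apply_affine` and the numeric values are assumptions chosen for the example; they are not part of the described method.

```python
def apply_affine(A, v, point):
    """Map (x, y) to (x', y') = A . (x, y) + v, as in (G2).

    A is ((a, b), (c, d)) and v is (vx, vy). Hypothetical helper, for illustration only.
    """
    (a, b), (c, d) = A
    x, y = point
    return (a * x + b * y + v[0], c * x + d * y + v[1])


# Assumed example values: uniform rescaling by 2, then a translation of (1, -1).
A = ((2.0, 0.0), (0.0, 2.0))
v = (1.0, -1.0)
print(apply_affine(A, v, (3.0, 4.0)))  # (7.0, 7.0)
```

Setting `v` to `(0.0, 0.0)` leaves only the linear part, which is the case considered in the remainder of this review.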
The successive combination of $n$ affine transforms $A_i$, $i = 1 \ldots n$, yields another affine transform, and can be expressed as:

$$A = A_n \cdot A_{n-1} \cdots A_1 \tag{G3}$$

(ignoring the translation components). Below are given examples of simple affine transforms:

a rescaling of factor $s$ is represented by:

$$S = \begin{pmatrix} s & 0 \\ 0 & s \end{pmatrix} \tag{G4}$$

a change of aspect ratio of factors $s_x, s_y$, which is equivalent to different rescalings along the x- and the y-axis, is:

$$S' = \begin{pmatrix} s_x & 0 \\ 0 & s_y \end{pmatrix} \tag{G5}$$

a rotation of angle $\theta$ is:

$$R = \begin{pmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{pmatrix} \tag{G6}$$

a shearing along the x- and the y-axis of factors $s'_x, s'_y$ is:

$$S_h = \begin{pmatrix} 1 & s'_x \\ s'_y & 1 \end{pmatrix} \tag{G7}$$

horizontal and vertical flipping are also affine transforms, respectively given by:

$$F_H = \begin{pmatrix} -1 & 0 \\ 0 & 1 \end{pmatrix}, \quad F_V = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} \tag{G8}$$
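The composition rule (G3) can be sketched in a few lines of Python. The helper names (`matmul2`, `compose`, `rotation`, `aspect`) and the example angle and factors are assumptions made for illustration. The sketch shows that the order of application matters: a rotation followed by a change of aspect ratio is not the same transform as the reverse sequence.

```python
import math


def matmul2(P, Q):
    """Product P . Q of two 2x2 matrices given as nested tuples."""
    return tuple(
        tuple(sum(P[i][k] * Q[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )


def compose(*transforms):
    """Return A = A_n . A_(n-1) ... A_1 for transforms listed in order of application."""
    A = ((1.0, 0.0), (0.0, 1.0))
    for T in transforms:
        A = matmul2(T, A)  # each newly applied transform multiplies from the left
    return A


def rotation(theta):
    """Rotation matrix with the sign convention of (G6)."""
    c, s = math.cos(theta), math.sin(theta)
    return ((c, s), (-s, c))


def aspect(sx, sy):
    """Change of aspect ratio as in (G5)."""
    return ((sx, 0.0), (0.0, sy))


R = rotation(math.pi / 2)             # assumed example angle
S = aspect(2.0, 1.0)                  # assumed example factors
rotate_then_stretch = compose(R, S)   # equals S . R
stretch_then_rotate = compose(S, R)   # equals R . S
```

Up to floating-point rounding, `rotate_then_stretch` is ((0, 2), (-1, 0)) while `stretch_then_rotate` is ((0, 1), (-2, 0)), so an estimator must recover the full matrix rather than a list of independent parameters.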
Depending on the reference features used, the existing methods for recovering from affine transformations can be divided into 3 main groups: methods using a transform invariant domain [4], methods based on an additional template [5], and methods exploiting the self-reference principle based on an auto-correlation function (ACF) [6,7] or magnitude spectrum (MS) of the periodical watermarks [8].
The transform invariant domain approach entirely alleviates the need for estimating the affine transformation. It consists in the application of the Fourier-Mellin transform to the magnitude spectrum of the cover image. Watermarking in the invariant domain consists in the modulation of the invariant coefficients using some specific kind of modulation. The inverse mapping is computed in the opposite order. Although this approach is mathematically very elegant, it suffers from several drawbacks. First, the logarithmic sampling of the log-polar map must be adequately handled to overcome interpolation errors and provide sufficient accuracy. Therefore, the image size should be sufficiently large, typically at least about 500×500 pixels. Additionally, this approach is unable to recover changes of the aspect ratio; to handle such an aspect ratio change, a log-log mapping can be used. It is however impossible to simultaneously recover from rotation and rescaling (accomplished by the log-polar mapping) and from an aspect ratio change (which requires a log-log mapping).
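The limitation of the log-polar mapping can be illustrated on individual points. The Python sketch below is not the Fourier-Mellin transform itself; it only shows the coordinate property on which it relies, under assumed example values. A rotation combined with a uniform rescaling shifts every point by the same amount in (log r, angle) coordinates, whereas a change of aspect ratio does not.

```python
import math


def log_polar(x, y):
    """Return (log r, angle) for a point away from the origin."""
    return (math.log(math.hypot(x, y)), math.atan2(y, x))


def shift(transform, p):
    """Displacement of point p in log-polar coordinates under a transform."""
    before = log_polar(*p)
    after = log_polar(*transform(*p))
    return (after[0] - before[0], after[1] - before[1])


theta, s = math.radians(10), 1.5  # assumed example rotation and rescaling


def rotate_and_scale(x, y):
    """Rotation as in (G6) followed by a uniform rescaling as in (G4)."""
    c, sn = math.cos(theta), math.sin(theta)
    return (s * (c * x + sn * y), s * (-sn * x + c * y))


def change_aspect(x, y):
    """Change of aspect ratio as in (G5), with assumed factors (2, 1)."""
    return (2.0 * x, 1.0 * y)


points = [(3.0, 1.0), (1.0, 2.0)]
uniform_shifts = [shift(rotate_and_scale, p) for p in points]
aspect_shifts = [shift(change_aspect, p) for p in points]
```

Both entries of `uniform_shifts` equal (log s, -theta) up to rounding, so a single translation of the log-polar map describes the attack. The entries of `aspect_shifts` differ from point to point, which is why a log-polar map alone cannot capture an aspect ratio change.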
To overcome the problem of poor image quality due to the direct and inverse Fourier-Mellin transform and associated interpolation errors, the template approach might be used. The template itself does not contain any payload information and is only used to recover from geometrical transformations. Early methods have applied a log-polar or a log-log transformation to the template [5,9]. However, the above mentioned problem of simultaneously recovering from rotation and change of aspect ratio still exists.
A recent proposal [10] aims at overcoming the above problem using the general affine transform paradigm. However, the necessity to spend the bounded watermark energy on an extra template, and the threat that attackers would remove template peaks, led to the use of a self-reference method based on the ACF [6,7], which utilizes the same affine paradigm. A similar approach based on the ACF for the identification of the geometrical transforms in non-watermarking applications was proposed in [11]. In [7] the watermark is replicated in the image in order to create 4 repetitions of the same watermark. This yields 9 peaks in the ACF, which are used to estimate the undergone geometrical transformations. The descending character of the amplitude of the ACF peaks (shaped by a triangular envelope) reduces the robustness of this approach to geometrical attacks accompanied by lossy compression. The need for computing two discrete Fourier transforms (DFT) of double image size to estimate the ACF also creates some problems for real-time application in the case of large images.
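The 9 peaks and their triangular envelope can be reproduced on a toy example. The Python sketch below tiles an assumed random ±1 block of size T×T into 4 repetitions and evaluates the linear autocorrelation at shifts that are multiples of T. For brevity it uses a direct sum rather than the DFT-based computation mentioned above; the block size, the seed and the helper names are assumptions.

```python
import random


def tile_2x2(w):
    """Replicate a T x T block to obtain a 2T x 2T pattern with 4 repetitions."""
    row_doubled = [row + row for row in w]
    return row_doubled + [row[:] for row in row_doubled]


def acf(img, dy, dx):
    """Linear (non-circular) autocorrelation of a square image at shift (dy, dx)."""
    n = len(img)
    total = 0
    for y in range(n):
        for x in range(n):
            yy, xx = y + dy, x + dx
            if 0 <= yy < n and 0 <= xx < n:  # only the overlapping area contributes
                total += img[y][x] * img[yy][xx]
    return total


T = 4                    # assumed block size
rng = random.Random(0)   # assumed seed, for repeatability only
w = [[rng.choice((-1, 1)) for _ in range(T)] for _ in range(T)]
img = tile_2x2(w)

# ACF values at the 9 shifts (i*T, j*T), with i, j in {-1, 0, 1}.
peaks = {(i, j): acf(img, i * T, j * T) for i in (-1, 0, 1) for j in (-1, 0, 1)}
```

Because the pattern repeats with period T, each peak equals the overlapping area: 64 at the origin, 32 at the four axial shifts and 16 at the four diagonal shifts. The outer peaks are therefore the weakest and the first to be lost under compression.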
The above considerations show the need for being able to estimate the undergone affine transformations. The algorithms for performing this estimation can be divided into 2 categories:
algorithms based on log-polar and log-log mapping [4,5];
algorithms performing some constrained exhaustive search aiming at the best fitting of a reference pattern with the analyzed one [7,10].
These approaches have several drawbacks from the robustness, uniqueness and computational complexity points of view. Methods in the first category are able to estimate rotation and scaling based on the log-polar map (LPM). The log-log map (LLM) enables estimation of changes in aspect ratio [5,12]. However, estimation of several simultaneous transformations or of general affine transforms cannot be accomplished. Moreover, these methods are quite sensitive to the accuracy of the mapping and to distortions introduced by lossy JPEG compression. In the second category, the approach proposed by Shelby Pereira and Thierry Pun is potentially able to recover from general affine transformations. However, it is based on a constrained exhaustive search procedure, and when the number of reference points increases, the computational complexity of verifying all combinations of the sets of matched points can become quite high. Also, results reported in [13] show that the efficiency is not very high against scaling when using the Fourier magnitude template as a reference watermark. Moreover, false points in the magnitude spectrum, due to lossy compression or any other distortion, can cause artifacts in the detected local peaks that considerably complicate the search procedure, thereby resulting in a lower robustness of the watermarking algorithm in general. It is necessary to mention that a priori information about the specific regular geometry of the template points was not used in this approach. The template consists of a random set of points located in the spatial mid-frequency band of the images.
To overcome the above-mentioned difficulties we propose to utilize the information about the regular structure of the template, or of the ACF or the MS of the periodically repeated watermark [8]. This makes it possible to treat a template with a periodical structure, or the spectrum of the periodically repeated watermark, as a regular grid or as a set of lines with a given period and orientation. Therefore, keeping in mind this discrete approximation of the grid of lines, one can exploit a Hough transform (HT) [14] or a Radon transform (RT) [15] in order to obtain a robust estimate of the general affine transform matrix.
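The following Python sketch illustrates why a regular grid lends itself to a Hough-type analysis. It is a simplified illustration under stated assumptions, not the estimation procedure of the invention: the grid is synthetic, it has only been rotated (with the convention of (G6)), and each angle is scored by the sum of squared accumulator counts. The period, grid size, rotation angle and helper names are all assumed.

```python
import math
from collections import Counter


def rotated_grid(period, half, alpha):
    """Points (i*period, j*period), |i|, |j| <= half, rotated by alpha as in (G6)."""
    c, s = math.cos(alpha), math.sin(alpha)
    pts = []
    for i in range(-half, half + 1):
        for j in range(-half, half + 1):
            x, y = i * period, j * period
            pts.append((c * x + s * y, -s * x + c * y))
    return pts


def hough_scores(points, n_angles=180):
    """Vote rho = x cos(t) + y sin(t) into unit-wide bins for each angle t (degrees).

    Returns one score per angle (sum of squared bin counts) and the accumulators.
    """
    scores, accumulators = [], []
    for deg in range(n_angles):
        t = math.radians(deg)
        acc = Counter(round(x * math.cos(t) + y * math.sin(t)) for x, y in points)
        accumulators.append(acc)
        scores.append(sum(v * v for v in acc.values()))
    return scores, accumulators


points = rotated_grid(period=10, half=3, alpha=math.radians(20))
scores, accumulators = hough_scores(points)

# The two best angles are the normals of the two families of grid lines.
best = sorted(range(180), key=lambda d: -scores[d])[:2]
rhos = sorted(accumulators[best[0]])
spacing = rhos[1] - rhos[0]  # distance between adjacent parallel lines
```

For this synthetic grid the two dominant angles are 70° and 160°, that is 90° and 180° minus the applied rotation of 20°, and the line spacing is the period of 10. All 49 points vote for the same two orientations, so a few missing or spurious points change the scores only marginally.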
This approach has a number of advantages in comparison with the previous methods. First, it is very general, which makes it possible to estimate and recover from any affine transformation or combination of sequentially applied affine transformations. Moreover, the false peaks or outliers on the grid due to lossy compression or possibly to other attacks do not decrease the robustness of the approach due to the redundancy of the peaks. Therefore, the proposed approach is tolerant even to very strong lossy compression, which is not the case for the known methods. Finally, the strict mathematical apparatus of the HT or RT alleviates the need for an exhaustive search.
Martin Kutter and Chris Honsinger [7,11] proposed to use the ACF to find the possible geometrical modifications applied to the image. The reported results rely on 2 or 4 repetitions of the same mark. It should be noted that this approach generates only 3 or 9 peaks, respectively, in the ACF. With such a small number of peaks, any compression or other signal degradation artifact can cause an ambiguity in the estimation of the affine transform parameters. In contrast, the MS approach proposed earlier by us can result in a higher robustness due to the high redundancy even in the above case; if the watermark has been embedded many times, one can also use the ACF to get an accurate approximation of the underlying regular structure.
Finally, we want to mention that we further proposed an extension of our approach aiming at resistance to non-linear or local random distortions introduced by the random bending attack (RBA) [1,16]. In that situation, the RBA can also be expressed in terms of a number of local affine transforms. Therefore, the determination of the undergone transformation at the global level could be combined with recovery from the RBA at the local level.
From the above review we can conclude that the existing technologies exhibit at least one of the following problems:
1. Inability to recover from general affine transformations.
2. High computational complexity of the exhaustive search in the case of many reference points.
3. Low robustness against geometrical transformations accompanied by lossy JPEG or wavelet compression.
4. Inability to recover from the combination of several affine transforms.
5. Lack of protection against intentional template removal, especially when the number of reference points is comparatively small (less than a hundred).