Researchers have used normalized cross-correlation values to determine how two images align with one another. For example, if two images are taken of the same object, it may be helpful to align the images to find points of commonality between the images. To determine the alignment which provides the best match between the images, the first image is shifted relative to the second image, pixel by pixel. For each possible shift, a normalized cross-correlation (NCC), ρ, is calculated. Once an NCC value is calculated for each possible shift, the maximum NCC value is identified and provides a determination of the best alignment between the images. This maximum NCC value is identified as ρ(s_peak); i.e., the normalized cross-correlation associated with the shift resulting in the maximum NCC value. The distribution of NCC values will depend on the composition of the images, or more specifically, their pixel values. It is beneficial to evaluate the NCC values independent of the image composition to allow for the selection of a universal match threshold across images. An evaluation independent of the image composition is estimated by calculating a correlation energy (CE). The CE is provided by a ratio of the NCC value associated with a particular shift to a noise floor, where the noise floor represents an average of all NCC values associated with non-matching alignments of the images. Once a CE value for each possible shift has been determined, the maximum CE value identifies the shift associated with the best alignment of the images. Preferably, the CE for the best-aligning shift will be far greater than the CE associated with all other shifts. In this way, it is easy to identify which shift provides the best alignment between the images.
In calculating the noise floor, later researchers observed that the NCC value for a particular shift was often similar to the NCC values for neighboring shifts. Similar NCC values for neighboring shifts were observed because the characteristics of an image at neighboring pixels are often similar. An image 20 illustrating this observation is provided in FIG. 1, where characteristics of the image pixels at a first location 22 are compared to characteristics of the image pixels at a second location 24. FIG. 1A illustrates an enlargement of the image 20 at the first location 22 to illustrate characteristics of the image pixels 22a at this location. FIG. 1B illustrates an enlargement of the image 20 at the second location 24 to illustrate characteristics of the image pixels 24a at this location. It is to be understood that the enlargements 22a, 24a of the locations 22, 24 are not drawn to scale and are only used for illustrative purposes. FIG. 1A illustrates central pixel A along with neighboring pixels B1-B8. FIG. 1B illustrates central pixel C5 along with neighboring pixels C1-C4 and C6-C9. As illustrated in FIGS. 1A and 1B, the characteristics of image pixel A are similar to the characteristics of the neighboring image pixels B1-B8, and the characteristics of image pixel C5 are similar to the characteristics of the neighboring image pixels C1-C4 and C6-C9. In contrast, the characteristics of image pixel A are not similar to the characteristics of the distant image pixels C1-C9, and the characteristics of image pixel C5 are not similar to the characteristics of the distant image pixels A and B1-B8. The similarity of the image characteristics at neighboring image pixels results in similar NCC values for neighboring pixels. For example, if the NCC value at pixel A is high, the NCC values associated with pixels B1-B8 will likely also be high, artificially inflating the noise floor calculation. Consequently, all of the CE values calculated from the NCC values will be similar (i.e., will fall within a relatively small range), making it difficult to identify the best alignment of the images, particularly when a visualization tool is utilized to make this identification.
To compensate for the similarity of the neighboring image pixels, researchers provided for a calculation of the noise floor which eliminates NCC values for shifts neighboring a chosen shift value. Specifically, rather than using all of the NCC values to calculate the noise floor, the noise floor is calculated using only the NCC values corresponding to shifts outside of a chosen neighborhood of a shift. The NCC values are still calculated for each possible shift. In calculating the noise floor for a selected shift, however, the NCC values for shifts within an 11 pixel by 11 pixel neighborhood surrounding the selected shift are not included. The 11×11 pixel neighborhood is provided by an 11×11 portion of the shifted image, where the 11×11 neighborhood is a square centered at the selected pixel. See FIG. 1A for an example of a 3×3 neighborhood centered at pixel A. Thus, a shift-specific noise floor is provided for each possible shift.
Utilizing the shift-specific noise floor, the CE for each possible shift is calculated. Although the calculation of each CE requires the calculation of a shift-specific noise floor, the resulting CE values provide a clearer distinction between the higher and lower NCC values and therefore provide an easier indication of the best alignment of the images.
The CE calculation utilized to align images was later used in camera fingerprinting. When a digital camera takes an image, the shutter opens, and light from the scene photographed travels through the camera lens and is registered on the imaging sensor, such as a charge coupled device. Ideally, the amount of charge output at each pixel on the imaging sensor should represent the true light intensity entering the camera. However, natural physical variations in the imaging sensor imprint a distinctive sensor pattern noise, or “fingerprint” on the image.
Just as a fingerprint uniquely identifies an individual, sensor pattern noise on an image uniquely links the image to a specific physical camera. Researchers established that it is possible to extract a sensor pattern from image noise that is unique to a particular camera, acting as the camera's “fingerprint.” Their method for linking a digital image to a particular camera is stable and robust.
$I_{i,j}$ is utilized to denote the image signal at a pixel (i, j) in one color channel, where i=1 . . . m, j=1 . . . n, and m, n are the image dimensions. I0 is utilized to denote the ideal sensor output in the absence of noise or imperfections, i.e., the “true scene” or noiseless image. Dropping indices for readability, a sensor output is represented as:
$$I = I_0 + I_0 K + \theta$$
where K is responsible for the camera's sensor fingerprint and θ is a statistical noise term. It is noted that all operations are element-wise, as opposed to matrix operations. K represents the camera's photo response non-uniformity (PRNU) factor and can be estimated from a set of images. An estimate of K, which can be denoted as $\hat{K}$, serves as the camera's fingerprint. As illustrated in FIG. 2, a plurality of images 30 are utilized to produce an estimated fingerprint 32. A portion of the estimated fingerprint 32 is enlarged at 34 to illustrate the variation in the values of the fingerprint pixels.
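The element-wise sensor model can be illustrated with a small simulation. The following is only a sketch under stated assumptions: the function names, noise levels, and image sizes are hypothetical, and the estimator shown assumes the noiseless scene I0 is known for every image, which is not the case in practice and is not the estimation method used by the researchers.

```python
import numpy as np

def simulate_sensor_output(I0, K, noise_sigma, rng):
    """Apply I = I0 + I0*K + theta, with every operation element-wise."""
    theta = rng.normal(0.0, noise_sigma, size=I0.shape)  # statistical noise term
    return I0 + I0 * K + theta

def estimate_fingerprint_known_scene(images, scenes):
    """Toy estimate of K: average (I - I0) / I0 over a set of images.

    Assumption for illustration only: the true scene I0 is available.
    """
    ratios = [(I - I0) / I0 for I, I0 in zip(images, scenes)]
    return np.mean(ratios, axis=0)

rng = np.random.default_rng(0)
shape = (32, 32)
K_true = rng.normal(0.0, 0.02, size=shape)            # hypothetical PRNU factor
scenes = [rng.uniform(50.0, 200.0, size=shape) for _ in range(50)]
images = [simulate_sensor_output(I0, K_true, 1.0, rng) for I0 in scenes]

K_hat = estimate_fingerprint_known_scene(images, scenes)
corr = np.corrcoef(K_true.ravel(), K_hat.ravel())[0, 1]
print(f"correlation between K and its estimate: {corr:.3f}")
```

Averaging over many images suppresses the noise term θ while the fingerprint K, being common to every image, remains.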
Once a camera fingerprint has been determined, the likelihood that a given image (i.e., a query image) was taken with that camera is tested using a two-channel hypothesis. The notation K1 represents the fingerprint of the camera in question. The notation K2 represents the fingerprint of a hypothetical second camera used to take the query image (i.e., the query fingerprint). A first hypothesis, H0, provides that if K1≠K2, then the query image was not taken with the camera in question. The second hypothesis, H1, provides that if K1=K2, then the query image was taken with the camera in question. For simplicity, it is assumed that the query image has not undergone any geometrical processing, except perhaps cropping. To determine whether the query fingerprint is a good match with the estimated camera fingerprint, $\hat{K}_1$, it is first determined how the possibly cropped query image overlays or aligns with the camera fingerprint, $\hat{K}_1$. That is, the shift (sx, sy) of the query fingerprint relative to the camera fingerprint, or vice versa, that will best align the two fingerprints is determined. sx represents a shift in the horizontal direction and sy represents a shift in the vertical direction.
The NCC calculations which provided earlier researchers with a tool for aligning two images are likewise utilized to align the camera fingerprint, K1, with the query fingerprint, K2. It is noted that the notation $\hat{K}$ has been replaced with K for simplification. The NCC, ρ, between the two fingerprints is calculated for all possible shifts utilizing:
$$\rho(s_x, s_y; K_1, K_2) = \frac{\sum_{i=1}^{m} \sum_{j=1}^{n} \left(K_1[i,j] - \bar{K}_1\right)\left(K_2[i+s_x, j+s_y] - \bar{K}_2\right)}{\left\lVert K_1 - \bar{K}_1 \right\rVert \, \left\lVert K_2 - \bar{K}_2 \right\rVert}$$
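The NCC over all shifts can be sketched directly from the double sum. This is a minimal illustration, not the researchers' implementation: the function name is hypothetical, both fingerprints are assumed to have the same dimensions, and the shifted indices are assumed to wrap around cyclically at the borders. The first array index is shifted by sx and the second by sy, mirroring the index order of the formula.

```python
import numpy as np

def ncc_all_shifts(K1, K2):
    """Return rho[sx, sy] for every cyclic shift of K2 relative to K1."""
    A = K1 - K1.mean()                      # K1 minus its mean
    B = K2 - K2.mean()                      # K2 minus its mean
    denom = np.linalg.norm(A) * np.linalg.norm(B)
    m, n = K1.shape
    rho = np.empty((m, n))
    for sx in range(m):
        for sy in range(n):
            # np.roll with negative shifts gives shifted[i, j] = B[i+sx, j+sy]
            shifted = np.roll(B, shift=(-sx, -sy), axis=(0, 1))
            rho[sx, sy] = np.sum(A * shifted) / denom
    return rho

rng = np.random.default_rng(1)
K1 = rng.normal(size=(16, 16))
K2 = np.roll(K1, shift=(3, 5), axis=(0, 1))   # K2[i+3, j+5] == K1[i, j]
rho = ncc_all_shifts(K1, K2)
peak_shift = np.unravel_index(np.argmax(rho), rho.shape)
print("best-aligning shift:", peak_shift)
```

The nested loop costs on the order of (mn)² operations; for realistic image sizes the same cyclic correlation is usually computed with a fast Fourier transform instead.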
The highest NCC value should identify the best alignment between a query fingerprint and a camera fingerprint. As noted above in connection with the process for aligning two images, a correlation energy ratio (CE) provides for normalization of the NCC value relative to a noise floor and therefore depends less on the image composition than would a simple comparison of the NCC values. Similarly, when comparing fingerprints, it is preferable to utilize a CE value rather than an NCC value in order to diminish the impact of the image composition. In addition, use of the CE values allows for selection of a universal threshold to be used for all fingerprint comparisons.
Following the earlier method for matching images, which provided for the elimination of neighboring NCC values from the noise floor calculation, researchers working with fingerprints again eliminated the neighboring NCC values from the noise floor calculation. The correlation energy for each shift was calculated utilizing:
$$CE = \frac{\rho(s_x, s_y; K_1, K_2)^2}{\dfrac{1}{mn - |N|} \sum_{s,\, s \notin N_s} \rho(s_x, s_y; K_1, K_2)^2}$$
where
$$\frac{1}{mn - |N|} \sum_{s,\, s \notin N_s} \rho(s_x, s_y; K_1, K_2)^2$$
provides a calculation of the noise floor and $N_s$ identifies shifts in a neighborhood of shift s. Thus, similar to the method for comparing images, the method for comparing fingerprints provides for the calculation of a shift-specific noise floor for each possible shift. More specifically, NCC values were calculated for all possible shifts; a neighborhood surrounding the shift was identified; for each shift, a shift-specific noise floor was calculated (eliminating the NCC values for neighboring shifts); and the CE value was calculated at each shift using the shift-specific noise floor.
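The shift-specific noise floor and the resulting CE values can be sketched as follows. This is an illustration under assumptions rather than the researchers' code: the function name and the `half_width` parameter are hypothetical, the neighborhood is taken to be a square of side 2·half_width+1 (11×11 by default) that wraps cyclically at the borders, and the NCC map is assumed to be larger than the neighborhood in both dimensions. The synthetic NCC map is made up for the demonstration.

```python
import numpy as np

def correlation_energy_map(rho, half_width=5):
    """CE at every shift, excluding a square neighborhood from the noise floor."""
    m, n = rho.shape
    sq = rho ** 2
    total = sq.sum()
    offsets = np.arange(-half_width, half_width + 1)
    ce = np.empty_like(sq)
    for sx in range(m):
        rows = (sx + offsets) % m               # cyclic neighborhood rows
        for sy in range(n):
            cols = (sy + offsets) % n           # cyclic neighborhood columns
            neigh = sq[np.ix_(rows, cols)]      # squared NCC values inside N_s
            # average squared NCC over the mn - |N| shifts outside N_s
            floor = (total - neigh.sum()) / (m * n - neigh.size)
            ce[sx, sy] = sq[sx, sy] / floor
    return ce

# Synthetic NCC map: low-level noise plus a peak whose neighbors are also raised.
rng = np.random.default_rng(2)
rho = rng.normal(0.0, 0.01, size=(32, 32))
rho[9:12, 19:22] += 0.15
rho[10, 20] += 0.15

ce = correlation_energy_map(rho)
peak = np.unravel_index(np.argmax(ce), ce.shape)
global_ce_at_peak = rho[10, 20] ** 2 / np.mean(rho ** 2)   # floor over all shifts
print("peak shift:", peak)
print("CE with shift-specific floor:", ce[peak], "vs global floor:", global_ce_at_peak)
```

In this synthetic case the raised neighbors inflate a noise floor computed over all shifts, so the CE at the peak is noticeably larger when the neighborhood is excluded. The cost is a separate noise floor for every shift.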
Once the CE values at each shift are calculated, the CE values are then compared to determine the maximum, or peak CE value, PCE. This PCE value identifies the shift providing the best alignment of the camera fingerprint and the query fingerprint. A CE threshold is set, and the PCE is compared to the CE threshold. In the event the PCE exceeds the CE threshold, the hypothesis, H0, is rejected and it is determined that the query image originated from the camera in question; i.e., the hypothesis H1, is supported. Researchers at Binghamton University empirically found a CE threshold of 60 to be a reasonable value for matching an image to a camera fingerprint.
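The decision step can be sketched as a comparison of the peak CE value against the threshold. The function name and the two small CE maps below are hypothetical and made up for illustration; only the threshold of 60 comes from the text.

```python
import numpy as np

def match_by_pce(ce_map, threshold=60.0):
    """Return (H0 rejected?, PCE value, shift at which the PCE occurs)."""
    peak_index = np.unravel_index(np.argmax(ce_map), ce_map.shape)
    pce = float(ce_map[peak_index])
    shift = tuple(int(v) for v in peak_index)
    return pce > threshold, pce, shift

# Hypothetical CE maps: one with a dominant peak, one without.
ce_match = np.ones((8, 8))
ce_match[2, 6] = 450.0
ce_no_match = np.full((8, 8), 1.5)
ce_no_match[4, 1] = 9.0

print(match_by_pce(ce_match))      # PCE above the threshold: H0 rejected
print(match_by_pce(ce_no_match))   # PCE below the threshold: H0 not rejected
```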
In some cases, visualization tools are utilized to analyze the CE values. FIG. 3 illustrates the use of a visualization tool to analyze CE values. The visualization tool provides a plot 50 wherein the CE value associated with each shift is represented by plotting a vertical line 52 at a location associated with the shift. The height of the plotted vertical line 52 represents the CE value. It is preferable to plot the CE values associated with all shifts on a single plot 50 in order to make comparisons between the CE values. In order to plot all CE values, the vertical line 52 associated with each CE value must have limited width. The resulting CE graph provides a large number of dense vertical lines 52 of limited width. This density and limited width make it difficult to identify the PCE. For example, in FIG. 3 the PCE is provided at the location associated with shift (0, 0). Given the nature of the CE plot, however, it is difficult to identify the PCE.
It is noted that this method for comparing a camera fingerprint and a query fingerprint requires a significant number of calculations. In particular, in order to calculate the CE value for a particular shift, a shift-specific noise floor must first be calculated. Further, the shift-specific noise floor requires identification of shift-specific neighboring NCC values. It is further noted that the visualization tool utilized for evaluating the CE values does not allow for the determination of a PCE with ease. An improved method for comparing camera and query fingerprints is therefore required.