A. Santos, “Evaluation of autofocus functions in molecular cytogenetic analysis”, Journal of Microscopy, Vol 188, Pt 3, December 1997, pp 264-272 assesses a number of known sharpness measures. These can be classified into five main groups as follows:

A. Functions based on image differentiation such as:

    1. Threshold absolute gradient

    $$Sh_{th\text{-}grad} = \sum_{M} \sum_{N} \left| g(i, j+1) - g(i, j) \right| \quad \text{while} \quad \left| g(i, j+1) - g(i, j) \right| > thr,$$

    where g(i,j) is the gray level of pixel (i,j).

    2. Tenengrad function

    $$Sh_{tenengrad} = \sum_{M} \sum_{N} T[g(i, j)]$$

    where T[g(i,j)] is the square of the gradient value at pixel (i,j).

B. Functions based on depth of peaks and valleys

C. Functions based on image contrast

D. Functions based on histogram

E. Functions based on correlation measurements including:

    Vollath's F4 (based on the autocorrelation function, very good performance in presence of noise)

    $$Sh_{VollathF4} = \sum_{i=1}^{M-1} \sum_{j=1}^{N} \left( g(i, j) \cdot g(i+1, j) \right) - \sum_{i=1}^{M-2} \sum_{j=1}^{N} \left( g(i, j) \cdot g(i+2, j) \right)$$
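The three measures written out above can be sketched in plain Python as follows. This is an illustrative reading of the formulas, not code from the cited paper: the image `g` is assumed to be a list of rows of gray levels, the 1-based indices of the formulas become 0-based, and the Sobel operator used for the Tenengrad gradient is an assumption (the text only states that T is the square of the gradient value). All function names are hypothetical.

```python
def threshold_abs_gradient(g, thr):
    """Sum |g(i, j+1) - g(i, j)| over neighbouring pixels whose difference exceeds thr."""
    total = 0
    for row in g:
        for j in range(len(row) - 1):
            diff = abs(row[j + 1] - row[j])
            if diff > thr:
                total += diff
    return total


def tenengrad(g):
    """Sum of squared gradient magnitudes over interior pixels.

    Assumption: the gradient is estimated with 3x3 Sobel kernels.
    """
    rows, cols = len(g), len(g[0])
    total = 0
    for i in range(1, rows - 1):
        for j in range(1, cols - 1):
            # Horizontal Sobel response (right column minus left column).
            gx = (g[i - 1][j + 1] + 2 * g[i][j + 1] + g[i + 1][j + 1]
                  - g[i - 1][j - 1] - 2 * g[i][j - 1] - g[i + 1][j - 1])
            # Vertical Sobel response (bottom row minus top row).
            gy = (g[i + 1][j - 1] + 2 * g[i + 1][j] + g[i + 1][j + 1]
                  - g[i - 1][j - 1] - 2 * g[i - 1][j] - g[i - 1][j + 1])
            total += gx * gx + gy * gy
    return total


def vollath_f4(g):
    """Lag-1 autocorrelation minus lag-2 autocorrelation along the i index."""
    rows, cols = len(g), len(g[0])
    lag1 = sum(g[i][j] * g[i + 1][j] for i in range(rows - 1) for j in range(cols))
    lag2 = sum(g[i][j] * g[i + 2][j] for i in range(rows - 2) for j in range(cols))
    return lag1 - lag2


# A hard step edge versus the same edge smeared into a ramp (synthetic data).
sharp = [[0, 0, 100, 100] for _ in range(4)]
blurred = [[0, 33, 67, 100] for _ in range(4)]
print(threshold_abs_gradient(sharp, 40), threshold_abs_gradient(blurred, 40))  # 400 0
```

With a threshold of 40, only the hard edge contributes to the threshold absolute gradient, so the sharp patch scores 400 and the ramp scores 0. Note that each function returns a raw sum whose scale depends on image size and content, which is why such values are only comparable within a sweep over the same scene.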
All these functions perform pixel level computations providing an instantaneous sharpness value for a given image or a region of interest (ROI) within an image. In order to determine a best focus position for an image or an ROI, a focus sweep must be executed so that the focus position indicating the highest sharpness can be chosen for acquiring an image. Performing such a focus sweep, including assessing each image to determine an optimal focus position, can involve a significant delay which is not acceptable, especially in image acquisition devices where the ability to acquire a snap-shot or to track an object in real-time is important.
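A focus sweep of the kind described here can be sketched as a simple search for the maximum. This is a minimal illustration, assuming a hypothetical `capture(position)` callable that moves the lens and returns an image, and a `sharpness(image)` callable such as one of the measures above; neither name comes from the source.

```python
def focus_sweep(capture, sharpness, positions):
    """Capture one image per focus position and return the sharpest position.

    `capture` and `sharpness` are hypothetical callables supplied by the caller.
    Returns a (best_position, best_score) pair.
    """
    best_position, best_score = None, float("-inf")
    for position in positions:
        image = capture(position)   # one full acquisition per position
        score = sharpness(image)    # relative value, only meaningful within this sweep
        if score > best_score:
            best_position, best_score = position, score
    return best_position, best_score


# Stand-in camera for illustration: the "image" is just the position, and
# sharpness peaks at position 30.
best = focus_sweep(lambda pos: pos, lambda img: -(img - 30) ** 2, range(1, 62))
print(best)  # (30, 0)
```

The loop makes the cost visible: every candidate position requires its own acquisition and evaluation before any decision can be made, which is the source of the delay noted above.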
None of these techniques provides an absolute sharpness value that indicates, from a single image alone, whether a region of interest is in focus and therefore whether a change in focus position might be beneficial in order to acquire a better image of a scene.
There are also other shortcomings of at least some of the above approaches. Referring to FIG. 1, the top row shows a sequence of images of the same face captured at distances from a camera ranging from 0.33 m to 2 m. The light level for capturing the images is similar, ranging from 3.5 Lux to 2.5 Lux.
Referring to the respective image/graph pairs below the top row of images, the face region from each acquired image in the top row is scaled to a common size; in this case, the upper half of the face region is chosen and scaled to provide a 200×100 pixel region of interest. For each of the scenes from the top row, the focus position of the camera lens is shifted by varying a code (DAC) for the lens actuator across its range of values, in this case from 1 to >61, and a sharpness measure is calculated for each position. (Use of such DAC codes is explained in PCT Application No. PCT/EP2015/061919 (Ref: FN-396-PCT) the disclosure of which is incorporated herein by reference.) In this example, a threshold absolute gradient contrast measure such as described above is used. Contrary to human perception, the sharpness measure across the range of focus positions provided for the most distant 2 m image is actually higher than for the largest well-lit face region acquired at 0.33 m. This is because the sharpness measures for the most distant image have been affected by noise.
Referring to FIG. 2, it will also be seen that in some of the above cases, the sharpness measures for an image taken across a range of focus positions, both on focused and on defocused images, in good light (20 Lux) can be smaller than for those taken in low light (2.5 Lux), contrary to human perception, again because of the influence of noise within the image.
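The effect of noise on a gradient-based measure can be reproduced with synthetic data. The sketch below is illustrative only and does not use the images of FIG. 1 or FIG. 2: it compares a noise-free patch containing one genuine edge with a featureless patch corrupted by Gaussian noise. The patch sizes, gray levels, noise level and threshold are all assumed values.

```python
import random


def threshold_abs_gradient(g, thr):
    """Sum |g(i, j+1) - g(i, j)| over neighbouring pixels whose difference exceeds thr."""
    total = 0
    for row in g:
        for j in range(len(row) - 1):
            diff = abs(row[j + 1] - row[j])
            if diff > thr:
                total += diff
    return total


def clean_edge_image(rows, cols, low, high):
    """Noise-free patch: left half at `low`, right half at `high`."""
    half = cols // 2
    return [[low] * half + [high] * (cols - half) for _ in range(rows)]


def flat_noisy_image(rows, cols, level, sigma, seed):
    """Featureless patch at `level` with additive Gaussian noise (seeded for repeatability)."""
    rng = random.Random(seed)
    return [[level + rng.gauss(0, sigma) for _ in range(cols)] for _ in range(rows)]


clean = clean_edge_image(20, 100, 100, 140)             # one real 40-level edge per row
noisy = flat_noisy_image(20, 100, 120, 10.0, seed=0)    # no structure, only noise

clean_score = threshold_abs_gradient(clean, 5)
noisy_score = threshold_abs_gradient(noisy, 5)
print(clean_score, noisy_score > clean_score)
```

The clean patch contributes only its single edge per row, whereas in the noisy patch most neighbouring pixel pairs exceed the threshold, so the structureless patch receives the far higher score. This is the same mechanism by which a noisy low-light image can outscore a cleaner one.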
A different approach, which doesn't use a reference and provides a quality measure based on an eye band region, is described in “No-Reference Image Quality Assessment for Facial Images” Debalina et al, pages 594-601, ICIC'11 Proceedings of the 7th International Conference on Advanced Intelligent Computing Theories and Applications. Debalina does not consider the behavior of this quality measure at various distances to a subject or in different lighting conditions. The method is also quite complex, involving k-means clustering to separate the eyes from the skin, binary template matching based on cross-correlation to detect the eyes, and Just Noticeable Blur (JNB) thresholds to compute the sharpness, and so it may not be suitable for a hardware implementation.
It is an object of the present invention to provide a sharpness metric which reflects human perception of the quality of a ROI within an image. The metric should be valid for varying light levels including very low light conditions where an acquired image may be quite noisy. The metric should be absolute so that a determination can be made directly from any given image whether it is sufficiently focused or not, i.e. the sharpness value for any well focused image should be higher than the sharpness value for any defocused image irrespective of an ambient luminance value.
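The requirement that the metric be absolute can be stated as a simple check: across all light levels, the lowest score of any focused image must exceed the highest score of any defocused image, so that one fixed threshold separates the two groups. The sketch below is only an illustration of that criterion; the function name and the example scores are hypothetical and are not measurements from the source.

```python
def separating_threshold(focused_scores, defocused_scores):
    """Return a single threshold separating focused from defocused scores, or None.

    A threshold exists only if every focused score is higher than every
    defocused score, which is the "absolute" property described above.
    """
    lowest_focused = min(focused_scores)
    highest_defocused = max(defocused_scores)
    if lowest_focused > highest_defocused:
        return (lowest_focused + highest_defocused) / 2
    return None


# Hypothetical scores pooled over several light levels.
print(separating_threshold([8, 9, 7], [3, 2, 4]))   # 5.5
print(separating_threshold([8, 3.5], [4, 2]))       # None
```

A metric with this property allows a single image to be classified as focused or not without a sweep, whereas a `None` result corresponds to the situation described for FIG. 1 and FIG. 2, where defocused or noisy images can outscore focused ones.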