Detecting and recognizing targets or objects of interest in a still image or video frame is a challenging problem for most security and monitoring applications, whether at airports, shipping ports, office complexes, or other public places. Among the available pattern recognition techniques, optical joint transform correlation (JTC) has been found to be a versatile tool for real-time applications. The JTC technique offers several advantages over other correlation techniques, such as the VanderLugt filter: it allows the reference image to be updated in real time, permits parallel Fourier transformation of the reference image and the input scene, operates at video frame rates, and eliminates the need to position a complex matched filter precisely in the Fourier plane.
However, the classical JTC technique suffers from poor correlation discrimination, wide sidelobes, a pair of correlation peaks for each object, and strong zero-order correlation terms, which often overshadow the desired cross-correlation peak. A number of modifications to the classical JTC design have been proposed, namely binary JTC, phase-only JTC, and fringe-adjusted JTC (FJTC). These may improve performance in some cases, but they do not yet yield sharp correlation peaks with high discrimination between target and non-target objects in the input image, and they do not yet operate reliably in noisy conditions.
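The two limitations named above (the strong zero-order term and the pair of peaks per object) can be reproduced in a small numerical simulation. The following is a minimal sketch of a classical JTC, not an implementation taken from this document: the function name `classical_jtc`, the plane size, the separation `offset`, and the random test image are all illustrative assumptions, and the optical Fourier transforms are approximated with discrete FFTs.

```python
import numpy as np


def classical_jtc(reference, scene, size=256, offset=32):
    """Simulate a classical JTC and return the centred correlation plane.

    Hypothetical helper: names and defaults are assumptions for illustration.
    `reference` and `scene` must have the same shape in this sketch.
    """
    h, w = reference.shape
    joint = np.zeros((size, size))
    top = size // 2 - h // 2
    left_ref = size // 2 - offset - w // 2
    left_scene = size // 2 + offset - w // 2
    # Joint input: reference to the left of centre, scene to the right.
    joint[top:top + h, left_ref:left_ref + w] = reference
    joint[top:top + h, left_scene:left_scene + w] = scene
    # Joint power spectrum: the intensity a square-law detector would record.
    jps = np.abs(np.fft.fft2(joint)) ** 2
    # A second (inverse) Fourier transform gives the correlation plane.
    plane = np.abs(np.fft.ifft2(jps))
    return np.fft.fftshift(plane)


rng = np.random.default_rng(0)
reference = rng.random((16, 16))
plane = classical_jtc(reference, reference)  # scene identical to reference

c = plane.shape[0] // 2
zero_order = plane[c, c]       # autocorrelation terms at the centre
peak_right = plane[c, c + 64]  # cross-correlation peak at +2 * offset
peak_left = plane[c, c - 64]   # mirrored peak at -2 * offset
print(zero_order / peak_right, peak_left / peak_right)
```

In this idealized, noise-free case with a scene identical to the reference, the zero-order term at the centre is the strongest value in the plane, and the target produces two equal cross-correlation peaks placed symmetrically about it, which is the behaviour the modified JTC designs try to suppress.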
A recently developed shifted phase-encoded fringe-adjusted JTC (SPFJTC) technique has been found to be efficient and successful, producing a single delta-function-like correlation peak with a high level of discrimination between the target and the non-targets or background, even in a noisy environment. A class-associative target detection system can be developed using the SPFJTC technique to recognize multiple reference objects simultaneously in the same input scene, and its processing architecture and parameters do not need to be adjusted for the number of members in the target class or the type of input scene. Although the technique works in noisy cases, it is not yet invariant to distortions in the input scene, such as illumination, scale, and rotation variations.