Feature detectors and detection methods have grown increasingly popular in the field of image processing and computer vision in recent years. Affine covariant feature detectors in particular have become popular due to their broad range of applications, such as object recognition and tracking, image reconstruction, geo-mosaicking, and many more. Compared to other image feature detectors, maximally stable extremal region (MSER) detectors perform well. A detected MSER (typically comprising one or more detected regions) is closed under continuous geometric transformations and is invariant to affine intensity changes. See J. Matas, O. Chum, M. Urban, and T. Pajdla, “Robust Wide Baseline Stereo from Maximally Stable Extremal Regions,” in Proc. British Machine Vision Conference BMVC, vol. 1, pp. 384-393, 2002, incorporated herein by reference, and hereinafter referred to as “Matas_2002.” Due to these desirable properties, MSER detectors have found application in a wide variety of computer vision problems, such as object searching in video.
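The invariance to affine intensity changes noted above can be illustrated with a minimal sketch. The code below is not the detector of Matas_2002; it only computes extremal regions (4-connected components of pixels at or below a threshold) on a small made-up image and checks that the same regions appear after an intensity change of the form gain * I + offset with a positive gain. The stability criterion that selects maximally stable regions is omitted, and the function names, the sample image, and the gain and offset values are all assumptions chosen for illustration.

```python
from collections import deque


def extremal_regions(img, threshold):
    """Return the 4-connected components of pixels with intensity <= threshold.

    Each component is a frozenset of (row, col) coordinates.
    """
    rows, cols = len(img), len(img[0])
    seen = set()
    regions = set()
    for r in range(rows):
        for c in range(cols):
            if (r, c) in seen or img[r][c] > threshold:
                continue
            # Breadth-first flood fill from an unvisited pixel below threshold.
            queue = deque([(r, c)])
            seen.add((r, c))
            component = []
            while queue:
                y, x = queue.popleft()
                component.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and (ny, nx) not in seen
                            and img[ny][nx] <= threshold):
                        seen.add((ny, nx))
                        queue.append((ny, nx))
            regions.add(frozenset(component))
    return regions


def affine_intensity(img, gain, offset):
    """Apply the intensity change gain * I + offset to every pixel."""
    return [[gain * v + offset for v in row] for row in img]


# Hypothetical 4x5 image: two dark blobs on a bright background.
image = [
    [9, 9, 9, 9, 9],
    [9, 1, 2, 9, 9],
    [9, 2, 1, 9, 3],
    [9, 9, 9, 9, 3],
]

gain, offset = 2, 10  # assumed values; gain must be positive
brighter = affine_intensity(image, gain, offset)

# Every threshold t on the original image corresponds to the threshold
# gain * t + offset on the transformed image, and yields the same regions.
for t in range(10):
    assert extremal_regions(image, t) == extremal_regions(brighter, gain * t + offset)
```

Because a positive-gain affine change preserves the ordering of pixel intensities, the family of regions obtained by sweeping the threshold is unchanged; only the threshold values at which the regions appear are shifted.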
Both video and image resolution have grown significantly over the years. Currently, videos and pictures from video cameras that each exceed 10 megapixels are common. Running computer vision and image processing methods on high-resolution videos and images requires more processing power. As a result, fast MSER algorithms have been developed, and implementations are known for a sequential processor (CPU) and for a field programmable gate array (FPGA).
Parallel processing has become popular because of the availability of parallel processors such as graphics processing units (GPUs), sometimes called video processing units (VPUs), that include up to thousands of (in the future, possibly even more) processing elements. The processing elements in a parallel processor are each capable of computation on sets of data in parallel, and are referred to as “arithmetic logic units,” “ALUs,” and “cores.” One reason general-purpose parallel processing using GPUs has become popular is the ease of programming GPUs, owing to the availability of a set of programming models and related programming tools for some GPUs that are usable in combinations of at least one parallel processor and at least one general purpose computing element. One example programming model is CUDA™, created by NVIDIA® and available for many NVIDIA-produced GPUs. Another example is OpenCL (Open Computing Language), an open (royalty-free) standard for general purpose parallel programming across CPUs, GPUs, and other processing devices. Both CUDA and OpenCL provide software developers with portable and efficient access to the power of these heterogeneous processing platforms. CUDA, for example, gives developers access to a virtual instruction set and the memory of the parallel processing elements in CUDA-supported GPUs, such that the GPUs become accessible for parallel computation to a connected CPU and to programming instructions running thereon. Unlike CPUs, however, GPUs have a parallel throughput architecture that emphasizes executing many threads concurrently.
The above-mentioned (CPU- or FPGA-implemented) fast MSER algorithms may not be a natural fit for implementation on parallel machines such as a GPU.