In various imaging applications, it is desirable to tessellate sub-images derived from a sequence of partial views in order to obtain a larger view than each sub-image can provide. In applications where two-dimensional image data are acquired via one imaging modality, and where depth information is obtained via a distinct imaging modality of comparable resolution to the first imaging modality, no one has ever considered the problem of how the spatially-resolved depth information of the distinct modalities could be combined to form an accurate three-dimensional representation. To address that problem, the present invention, as described below, is advanced.
Mosaicking techniques have been developed and applied in various contexts. Biomedical image mosaicking is reviewed in Pol et al., “Biomedical image mosaicking: A review,” in Communication and Computing Systems: Proc. Int. Conf. on Comm. and Computing Systems (ICCS-2016), (Prasad et al., ed.), pp. 155-57 (2017), which is incorporated herein by reference.
Despite general agreement that the thickness of the human tympanic membrane (TM) varies considerably across different regions of the TM, very limited data on the TM thickness distribution are available in the literature, all derived from ex vivo specimens. Yet, TM thickness distribution is one of the key parameters in mathematical modeling of middle-ear dynamics. Such models play a fundamental role not only in advancing our understanding of the hearing process but also in designing ear prostheses. In the absence of adequate data on TM thickness distribution, most mathematical models tend to make overly simplified assumptions regarding the thickness of the TM; in some cases, to the extreme of assuming a single thickness value across the entire membrane.
TM thickness also provides valuable information about the state and functioning of the middle ear, and is known to provide diagnostically useful information about several middle-ear pathologies. For example, the thickness of the TM in healthy human subjects has been shown to differ significantly from the thickness in subjects with acute and chronic otitis media, as reported by Monroy et al., “Non-invasive Depth-Resolved Optical Measurements of the Tympanic Membrane and Middle Ear for Differentiating Otitis Media,” The Laryngoscope, vol. 125, pp. E276-82 (2015), which is incorporated herein by reference. A reliable method of determining in vivo TM thickness distributions could, therefore, also enable a more comprehensive diagnosis of various otologic diseases.
Low-coherence interferometry (LCI) is a well-known optical coherence technique capable of measuring one-dimensional depth-resolved tissue structure with a typical resolution of several microns. However, combining depth-profile information obtained at multiple points of an irregular (and, potentially, moving) surface into a consistent three-dimensional image presents a problem that remains to be solved.
Most image mosaicking techniques begin with an assumption of a suitable motion model describing the alignment between a pair of images. The motion model is characterized by a 2-D transformation matrix, which describes the coordinate transformation from one image to the next. Once the motion model is chosen, the parameters of the model are estimated by following either an intensity-based approach or a feature-based approach, as taught by Szeliski, “Image alignment and stitching: A tutorial,” Foundations and Trends in Computer Graphics and Vision, vol. 2, pp. 1-104 (2006) (hereinafter “Szeliski 2006”), which is incorporated herein by reference.
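By way of illustration only, and not as a description of any particular method taught herein, a 2-D motion model may be sketched as a 3x3 homogeneous matrix acting on pixel coordinates. The sketch below assumes a similarity model (rotation, uniform scale, and translation); the function names similarity_matrix, apply_transform, and compose are hypothetical and chosen solely for this example.

```python
import math


def similarity_matrix(angle_rad, scale, tx, ty):
    """Build a 3x3 homogeneous matrix that rotates, scales, then translates."""
    c = scale * math.cos(angle_rad)
    s = scale * math.sin(angle_rad)
    return [[c, -s, tx],
            [s, c, ty],
            [0.0, 0.0, 1.0]]


def apply_transform(matrix, point):
    """Map an (x, y) coordinate in one image to the coordinate frame of the next."""
    x, y = point
    xh = matrix[0][0] * x + matrix[0][1] * y + matrix[0][2]
    yh = matrix[1][0] * x + matrix[1][1] * y + matrix[1][2]
    w = matrix[2][0] * x + matrix[2][1] * y + matrix[2][2]
    # Dividing by w keeps the function valid for projective models as well.
    return (xh / w, yh / w)


def compose(a, b):
    """Return the matrix product a @ b, i.e. apply b first and then a."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]
```

Chaining pairwise matrices with compose maps every sub-image into the coordinate frame of a single reference image, which is one common way a mosaic is assembled under such a model.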
In intensity-based methods, the model parameters are estimated by optimizing a suitable similarity metric representing the difference between pixel intensities of an image pair. Commonly used metrics include mean squared error, cross-correlation, or mutual information. Because the metric directly depends on pixel intensity values, these methods are sensitive to image deterioration resulting from various factors such as non-uniform illumination and defocus.
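The intensity-based approach may be illustrated, under simplifying assumptions, by the following sketch. Images are represented as flat lists of pixel intensities, the motion model is reduced to an integer shift along one axis, and the search is exhaustive; the names mean_squared_error, normalized_cross_correlation, and best_shift are hypothetical and serve only this example.

```python
import math


def mean_squared_error(a, b):
    """Average squared intensity difference; 0 means identical intensities."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def normalized_cross_correlation(a, b):
    """Zero-mean normalized cross-correlation in [-1, 1].

    Assumes neither input is constant (otherwise the denominator is zero).
    """
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    num = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    den = math.sqrt(sum((x - mean_a) ** 2 for x in a) *
                    sum((y - mean_b) ** 2 for y in b))
    return num / den


def best_shift(reference, moving, max_shift):
    """Exhaustively search integer shifts s and return the one minimizing the
    MSE between reference[i] and moving[i + s] over their overlap.

    Assumes max_shift is smaller than the signal length so the overlap is
    never empty.
    """
    best_s, best_err = None, None
    for s in range(-max_shift, max_shift + 1):
        idx = [i for i in range(len(reference)) if 0 <= i + s < len(moving)]
        err = mean_squared_error([reference[i] for i in idx],
                                 [moving[i + s] for i in idx])
        if best_err is None or err < best_err:
            best_s, best_err = s, err
    return best_s
```

In this sketch, a uniform change in gain or offset between the two signals raises the mean squared error while leaving the normalized cross-correlation unchanged, which suggests why the choice of metric matters when illumination varies between views.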
Feature-based techniques rely on matching landmark points between images. In these techniques, a set of matching image features such as edges, corners or other geometrical structures are first extracted from the images, and subsequently, the optimal image registration parameters are obtained by maximizing a similarity measure computed from the matched features. Some of the more popular feature matching methods include the Scale-Invariant Feature Transform (SIFT), as taught by Lowe, “Object recognition from local scale-invariant features,” Computer Vision, pp. 1150-57 (1999), and Speeded Up Robust Features (SURF), as taught by Bay et al., “SURF: Speeded Up Robust Features,” Computer Vision—ECCV 2006, pp. 404-17 (2006), both of which publications are incorporated herein by reference. Unlike intensity-based methods, feature-based methods do not directly depend on the actual pixel values in an image, but rather on image features, which makes these methods more robust to variations in image quality. The performance of feature-based methods, however, largely depends on reliable detection of matched image features, which is challenging in cases when the images lack sharp distinctive features.
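The parameter-estimation step of a feature-based approach may be sketched as follows, purely by way of example. The sketch assumes that pairs of matched landmark points are already available (for instance, from a detector such as SIFT or SURF, which is not implemented here) and that a similarity motion model is adequate. It uses the closed-form least-squares solution obtained by treating each point as a complex number; the names fit_similarity and similarity_params are hypothetical.

```python
import cmath


def fit_similarity(src_points, dst_points):
    """Least-squares fit of dst ~= a * src + b over matched (x, y) pairs.

    Points are treated as complex numbers x + iy. The complex coefficient a
    encodes rotation and uniform scale; b encodes translation. Assumes at
    least two distinct source points.
    """
    p = [complex(x, y) for x, y in src_points]
    q = [complex(x, y) for x, y in dst_points]
    n = len(p)
    p_mean = sum(p) / n
    q_mean = sum(q) / n
    # Closed-form minimizer of sum |q_i - a * p_i - b|^2 on centered points.
    num = sum((qi - q_mean) * (pi - p_mean).conjugate()
              for pi, qi in zip(p, q))
    den = sum(abs(pi - p_mean) ** 2 for pi in p)
    a = num / den
    b = q_mean - a * p_mean
    return a, b


def similarity_params(a, b):
    """Unpack (a, b) into scale, rotation angle in radians, and translation."""
    return abs(a), cmath.phase(a), b.real, b.imag
```

Because every matched pair contributes to the fit, incorrectly matched landmarks bias the estimate; practical systems therefore typically add an outlier-rejection stage, which is omitted from this sketch.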
Several image registration techniques have been reported in the biomedical literature, mostly for retinal imaging, as listed here and as incorporated herein by reference:
Can et al., “A feature-based, robust, hierarchical algorithm for registering pairs of images of the curved human retina,” Pattern Analysis and Machine Intelligence, IEEE Transactions on, vol. 24, pp. 347-64 (2002);
Yang et al., “Covariance-driven mosaic formation from sparsely overlapping image sets with application to retinal image mosaicking,” Proceedings of the 2004 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, (2004);
Chanwimaluang et al., “Hybrid retinal image registration,” IEEE Trans. Inf. Tech. in Biomedicine, vol. 10, pp. 129-42, (2006);
Jupeng et al., “A robust feature-based method for mosaic of the curved human color retinal images,” IEEE International Conf on Biomedical Engineering and Informatics, 2008, vol. 1, pp. 845-49, (2008); and
Li et al., “Automatic montage of SD-OCT data sets,” Opt. Exp., vol. 19, pp. 239-48 (2011).
The foregoing techniques, however, are not directly applicable to TM image mosaicking for two main reasons. First, unlike retinal images, which have several distinctive features such as the bifurcations and crossovers of the blood vessels, TM images predominantly contain large homogeneous, nonvascularized regions lacking in sharp features. Second, due to the specular nature of the TM, the spatial distribution of intensity, both within and between surface images of the TM, is widely heterogeneous, depending on the angle and distance of the imaging probe.
Accordingly, novel mosaicking techniques are required, and those are described herein with reference to a device that may advantageously provide direct, accurate three-dimensional images of a tympanic membrane.