The current invention relates generally to the visualization and processing of multidimensional data, and in particular, to data formed from a series of images.
Sophisticated analysis of imaging data requires software that can rapidly identify meaningful regions of the image. Depending on the size and number of regions, this process may require evaluating very large datasets, and thus efficient sorting of the data is essential for finding the desirable elements. In the present invention, regions of interest (ROIs) in previous feature-based imaging spectroscopy are extended to include pixel-based analyses. This requires new algorithms, since the size of a pixel-based analysis can be more than 1000 times larger than that of a feature-based analysis. In addition to requiring a burdensome amount of processing time, prior art sorting algorithms that may have been adequate to categorize and classify relatively noiseless feature data are not necessarily successful in sorting single-pixel spectra without additional parameters or human intervention.
In cases in which human intervention is advantageous, the present invention includes a means for combining machine and human intelligence to enhance image analysis. For example, the present invention provides a method for combining sorting by spectral criteria (e.g., intensity at a given wavelength) and sorting by temporal criteria (e.g., absorbance at a given time). Sorting enables the user to classify large amounts of data into meaningful and manageable groups according to defined criteria. The present invention also allows for multiple rounds of pixel or feature selection based on independent sorting criteria. Methods are presented for extracting useful information by combining the analyses of multiple datasets and datatypes (e.g., absorbance, fluorescence, or time), such as those obtained using the instruments and methods disclosed in U.S. Pat. Nos. 5,859,700 and 5,914,245, and in U.S. patent application Ser. No. 09/092,316.
The methods described herein are useful for a number of applications in biology, chemistry and medicine. Biomedical applications include high-throughput screening (e.g., pharmaceutical screening) and medical imaging and diagnostics (e.g., oximetry or retinal examination). Biological targets include live or dead biological cells (e.g., bacterial colonies or tissue samples), as well as cell extracts, DNA or protein samples, and the like. Sample formats for presenting the targets include microplates and other miniaturized assay plates, membranes, electrophoresis gels, microarrays, macroarrays, capillaries, beads and particles, gel microdroplets, microfluidic chips and other microchips, and compact discs. More generally, the methods of the present invention can be used for analysis of polymers, optical materials, electronic components, thin films, coatings, combinatorial chemical libraries, paper, food, packaging, textiles, water quality, mineralogy, printing and lithography, artwork, documents, remote sensing data, computer graphics and databases, or any other endeavor or field of study that generates multidimensional data.
The present invention provides methods, systems and computer programs for analyzing and visualizing multidimensional data. Typically, the first two dimensions are spatial and the third dimension is either spectral or temporal. (Although the term spectra or kinetics may be used herein, the methods described are of general applicability to both forms of vector data.) The invention includes a graphical user interface and method that allows for the analyses of multiple data types. For example, datastacks of fluorescence emission intensity, absorbance, reflectance and kinetics (changes in signal over time) can be analyzed either independently or on the same sample for the same field of view. Fluorescence measurements involving fluorescence resonance energy transfer (FRET) can also be analyzed. A key feature of the present invention is that data analysis can be performed in series. Thus, for example, the results of sorting pixels or features within one image stack can be applied to subsequent sorts within image stacks. The present invention also includes methods to prefilter data. Thus, for example, pixel-based analysis can be performed, wherein features are selected based on particular criteria and a subsequent sort is restricted to pixels that lie within the selected features. These sorting methods are guided by the heuristics of parameters input by the user. This is especially beneficial when expert knowledge is available. Thus, for example, the user can select a particular spectrum with desirable characteristics (a target spectrum) from a spectral stack, and the program will automatically classify all of the spectra obtained from the image stack by comparing each of the unclassified spectra to the target spectrum, calculating a distance measure, and sorting the spectra based on their distance measure. The classified (sorted) spectra are then displayed in the contour plot window or other plot windows.
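By way of illustration only, the target-spectrum classification described above can be sketched as follows. The sketch assumes that each pixel's spectrum is held as a list of intensities and that the distance measure is Euclidean; the text does not specify a particular distance measure, and the names `euclidean_distance` and `sort_by_target` are hypothetical.

```python
import math


def euclidean_distance(a, b):
    """Distance between two equal-length spectra (assumed measure)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def sort_by_target(spectra, target):
    """Return pixel positions ordered from most to least similar to target.

    spectra: dict mapping (row, col) -> list of intensities.
    target:  list of intensities chosen by the user.
    """
    scored = [(euclidean_distance(s, target), pixel)
              for pixel, s in spectra.items()]
    scored.sort()  # smallest distance first
    return [pixel for _, pixel in scored]


# Hypothetical 2x2 image with a three-point spectrum per pixel.
spectra = {
    (0, 0): [0.1, 0.9, 0.2],
    (0, 1): [0.8, 0.1, 0.1],
    (1, 0): [0.2, 0.8, 0.3],
    (1, 1): [0.5, 0.5, 0.5],
}
target = spectra[(0, 0)]          # user-selected target spectrum
ranked = sort_by_target(spectra, target)
print(ranked)
```

The resulting order is the order in which the sorted spectra would be stacked for display; any other distance measure could be substituted for `euclidean_distance` without changing the rest of the sketch.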
Sorting can also be used for sequentially analyzing images and graphical data, such that the pixels that are ultimately displayed are restricted by at least two independent criteria. For example, pixels or features that have been extracted based on selected spectral criteria (e.g., absorbance) can be further analyzed based on temporal criteria (e.g., kinetics). This method of combined analysis provides a means for rapidly and efficiently extracting useful information from massive amounts of data. A further embodiment of sequential sorting involves discarding unwanted data during the sorting process. This 'sort and lock' procedure provides a useful new tool for data compression. This method for sorting and displaying multidimensional data from an image stack comprises the steps of: (a) selecting a subset of pixels from an image by a first algorithm; (b) discarding the pixels that are not selected; (c) selecting a subset of the remaining pixels by a second sorting algorithm; and (d) automatically indicating the final selection of pixels by backcoloring the corresponding pixels in the image. This type of multidimensional analysis can also be performed by manipulating the contour plot window. The method comprises the steps of (a) sorting the pixels by a first algorithm; (b) automatically indicating on the contour plot pixels sorted by the first algorithm; (c) selecting a subset of pixels in the contour plot; (d) sorting the subset of pixels by applying a second algorithm; (e) selecting a reduced subset of pixels in the contour plot; and (f) automatically indicating the final selection of pixels by backcoloring the reduced subset of pixels in the image. The present invention also provides a method for displaying a grouping bar that can be used to analyze images and graphical data within the graphical user interface ("GUI").
The grouping bar enables the user to segregate groups of pixels or features within a contour plot, and thereby facilitates independent sorting and backcoloring of the individual groups of pixels or features in the image. The methods of the present invention are applicable to a variety of problems involving complex, multidimensional, or gigapixel imaging tasks, including (for example) automated screening of genetic libraries expressing enzyme variants.
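The four-step 'sort and lock' procedure described above (select, discard, select again, backcolor) can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of the invention: the per-pixel record layout, the two sorting keys, and the names `sort_and_lock` and `back_color` are all hypothetical.

```python
def sort_and_lock(pixels, first_key, keep_first, second_key, keep_second):
    """Two rounds of selection, discarding unselected pixels in between.

    pixels: dict mapping (row, col) -> per-pixel data record.
    Each key function maps a record to a number; higher ranks first.
    """
    # (a) select a subset of pixels by a first algorithm
    ranked = sorted(pixels, key=lambda p: first_key(pixels[p]), reverse=True)
    # (b) discard the pixels that are not selected
    locked = {p: pixels[p] for p in ranked[:keep_first]}
    # (c) select a subset of the remaining pixels by a second algorithm
    ranked = sorted(locked, key=lambda p: second_key(locked[p]), reverse=True)
    return ranked[:keep_second]


def back_color(shape, selected, mark=1):
    """(d) Build a mask image in which the selected pixels are marked."""
    rows, cols = shape
    chosen = set(selected)
    return [[mark if (r, c) in chosen else 0 for c in range(cols)]
            for r in range(rows)]


# Hypothetical data: a two-point spectrum and a two-point time course.
pixels = {
    (0, 0): {"spectrum": [0.1, 0.9], "kinetics": [0.0, 0.5]},
    (0, 1): {"spectrum": [0.2, 0.1], "kinetics": [0.0, 0.9]},
    (1, 0): {"spectrum": [0.1, 0.8], "kinetics": [0.0, 0.7]},
    (1, 1): {"spectrum": [0.3, 0.7], "kinetics": [0.0, 0.1]},
}
selected = sort_and_lock(
    pixels,
    first_key=lambda d: d["spectrum"][1],                      # spectral criterion
    keep_first=3,
    second_key=lambda d: d["kinetics"][-1] - d["kinetics"][0],  # temporal criterion
    keep_second=2,
)
mask = back_color((2, 2), selected)
print(selected, mask)
```

In this example pixel (0, 1) has the largest kinetic change but is discarded in the first round, so it cannot reappear in the second; that is the sense in which the first selection is locked and the data volume is reduced.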
According to one embodiment of the invention, a method for analyzing digital image data is provided, said method comprising (a) loading into a computer memory a plurality of data stacks wherein each data stack comprises pixel intensity data for a plurality of images, the pixel intensity data expressed as a function of: (i) pixel position, (ii) a first non-positional variable, and (iii) a second non-positional variable, wherein within a data stack, the value of the first non-positional variable is not constant and the value of the second non-positional variable is constant, and wherein between data stacks, the value of the second non-positional variable differs; (b) generating for a plurality of pixels within a first data stack, a plurality of first functions that relate pixel intensity to the first non-positional variable; (c) sorting the pixels within the first stack according to a first value obtained by applying a mathematical operation to the first functions generated for the plurality of pixels; (d) selecting a first set of sorted pixels; (e) generating for a plurality of pixels within the first set, a plurality of second functions that relate pixel intensity to the second non-positional variable; and (f) sorting the pixels within the first set according to a second value obtained by applying a second mathematical operation to the second functions generated for the plurality of pixels within the first set. The non-positional variables may be selected from a wide range of parameter types that indicate, e.g., the time at which the data were captured or a condition under which the data were captured, such as wavelength, temperature, pH, chemical activity (e.g., the concentration of an enzyme substrate, an enzyme inhibitor, a drug, or another chemical component), pressure, partial pressure of a gaseous chemical, or ionic strength.
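Steps (a) through (f) of this embodiment can be sketched as follows. The sketch makes several assumptions that are not taken from the text: the first non-positional variable is time, the second is temperature, each data stack is a dict keyed by pixel position, the second functions are built from the intensity at the final time point of each stack, and both mathematical operations are a simple rise (last value minus first). The function names are hypothetical.

```python
def rise(curve):
    """Assumed mathematical operation: last value minus first value."""
    return curve[-1] - curve[0]


def two_stage_sort(stacks, first_stack_key, first_op, n_keep, second_op):
    """stacks: dict mapping second-variable value -> {pixel: intensity list}.

    Within a stack the first variable (list index) varies and the second
    variable (the dict key) is constant; between stacks the key differs.
    """
    first_stack = stacks[first_stack_key]
    # (b) first functions: intensity vs. first variable, one per pixel
    # (c) sort pixels by a value computed from those functions
    first_values = {p: first_op(curve) for p, curve in first_stack.items()}
    ranked = sorted(first_values, key=first_values.get, reverse=True)
    # (d) select a first set of sorted pixels
    first_set = ranked[:n_keep]
    # (e) second functions: intensity vs. second variable, one per pixel,
    #     sampled here at the final first-variable point (an assumption)
    ordered_keys = sorted(stacks)
    second_values = {
        p: second_op([stacks[k][p][-1] for k in ordered_keys])
        for p in first_set
    }
    # (f) sort the first set by the second value
    return sorted(first_set, key=second_values.get, reverse=True)


# (a) hypothetical data stacks captured at 25 and 37 degrees
stacks = {
    25: {(0, 0): [0.0, 0.4], (0, 1): [0.0, 0.1], (1, 0): [0.0, 0.6]},
    37: {(0, 0): [0.0, 0.9], (0, 1): [0.0, 0.8], (1, 0): [0.0, 0.7]},
}
result = two_stage_sort(stacks, 25, rise, 2, rise)
print(result)
```

Pixel (0, 1) is excluded by the first sort on the 25-degree stack, and the remaining two pixels are then reordered by how strongly their final intensity depends on temperature.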
According to another embodiment, the invention provides a graphical user interface ("GUI") for display and analysis of digital image data comprising (a) a reference window for displaying a reference image comprising pixels; (b) a contour plot window for indicating pixel location along a first dimension, indicating a non-positional variable (such as, e.g., time, wavelength, temperature, pH, chemical activity, pressure, partial pressure of a gaseous chemical, or ionic strength, etc.) along a second dimension, and indicating pixel intensity by a variable signal appearing along the second dimension, said contour plot window further comprising (i) a grouping bar for grouping together pixels for analysis; and (ii) a selection bar for selecting pixels that are thereby indicated in the reference window and plotted in the plot window; and (c) a plot window for displaying a plot of pixel intensity as a function of the non-positional variable.