1. Field of the Invention
The subject matter described herein relates generally to labeling samples in an array.
2. Background of the Invention
An array is an ordered arrangement of any kind of subject matter. In the biological sciences, an array is generally a two-dimensional arrangement of samples placed upon a support structure. Samples can include, but are not limited to, nucleic acids, proteins, molecules, cells, tissues and any combination thereof. These arrays allow for the efficient and rapid processing of large numbers of samples, allowing laboratories to process thousands of samples a day. For example, a microplate configured with a four by four matrix of biosites or samples in each of the 96 wells of a microtiter plate would be able to perform a total of 1536 nearly simultaneous tests utilizing a proximal CCD imager. A microplate configured with a 15 by 15 matrix of samples in each of the 96 wells enables a total of 21,600 nearly simultaneous reactions to be processed.
Arrays of samples are used in many forms and for many areas of science, including, but not limited to, 96 well plates or slides for combinatorial chemistry, multi-well carriers for synthesizers, and papers, such as nitrocellulose and nylon, for hybridization reactions. Array technology can be used with almost any clinical or research protocol. For example, in screening libraries, the library, consisting of recombinant clones or molecules, can be placed in two-dimensional arrays on supports. Examples of supports include a microtiter plate or a microscope slide. Each clone or molecule can be identified by the identity of the plate and the clone or molecule location (row and column) on that plate. The arrayed libraries can then be used for many applications such as screening for a specific gene of interest or for identifying potential lead compounds for treating diseases. Arrays can also be used to diagnose diseases as well as synthesize nucleic acids, polypeptides, and chemical compounds. As another example, arrays of tissue can be arranged on a microscope slide to simultaneously test an experimental treatment for diseased tissues.
Current designs rely on the wells or dots being in predictable positions so they can be processed or read by robotic equipment. If, for any reason, the positioning is not as expected, a possible response by a computerized system is to shut down processing of the array. Augmenting a vision system with a method for accommodating irregular arrays would allow robotic systems of this type to recover from minor positioning errors.
Examples of arrays are shown in FIGS. 1 and 4. FIGS. 1 and 4 depict slides with samples 2 thereon. In the case where the samples are tissue sections, the tissue samples may be embedded in a block of paraffin. Successive slices of paraffin and tissue may then be mounted on the slide or series of slides. The slides with the samples, tissues in paraffin, can then be subjected to a battery of tests.
Once the slides of tissue are ready to be analyzed, the slides can be scanned to produce digitized images. The digitized images of the stained slides can then be automatically processed. However, before processing proceeds, it may be desirable to identify the regions of the image corresponding to each tissue sample. The digitized image may be analyzed to locate connected regions forming the tissue samples and their centroids. Once the tissue samples and/or their centroids are located, they can be assigned coordinates. Assigning coordinates to the samples may facilitate later data analysis and allow one to return to a tissue sample of interest. While the identification of the samples and/or their centroids can be performed prior to processing the samples in a desired protocol, identification of the samples and their centroids can also be done after the processing of the samples.
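By way of illustration only, locating connected regions and their centroids in a digitized image might be sketched as follows. The sketch assumes the image has already been thresholded into a binary mask and uses a 4-connected flood fill; the function name find_centroids, the mask layout, and the choice of connectivity are hypothetical and are not taken from the description above.

```python
from collections import deque


def find_centroids(mask):
    """Return (x, y) centroids of the 4-connected foreground regions of a binary mask.

    `mask` is a list of equal-length rows; truthy entries are foreground pixels.
    Regions are reported in raster-scan order of their first pixel.
    """
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    centroids = []
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c] or seen[r][c]:
                continue
            # Breadth-first flood fill over one connected region.
            queue = deque([(r, c)])
            seen[r][c] = True
            sum_x = sum_y = count = 0
            while queue:
                y, x = queue.popleft()
                sum_x += x
                sum_y += y
                count += 1
                for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    ny, nx = y + dy, x + dx
                    if (0 <= ny < rows and 0 <= nx < cols
                            and mask[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            # The centroid is the mean pixel position of the region.
            centroids.append((sum_x / count, sum_y / count))
    return centroids


# Hypothetical 3 x 5 mask holding two separate "samples".
MASK = [
    [1, 1, 0, 0, 0],
    [1, 1, 0, 0, 1],
    [0, 0, 0, 0, 1],
]
CENTROIDS = find_centroids(MASK)
```

Each centroid is an (x, y) pair in pixel units. Assigning row and column coordinates to these pairs is a separate step.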
In identifying and assigning coordinates to the samples, several image irregularities may hinder machine determination of the row and column coordinates of the centroids. Reasons for the irregularities range from human error to shearing of the paraffin during handling. The irregularities make it difficult to know with certainty the correct labeling in the areas of these irregularities.
Even when a guide is employed, the regularity with which the tissue can be placed into the paraffin may be less than perfect. The tissue sections may be subject to deformation during the transfer to the slide, and some samples may fail to adhere to the slide. In cases where samples in liquid are applied to a support of the array, the liquid dispenser may inadvertently omit samples, for example, by failing to pick up the liquid sample, or may introduce extraneous “samples,” for example, when the liquid sample inadvertently drips from the liquid dispenser. Stray marks may also be introduced by mishandling the slide. Moreover, rows and/or columns of samples, or portions thereof, may be intentionally omitted to separate different groups of samples.
The digitization process introduces further noise to challenge the row and column identification. The noise level may be sufficiently high that a simple round to the nearest integer in
ncols*X/Xmax  (1)
does not reveal the centroid's column coordinate, where X is the x coordinate of the centroid in question, Xmax is the greatest x coordinate of all the centroids, and ncols is the number of columns on the slide.
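As a hedged illustration of why rounding expression (1) can fail, the following sketch applies it to a hypothetical row of four evenly spaced centroids, and then to the same row with one stray mark added to the right of the last column. The coordinate values and the helper name naive_column are assumptions made for the example only.

```python
def naive_column(x, x_max, ncols):
    """Round ncols * X / Xmax to the nearest integer, per expression (1)."""
    return round(ncols * x / x_max)


NCOLS = 4

# Hypothetical, perfectly regular centroid x coordinates (arbitrary units).
CLEAN_X = [100.0, 200.0, 300.0, 400.0]
CLEAN_COLS = [naive_column(x, max(CLEAN_X), NCOLS) for x in CLEAN_X]

# The same row plus one stray mark at x = 460, which inflates Xmax.
NOISY_X = CLEAN_X + [460.0]
NOISY_COLS = [naive_column(x, max(NOISY_X), NCOLS) for x in NOISY_X]
```

With the regular data, each centroid receives its own column number, 1 through 4. With the stray mark present, Xmax grows from 400 to 460, so the centroids at x = 300 and x = 400 both round to column 3, and the stray mark itself is labeled column 4.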
FIGS. 1-6 illustrate some problems in identifying the coordinates of the centroids. FIGS. 2 and 3 and FIGS. 5 and 6 are exploded views of exemplary problem regions in FIGS. 1 and 4, respectively. An array of (x,y) pairs represents the position of the centroids of the tissue samples on the slides. FIG. 1 appears to be fairly regular to the eye with well-defined rows and columns, but zooming into the array illustrates problem areas. FIG. 2 illustrates problems due to missing samples and deviation of samples from straight columns. FIG. 3 illustrates two different problems: two samples lie in what seems to be a missing row (third row from the top), and the correspondence between the bottom rows in the left half and the right half of the image is not clear.
The problem of sparse data is evident in various regions of FIG. 4, which are shown in exploded views in FIGS. 5 and 6. In FIGS. 5 and 6, it is difficult to determine the row and column numbers of the samples, especially when using an algorithm based on local structure.