This invention relates to imaging for biochemical analysis and more particularly to methods and systems for imaging high-density biochemical arrays used in high-throughput genome sequencing.
High-throughput analysis of chemical and/or biological species is an important tool in the fields of diagnostics and therapeutics. Biochemical arrays allow multiple biochemical experiments to be performed in parallel. This ability accrues from the development of techniques to perform each experiment in a small volume and to pack the experiments closely together. Arrays of attached chemical and/or biological species on a substrate can be designed to define specific target sequences, analyze gene expression patterns, identify specific allelic variations, determine copy number of DNA sequences and identify, on a genome-wide basis, binding sites for proteins (e.g., transcription factors and other regulatory molecules). In a specific example, the advent of the human genome project required that improved methods for sequencing nucleic acids, such as DNA (deoxyribonucleic acid) and RNA (ribonucleic acid), be developed. Determination of the entire 3,000,000,000 base sequence of the haploid human genome has provided a foundation for identifying the genetic basis of numerous diseases. However, a great deal of work remains to be done to identify the genetic variations associated with a statistically significant number of human genomes, and improved high-throughput methods for analysis can aid greatly in this endeavor.
The high-throughput analytical approaches conventionally utilize assay devices, known as flow cells, that contain arrays of chemical and/or biological species for analysis. The biological species are typically tagged with multiple fluorescent colors that can be read with an imaging system.
Due to the sheer volume of data to be observed, captured and analyzed, a critical factor in genome sequencing analysis is the throughput of the assaying instrument. Throughput has a direct impact on cost. While imaging systems are capable of capturing a large amount of data as compared to other technologies, the throughput of such systems is limited by camera speed and the number of pixels per spot. Camera speed is limited by inherent physical limitations, and the smallest number of pixels per spot is one. While it is desirable to reduce the number of pixels per spot to a minimum, there are typically many pixels per spot in practical instruments.
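By way of illustration only, the dependence of throughput on camera speed and pixels per spot can be sketched as a simple ratio. The function name, the simplification that every pixel read out images some spot, and the numeric values below are assumptions made for this example and are not taken from the disclosure.

```python
def spots_per_second(camera_pixels_per_second: float, pixels_per_spot: float) -> float:
    """Estimate how many spots an imaging system can read per second.

    Hypothetical model: throughput is the camera's pixel rate divided by the
    number of pixels spent on each spot. Real instruments also lose time to
    stage motion, focusing, and exposure, which this sketch ignores.
    """
    if pixels_per_spot < 1:
        # The text notes that the smallest number of pixels per spot is one.
        raise ValueError("pixels_per_spot cannot be less than one")
    return camera_pixels_per_second / pixels_per_spot


# Illustrative numbers only: a camera delivering 1e8 pixels per second.
camera_rate = 1e8
for pixels in (1, 4, 16):
    print(pixels, "pixels/spot:", spots_per_second(camera_rate, pixels), "spots/s")
```

Under this simplified model, reducing the number of pixels per spot from sixteen to one raises the spot rate sixteen-fold for the same camera, which is why a low pixel count per spot is desirable.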
Images captured in pixels from light emitted from spots associated with attachment sites on a substrate must be aligned and registered in order to be analyzable. The conventional registration technology, which involves registration marks and guides on the substrate, requires space on the substrate, reducing the number of sites available for analysis and thus the volume of analysis per unit time.
Several different approaches to DNA chips are under development. In one approach a combinatorial array of DNA fragments is created on a chip and these are used for sequencing by hybridization. In another, DNA is randomly arrayed on a surface for the same purpose. One research group is trying to use arrays of DNA polymerase to observe sequencing base by base. Still another research group uses self-assembled DNA nanoarrays interrogated by combinatorial probe-anchor ligation. Although these approaches are quite different from one another, especially in their biochemical details, they all depend on fluorescence imaging techniques to literally “see” the data generated by individual experiments in an array.
Fluorescence imaging is used to identify DNA bases—A, C, G, or T—by designing biochemical reactions such that a different colored dye (for example, red, green, blue, or yellow) corresponds to each one. One may then observe a DNA experiment with a fluorescence microscope. The color observed indicates the DNA base at that particular step. Extracting data from a DNA chip thus depends on recording the color of fluorescence emitted by many millions or even billions of biochemical experiments on a chip.
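As a minimal sketch of the idea that the observed color indicates the base at each step, the example below picks the brightest of four color channels for a spot and looks up the corresponding base. The particular dye-to-base assignment, the function names, and the intensity values are hypothetical; the text gives the four colors only as examples and does not say which color maps to which base.

```python
from typing import Dict, List

# Assumed assignment for illustration only; the source does not specify it.
DYE_TO_BASE: Dict[str, str] = {"red": "A", "green": "C", "blue": "G", "yellow": "T"}


def call_base(intensities: Dict[str, float]) -> str:
    """Return the base whose dye channel is brightest for one spot at one step."""
    brightest = max(intensities, key=lambda color: intensities[color])
    return DYE_TO_BASE[brightest]


def call_read(steps: List[Dict[str, float]]) -> str:
    """Concatenate the per-step base calls for one spot into a sequence."""
    return "".join(call_base(step) for step in steps)


# Made-up intensities for a single spot observed over three steps.
example_steps = [
    {"red": 0.9, "green": 0.1, "blue": 0.05, "yellow": 0.2},
    {"red": 0.1, "green": 0.2, "blue": 0.8, "yellow": 0.1},
    {"red": 0.2, "green": 0.1, "blue": 0.1, "yellow": 0.7},
]
print(call_read(example_steps))
```

A practical instrument would apply this kind of decision to every spot in every image, after the images have been aligned and corrected for background and crosstalk between channels; those steps are omitted here.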
The practice of the techniques described herein may employ, unless otherwise indicated, conventional techniques and descriptions of organic chemistry, polymer technology, molecular biology (including recombinant techniques), cell biology, biochemistry, and sequencing technology, which are within the skill of those who practice in the art. Such conventional techniques include polymer array synthesis, hybridization and ligation of polynucleotides, and detection of hybridization using a label. Specific illustrations of suitable techniques can be had by reference to the examples herein. However, other equivalent conventional procedures can, of course, also be used. Such conventional techniques and descriptions can be found in standard laboratory manuals such as Green, et al., Eds. (1999), Genome Analysis: A Laboratory Manual Series (Vols. I-IV); Weiner, Gabriel, Stephens, Eds. (2007), Genetic Variation: A Laboratory Manual; Dieffenbach, Dveksler, Eds. (2003), PCR Primer: A Laboratory Manual; Bowtell and Sambrook (2003), DNA Microarrays: A Molecular Cloning Manual; Mount (2004), Bioinformatics: Sequence and Genome Analysis; Sambrook and Russell (2006), Condensed Protocols from Molecular Cloning: A Laboratory Manual; and Sambrook and Russell (2002), Molecular Cloning: A Laboratory Manual (all from Cold Spring Harbor Laboratory Press); Stryer, L. (1995) Biochemistry (4th Ed.) W.H. Freeman, New York N.Y.; Gait, “Oligonucleotide Synthesis: A Practical Approach” 1984, IRL Press, London; Nelson and Cox (2000), Lehninger, Principles of Biochemistry 3rd Ed., W. H. Freeman Pub., New York, N.Y.; and Berg et al. (2002) Biochemistry, 5th Ed., W.H. Freeman Pub., New York, N.Y., all of which are herein incorporated in their entirety by reference for all purposes.
As used herein and in the appended claims, the singular forms “a,” “an,” and “the” include plural referents unless the context clearly dictates otherwise. Thus, for example, reference to “a channel” refers to one or more channels available on an assay substrate, and reference to “the method” includes reference to equivalent steps and methods known to those skilled in the art, and so forth.
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. All publications mentioned herein are incorporated by reference for the purpose of describing and disclosing devices, formulations and methodologies that may be used in connection with the presently described invention.
Where a range of values is provided, it is understood that each intervening value between the upper and lower limits of that range, and any other stated or intervening value in that stated range, is encompassed within the invention. The upper and lower limits of these smaller ranges may independently be included in the smaller ranges and are also encompassed within the invention, subject to any specifically excluded limit in the stated range. Where the stated range includes one or both of the limits, ranges excluding either or both of those included limits are also included in the invention.
In the following description, numerous specific details are set forth to provide a more thorough understanding of the present invention. However, it will be apparent to one of skill in the art upon reading the present disclosure that the present invention may be practiced without one or more of these specific details. In other instances, features and procedures well known to those skilled in the art have not been described in order to avoid obscuring the invention.