This invention relates in general to a method for molecular fingerprinting. The method can be used for forensic identification (e.g. DNA fingerprinting, especially by VNTR), bacterial typing, and human/animal pathogen diagnosis. More particularly, molecules such as polynucleotides (e.g. DNA) can be assessed or sorted by size in a microfabricated device that analyzes the polynucleotides according to restriction fragment length polymorphism. In a microfabricated device according to the invention, DNA fragments or other molecules can be rapidly and accurately typed using relatively small samples, by measuring for example the signal of an optically-detectable (e.g., fluorescent) reporter associated with the polynucleotide fragments.
More generally, the invention relates to a method of analyzing or sorting molecules such as polynucleotides (e.g., DNA) by size or some other characteristic. In particular, the invention relates to a method of analyzing and/or sorting individual polynucleotide molecules in a microfabricated device by measuring the signal of an optically-detectable (e.g., fluorescent, ultraviolet, radioactive or color change) reporter associated with the molecules. These methods and devices can also be adapted to analyze or sort cells or particles.
The devices and methods of the invention are advantageous, particularly in comparison with conventional gel electrophoresis techniques. For example, the invention provides less costly and more rapid equipment, can use smaller molecular samples, is less labor-intensive and is more readily automated. The invention is also advantageously flexible. Additional functions can be incorporated into the design as desired, such as in-line digestion, separation, etc.
When DNA is broken into fragments using restriction enzymes, each of which cuts the DNA in a known way, the resulting DNA fragments or polynucleotides of different sizes produce a unique pattern or profile which can be used to uniquely identify the source of the DNA molecules. In the invention, a reporter or other measurable signal varies as a function of molecule size, and in this way profiles based on size can be efficiently generated and compared, particularly on a small scale and in an automated or semi-automated fashion.
Methods enabling the matching of unidentified tissue samples to specific individuals have wide application in many fields. For DNA fingerprinting, commonly used methods include RFLP analysis (53, 54), variable number tandem repeats (55), and microsatellites (56). With the possible exception of monozygotic twins, each individual in the human population has a unique genetic composition which can be used to specifically identify that individual. This phenomenon has allowed law enforcement officials to use DNA sequence variation to determine, for example, whether a forensic sample was derived from any given individual. The fields of forensic and medical serology, paternity testing, and tissue and sample origin have seen increasing use of such techniques, including the forensic and diagnostic use of DNA sequence variation, e.g., statistical evaluations based on satellite sequences and variable number of tandem repeats (VNTRs) or amplified fragment length polymorphisms (AMP-FLPs). These methods are being used in crime laboratories, courts, hospitals and research and testing labs. Inclusion probabilities stated by the laboratories performing the analyses in such cases often exceed 1:1,000,000. That is, only one individual in one million is predicted, on a statistical basis, to have a given DNA "fingerprint" obtained by analyzing a pattern of DNA fragments generated according to these techniques.
The first implementation of DNA typing in forensics was Jeffreys' use of a multilocus DNA probe "fingerprint" that identified a suspect in a murder case in England. (55) In the United States, DNA profiling has been established using a battery of unlinked highly polymorphic single locus VNTR probes. (57) The use of these batteries of probes permits the development of a composite DNA profile for an individual. These profiles can be compared to databases, for example using the principles of Hardy-Weinberg to determine the probability of a match between a suspect and an unknown forensic sample.
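By way of illustration only, the Hardy-Weinberg comparison mentioned above can be sketched as follows. Under Hardy-Weinberg equilibrium a homozygous genotype is expected at frequency p^2 and a heterozygous genotype at 2pq, and for unlinked loci the per-locus frequencies are multiplied. The function names and the allele frequencies below are hypothetical and are not taken from any database or from the cited references.

```python
from math import prod

def genotype_frequency(p, q, homozygous):
    """Expected genotype frequency at one locus under Hardy-Weinberg equilibrium."""
    return p * p if homozygous else 2.0 * p * q

def profile_frequency(loci):
    """Multiply per-locus genotype frequencies (assumes unlinked, independent loci)."""
    return prod(genotype_frequency(p, q, hom) for p, q, hom in loci)

# Hypothetical allele frequencies for a four-locus composite profile.
# Each entry is (frequency of allele 1, frequency of allele 2, is_homozygous).
example_profile = [
    (0.10, 0.05, False),
    (0.08, 0.08, True),
    (0.12, 0.03, False),
    (0.05, 0.07, False),
]

freq = profile_frequency(example_profile)
print(f"Expected profile frequency: {freq:.3e} (about 1 in {1 / freq:,.0f})")
```

Adding further unlinked loci multiplies in additional factors below one, which is why a battery of probes can yield very small random-match probabilities.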
Although these methods have markedly improved the power of the forensic and medical scientists to distinguish between individuals, they suffer from a number of shortcomings including a lack of sensitivity, the absence of internal controls, expense, time intensity, relatively large sample size, an inability to perform precise allele (gene pair) identification, and problems with identifying degraded DNA samples.
For example, the most frequently used method for forensic identification is the "Southern" hybridization technique, which has been widely used in forensic identification and medical diagnosis. Also called a "Southern blot," this technique treats an extracted molecule (a DNA sample) with a restriction endonuclease, an enzyme that cuts a polynucleotide chain wherever a specific and relatively short sequence of nucleic acids in the chain occurs. Examples of well known restriction enzymes used in this way are the endonucleases HaeIII, EcoRI, HpaI and HindIII. In DNA fingerprinting, restriction sites are typically used to isolate VNTRs (variable number of tandem repeats), which are regions in which a short sequence of DNA has been repeated a number of times. The number of repeating units within these regions varies between individuals, and when cut with a restriction endonuclease these regions yield multiple fragments of different sizes, called RFLPs (restriction fragment length polymorphisms). These fragments can be used as a "fingerprint" because they vary in number and size from one individual to another.
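The relationship between repeat number and fragment size can be illustrated with a minimal sketch. The recognition sequences and cut positions listed are the commonly documented ones for the four enzymes named above; the locus layout, the repeat unit and the function names are invented for illustration, and only one strand of a linear molecule is modeled.

```python
import re

# Recognition sequence and cut offset (bases from the start of the site).
ENZYMES = {
    "EcoRI": ("GAATTC", 1),    # G^AATTC
    "HindIII": ("AAGCTT", 1),  # A^AGCTT
    "HaeIII": ("GGCC", 2),     # GG^CC
    "HpaI": ("GTTAAC", 3),     # GTT^AAC
}

def digest(seq, enzyme):
    """Return fragment lengths after cutting one strand of a linear sequence."""
    site, offset = ENZYMES[enzyme]
    # Lookahead so that overlapping sites would also be found.
    cuts = [m.start() + offset for m in re.finditer(f"(?={site})", seq)]
    bounds = [0] + cuts + [len(seq)]
    return [b - a for a, b in zip(bounds, bounds[1:])]

def vntr_allele(repeats):
    """Hypothetical locus: flank, EcoRI site, tandem repeats, EcoRI site, flank."""
    unit = "ACGTTGCA"  # made-up 8-base repeat unit
    return "TTTT" + "GAATTC" + unit * repeats + "GAATTC" + "CCCC"

for n in (3, 5):
    print(n, "repeats ->", digest(vntr_allele(n), "EcoRI"))
```

In this toy locus only the middle fragment changes with the repeat count, which is the length difference that a sizing method would need to resolve.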
The resulting nucleotide fragments (i.e. the RFLPs) are separated by size via gel electrophoresis, in which different sized charged molecules are separated by their different rates of movement through a stationary gel under the influence of an electric current. Following electrophoresis, the separated nucleotides are denatured and transferred to the surface of a nylon membrane by blotting; the so-called "Southern Blot". The Southern Blot is then incubated in a solution containing a radioactive single locus probe under conditions of temperature and salt concentration that favor hybridization. (A single locus probe is also called a "primer.") The locations of radioactive probe hybridization on the Southern Blot are detected and recorded via X-ray film or some other detection technique, thus providing a "profile" of the nucleotide. (Hybridization is used to pull out VNTR fragments, i.e. to separate them from irrelevant fragments.) In this approach, sample DNA is digested, and the resulting fragments are separated by size using gel electrophoresis. The separated fragments are transferred to a membrane by blotting, and are subjected to primer hybridization. (58)
This technique is time-consuming and labor intensive, and the gel may have limited resolving power, which can make the results difficult to interpret. Another disadvantage is that these techniques generally require the use of a polymerase chain reaction (PCR) to multiply the polynucleotide in the sample. That is, the conventional tests are not very sensitive, and require relatively large DNA samples which often are not available. In such cases the sample concentration is increased to a meaningful detectable level by PCR. While this addresses some problems of sensitivity and sample degradation, PCR has been open to challenge because of possible sample contamination, and consequent undesirable amplification of contaminants leading to unreliable results. PCR approaches are also difficult to multiplex. For example, the probes and primers must be chosen with care, and generally only one set can be used. The sample may be consumed by one round of PCR, and different sets of probes or primers may require different reaction conditions, such as temperature. A simpler, more powerful technique is needed, which can accommodate small samples, does not rely on PCR, and which makes use of the most recent advances in DNA technology.
As described herein, the invention addresses these problems. In preferred embodiments a DNA sample is digested, primers are used to extend specifically desired DNA regions (e.g. VNTRs), without successive rounds of PCR, and highly sensitive or specific reporter molecules, such as fluorescently-labeled single nucleotides, are used to efficiently determine the length of the resulting DNA. A microfabricated or microfluidic device may be used to implement these techniques, for example to separate and optically detect labeled fragments.
The identification and separation of nucleic acid fragments by size, such as in sequencing of DNA or RNA, is a widely used technique in many fields, including molecular biology, biotechnology, and medical diagnostics. The most frequently used method for such separation is gel electrophoresis, in which different sized charged molecules are separated by their different rates of movement through a stationary gel under the influence of an electric current. Gel electrophoresis presents several disadvantages, however. The process can be time consuming, and resolution is typically about 10%. Efficiency and resolution decrease as the size of fragments increases; molecules larger than 40,000 base pairs are difficult to process, and those larger than 10 million base pairs cannot be distinguished.
Methods have been proposed for determination of the size of nucleic acid molecules based on the level of fluorescence emitted from molecules treated with a fluorescent dye. See Keller, et al., 1995 (31); Goodwin, et al., 1993 (28); Castro, et al., 1993 (27); and Quake, et al., 1999 (59). Castro (27) describes the detection of individual molecules in samples containing either uniformly sized (48 Kbp) DNA molecules or a predetermined 1:1 ratio of molecules of two different sizes (48 Kbp and 24 Kbp). A resolution of approximately 12-15% was achieved between these two sizes. There is no discussion of sorting or isolating the differently sized molecules.
In order to provide a small diameter sample stream, Castro (27) uses a "sheath flow" technique wherein a sheath fluid hydrodynamically focuses the sample stream from 100 μm to 20 μm. This method requires that the radiation exciting the dye molecules, and the emitted fluorescence, must traverse the sheath fluid, leading to poor light collection efficiency and resolution problems caused by lack of uniformity. Specifically, this method results in a relatively poor signal-to-noise ratio of the collected fluorescence, leading to inaccuracies in the sizing of the DNA molecules.
Goodwin (28) mentions the sorting of fluorescently stained DNA molecules by flow cytometry. This method, however, employs costly and cumbersome equipment, and requires atomization of the nucleic acid solution into droplets, with the requirement that each droplet contains at most one analyte molecule. Furthermore, the flow velocities required for successful sorting of DNA fragments were determined to be considerably slower than used in conventional flow cytometry, so the method would require adaptations to conventional equipment. Sorting a usable amount (e.g., 100 ng) of DNA using such equipment would take weeks, if not months, for a single run, and would generate inordinately large volumes of DNA solution requiring additional concentration and/or precipitation steps.
Quake (59) relates to a single molecule sizing microfabricated device (SMS) for sorting polynucleotides or particles by size, charge or other identifying characteristics, for example, characteristics that can be optically detected. The invention includes a fluorescence activated sorter (FAS), and methods for analyzing and sorting polynucleotides by measuring a signal produced by an optically-detectable (e.g., fluorescent, ultraviolet or color change) reporter associated with the molecules. These methods and microfabricated devices allow for high sensitivity, no cross-contamination, and lower cost than conventional gel techniques. In one embodiment of the invention, it has been discovered that devices of this kind can be advantageously designed for use in molecular fingerprinting applications, such as DNA fingerprinting.
It is thus desirable to provide a method of rapidly analyzing and sorting differently sized nucleic acid molecules with high resolution, using simple and inexpensive equipment. In a microfabricated system, a short optical path length is desirable to reduce distortion and improve signal-to-noise of detected radiation. Ideally, sorting of fragments can be carried out using any size-based criteria.
The invention provides a molecular fingerprinting method and system, including for example microfabricated devices for sorting reporter-labeled polynucleotides or polynucleotide molecules by size.
An object of the present invention is a method for DNA fingerprinting using synthetic repeat polymorphisms.
An additional object of the present invention is a method for identifying the source of DNA in a forensic or medical sample.
A further object of the present invention is to provide an automated DNA profiling assay. This can be used, for example, for DNA mapping, e.g. of BAC or YAC libraries.
An additional object of the present invention is to provide a kit for detecting synthetic repeat polymorphisms.
In accomplishing these and other objectives, the invention provides a method for molecular fingerprinting using a synthetic version of restriction fragment length polymorphism. The method includes choosing at random a short (20-50 bp) sequence of the polynucleotide that is a fixed distance away from a restriction site. This can be repeated any number of times for enhanced statistical discrimination, with different locations in the polynucleotide and different distances to a restriction site. Thus, a unique set of fragments can be generated, resulting in a fingerprint that can be obtained without relying on naturally occurring repeat sequences or restriction sites.
The method also provides for identification of a fingerprint in a sample. To identify a fingerprinted polynucleotide in a sample, an oligonucleotide (i.e. a short polynucleotide probe) is synthesized to complement the randomly chosen sequences. The probes are mixed with the sample along with nucleotide triphosphates and polymerase. The nucleotides can be fluorescently labeled. Through this technique a set of fluorescent strands of polynucleotide will be synthesized. Each complementary strand is cut with restriction enzymes to yield a polynucleotide of a fixed length. The polynucleotides can then be sized, either by gel electrophoresis or in a single molecule sizing device (SMS). One oligonucleotide probe derived from a reference sample can be used, resulting in one complementary strand in a test sample containing matching sequences. If multiple oligonucleotides are designed, the reaction can be multiplexed and the different length fragments can be resolved into a multiple fragment fingerprint that can be compared to the standard or reference fingerprint. Preferably, a digestion is performed before enzyme/primer extension to prevent non-specific binding of primers. A six-base cutter (digestion enzyme) is particularly preferred to cut the sample into fragments of tens of thousands of base pairs. Alternatively, digestion after extension to fix the length can be performed.
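The probe-selection and fragment-length logic described above can be sketched in simplified form. This is an illustrative model only: it treats the sequences as single strands of text, ignores strand orientation and hybridization chemistry, and uses an invented reference sequence. The function names, the 40-base distance and the choice of HindIII (A^AGCTT) are assumptions made for the example; in the method as described, the location and distance would be chosen at random.

```python
def find_sites(seq, site):
    """Start indices of every occurrence of a recognition sequence."""
    return [i for i in range(len(seq) - len(site) + 1) if seq.startswith(site, i)]

def design_probe(reference, site, distance, probe_len=20):
    """Pick a probe whose first base lies `distance` bases upstream of a site."""
    sites = find_sites(reference, site)
    for s in sites:
        start = s - distance
        if start < 0 or distance < probe_len:
            continue
        if any(start <= other < s for other in sites):
            continue  # a nearer site would truncate the fragment
        return reference[start:start + probe_len]
    return None

def expected_fragment(sample, probe, site, cut_offset):
    """Length from the probe's first base to the next downstream cut, else None."""
    start = sample.find(probe)
    if start < 0:
        return None
    downstream = [s for s in find_sites(sample, site) if s >= start + len(probe)]
    if not downstream:
        return None
    return downstream[0] + cut_offset - start

# Invented 70-base reference containing a single HindIII site.
REFERENCE = (
    "GATTACAGGCTCTAGACCGT"
    "TTGACCATGCGTAACGGTCA"
    "CCGTATGGACTTCAGGTACC"
    "AAGCTT" "GGGG"
)

probe = design_probe(REFERENCE, "AAGCTT", distance=40)
print(probe, expected_fragment(REFERENCE, probe, "AAGCTT", cut_offset=1))
```

A sample lacking the probe sequence returns no fragment, and repeating the design with several distances would give the set of fragment lengths that makes up a multiplexed fingerprint.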
A number of variations and modifications to this technique will be apparent to the practitioner of ordinary skill. For example, instead of using labeled nucleotides, complementary polynucleotides can be post-stained with an intercalating dye. Another variation is to use affinity purification to pull down the fragment of interest, i.e., using biotinylated oligonucleotides and streptavidin coated magnetic beads.
In a preferred embodiment, a microfabricated device is used for detecting or sorting the nucleotide fragments in a fingerprint based on size. The SMS device is fast, allowing analysis in as little as 10 minutes, and requires only femtograms of material. Thus, the SMS device provides relatively high sensitivity without the need for PCR.
Microfabricated Device. The device includes a chip having a substrate with at least one microfabricated analysis unit. Each analysis unit includes a main channel, having at one end a sample inlet, having along its length a detection region, and having, adjacent and downstream of the detection region, an outlet or a branch point discrimination region leading to a plurality of branch channels originating at the discrimination region and in communication with the main channel. The analysis unit also provides a stream of solution, preferably continuous, containing the molecules and passing through the detection region, such that on average only one molecule occupies the detection region at any given time. The level of reporter from each molecule is measured as it passes within the detection region. If desired, the molecule is directed to a selected branch channel based on the level of reporter.
In a preferred embodiment, the substrate is planar, and contains a microfluidic chip made from a silicone elastomer impression of an etched silicon wafer according to replica methods in soft-lithography (11). In one embodiment, the channels meet to form a "T" (T junction). A Y-shaped junction, and other shapes and geometries may also be used. A detection region is typically upstream from the branch point. Molecules or cells are diverted into one or another outlet channel based on a predetermined characteristic that is evaluated as each molecule or cell passes through the detection region. The channels are preferably sealed to contain the flow, for example by fixing a transparent coverslip, such as glass, over the chip, to cover the channels while permitting optical examination of one or more channels or regions, particularly the detection region. In a preferred embodiment the coverslip is Pyrex, anodically bonded to the chip.
Other devices such as electrophoresis chips may also be used. Exemplary devices are described in U.S. Pat. Nos. 6,042,709; 5,965,001; 5,948,227; 5,880,690; and 6,007,690.
Channel Dimensions. The channels in a molecular analysis device are preferably between about 1 μm and about 20 μm in width and between about 1 μm and about 20 μm in depth, and the detection region has a volume of between about 1 fl and about 1 pl. In a cell analysis device the channels are preferably between about 1 and 500 microns in width and between about 1 and 500 microns in depth, and the detection region has a volume of between about 1 fl and 100 nl. In preferred embodiments, the device includes a transparent (e.g., glass) cover slip bonded to the substrate and covering the channels to form the roof of the channels. The channels may be of any dimensions suitable to accommodate the largest dimension of the molecules to be analyzed.
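As a quick check on how the stated channel dimensions relate to the stated detection volumes, note that one cubic micrometer equals one femtoliter. The sketch below assumes a rectangular detection region; the length of the region along the channel is not specified above and is a hypothetical parameter here, as is the function name.

```python
def detection_volume_fl(width_um, depth_um, length_um):
    """Volume of a rectangular detection region in femtoliters (1 um^3 = 1 fl)."""
    return width_um * depth_um * length_um

# Hypothetical geometries within the stated width and depth ranges.
for dims in [(1, 1, 1), (5, 5, 10), (10, 10, 10)]:
    vol_fl = detection_volume_fl(*dims)
    print(f"{dims[0]} x {dims[1]} x {dims[2]} um -> {vol_fl} fl ({vol_fl / 1000} pl)")
```

A 1 μm cube corresponds to the lower bound of about 1 fl, and a 10 μm cube to the upper bound of about 1 pl.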
Manifolds. A device which contains a plurality of analysis units may further include a plurality of manifolds, the number of such manifolds typically being equal to the number of branch channels in one analysis unit, to facilitate collection of molecules from corresponding branch channels of the different analysis units.
Flow of molecules. In one embodiment, the molecules are directed or sorted by electroosmotic force. A pair of electrodes applies an electric field or gradient across the discrimination region that is effective to move the flow of molecules through the device. In a sorting embodiment the electrodes can be switched to direct a particular molecule into a selected branch channel based on the amount of reporter signal detected from that molecule. In another embodiment, a flow of molecules is maintained through the device via a pump or pressure differential, and a valve structure can be used at the branch point effective to permit each molecule to enter only one selected branch channel. Alternatively, a valve can be placed in one or more channels downstream of the branch point to allow or curtail flow through each channel. In a related embodiment, pressure can be adjusted at the outlet of each branch channel effective to allow or curtail flow through the channel.
Optical Detection. Preferably the molecules are optically detectable when passing through the detection region. For example the molecules may be labeled with a reporter, for example a fluorescent reporter. The optically detectable signal can be measured, and generally is proportional to or is a function of a characteristic of the molecules, such as size or molecular weight. A fluorescent reporter, generating a quantitative optical signal can be used. Fluorescent reporters are known, and can be associated with molecules such as polynucleotides using known techniques.
In a preferred molecular fingerprinting embodiment, the reporter label is a fluorescently-labeled single nucleotide, such as fluorescein-dNTP, rhodamine-dNTP, Cy3-dNTP, Cy5-dNTP, where dNTP represents dATP, dTTP, dUTP or dCTP. The reporter can also be a chemically-modified single nucleotide, such as biotin-dNTP. Alternatively, chemicals can be used that will react with an attached functional group such as biotin.
Sorting Molecules. In another aspect, the invention includes a method of isolating polynucleotides having a selected size. The method includes: a) flowing a continuous stream of solution containing reporter-labeled polynucleotides through a channel comprising a detection region having a selected volume, where the concentration of the molecules in the solution is such that the molecules pass through the detection region one-by-one, b) determining the size of each molecule as it passes through the detection region by measuring the level of the reporter, and c) in the continuous stream of solution, diverting (i) molecules having the selected size into a first branch channel, and (ii) molecules not having the selected size into a second branch channel. Polynucleotides diverted into any channel can be collected as desired.
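The diverting step can be pictured as a simple per-molecule decision. The sketch below is a software illustration under stated assumptions, not a description of the device's control electronics: the branch labels, the reporter levels and the selection window are all hypothetical.

```python
from collections import Counter

def choose_branch(reporter_level, window):
    """Return 'selected' when the reporter level falls inside the window."""
    low, high = window
    return "selected" if low <= reporter_level <= high else "other"

def sort_stream(levels, window):
    """Assign each one-by-one measurement to a branch and tally the result."""
    return Counter(choose_branch(level, window) for level in levels)

# Hypothetical reporter levels measured as molecules pass the detection region.
levels = [120.0, 980.0, 1010.0, 2500.0, 995.0]
print(sort_stream(levels, window=(900.0, 1100.0)))
```

Because each molecule is measured individually, the same decision function could be swapped for a threshold test or any other size-based criterion.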
Flow Control. In preferred embodiments, the concentration of polynucleotides in the solution is between about 10 fM and about 1 nM and the detection region volume is between about 1 fl and about 1 pl. The molecules can be diverted, for example, by transient application of an electric field effective to bias (i) a molecule having the selected size (e.g., between about 100 bp and about 10 Mb) to enter one branch channel, and (ii) a molecule not having the selected size to enter another branch channel. Alternatively, molecules can be directed into a selected channel, based on size, by temporarily blocking the flow in other channels, such that the continuous stream of solution carries the molecule having the selected size into the selected channel. Pumps and valves may also be used to divert flow, and carry molecules into one channel or another, and mechanical switches may also be used. These methods can also be used in combination, and likewise molecules can be diverted based on whether they have a selected property or size, or do not have that property or size, or exceed or do not exceed a selected threshold measurement.
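The interplay between concentration and detection volume can be estimated with a short calculation. The mean number of molecules in the detection region is the molar concentration multiplied by Avogadro's number and the volume in liters. The sketch additionally assumes that molecules are distributed independently, so that occupancy follows a Poisson distribution; that statistical model and the function names are assumptions of this example.

```python
import math

AVOGADRO = 6.02214076e23  # molecules per mole

def mean_occupancy(conc_molar, volume_liters):
    """Average number of molecules present in the detection region."""
    return conc_molar * AVOGADRO * volume_liters

def prob_multiple(mean):
    """Poisson probability of two or more molecules in the region at once."""
    return 1.0 - math.exp(-mean) * (1.0 + mean)

# Corners of the stated ranges: 10 fM to 1 nM, and 1 fl to 1 pl.
for conc, vol in [(10e-15, 1e-15), (10e-15, 1e-12), (1e-9, 1e-15), (1e-9, 1e-12)]:
    lam = mean_occupancy(conc, vol)
    print(f"{conc:.0e} M, {vol:.0e} L: mean {lam:.3g}, P(>=2) {prob_multiple(lam):.3g}")
```

Under these assumptions, 1 nM in a 1 fl region gives a mean occupancy of roughly 0.6, whereas 1 nM in a 1 pl region gives several hundred, which suggests that the concentration and the detection volume would be chosen together so that molecules pass one at a time.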
Synchronization. In each embodiment where molecules are measured and then diverted, as opposed to being measured only, the molecules are detected and measured one-by-one within the detection region, and are diverted one-by-one into the appropriate channels, by coordinating or synchronizing the diversion of flow with the detection step and with the flow entering the detection region, as described for example in more detail below. In certain embodiments the flow rate may be adjusted, for example delayed, to maintain efficient detection and switching, and as described below the flow may in some cases be temporarily reversed to improve accuracy.
Sizing Molecules. In yet another aspect, the invention includes a method of sizing polynucleotides in solution. This method includes: a) flowing a continuous stream of solution containing reporter-labeled polynucleotides through a microfabricated channel comprising a detection region having a selected volume, where the concentration of the molecules in the solution is such that most molecules pass through the detection region one by one, and b) determining the size of each molecule as it passes through the detection region by measuring the level of the reporter.
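One simple way to convert a measured reporter level into a size, assuming the signal is directly proportional to length, is to calibrate against standards of known size. The sketch below fits a least-squares slope through the origin; the proportional model, the standards and the function names are illustrative assumptions, and other functional relationships between signal and size may apply.

```python
def fit_slope(standards):
    """Least-squares slope through the origin for (size_bp, signal) pairs."""
    numerator = sum(size * signal for size, signal in standards)
    denominator = sum(size * size for size, _ in standards)
    return numerator / denominator

def size_from_signal(signal, slope):
    """Estimate fragment size in base pairs from a single reporter measurement."""
    return signal / slope

# Hypothetical calibration standards: (size in bp, measured reporter level).
standards = [(10000, 2000.0), (20000, 4000.0), (48000, 9600.0)]
slope = fit_slope(standards)

for signal in (1000.0, 6500.0):
    print(f"signal {signal} -> about {size_from_signal(signal, slope):.0f} bp")
```

Collecting the estimated sizes of many molecules into a histogram would then give the size profile of the sample.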
Multiparameter Embodiments. In addition to analyzing or sorting fluorescent and non-fluorescent nucleotide fragments, the SMS can also provide multiparameter analysis. For example, sizing or sorting can be done according to a window or threshold value, meaning that molecules (e.g. polynucleotides) are selected based on the presence of a signal above or below a certain value or threshold. There can also be several points of analysis on the same chip for multiple time course measurements.
Thus, the invention provides for the rapid and accurate determination of the "profile" of a polynucleotide in high resolution using minimal amounts of material in these simple and inexpensive microfabricated devices. The methods and devices of the invention can replace or be used in combination with conventional gel based approaches.