Traditionally, the molecular weight distribution of a sample of particles has been determined by measuring the rate at which particles which are subjected to a perturbing force move through an appropriate medium, e.g., a medium which causes the particles to separate according to size. A mathematical relationship is calculated which relates the size of particles and their migration rate through a medium when a specified force is applied.
Sedimentation is a well-known technique for measuring particle size, but, when applied to polymers, this method is limited to molecules with a maximum size of about 50-100 kilobases (kb). Attempting to measure larger molecules by this technique would probably result in underestimation of molecular size, mainly because the sedimentation coefficient is sensitive to centrifuge speed (see Kavenoff et al., Cold Spring Harbor Symp. Quantit. Biol. 38:1 (1974)).
Another popular method of separating polymer particles by size is gel electrophoresis (see, e.g., Freifelder, Physical Biochemistry, W. H. Freeman (1976)), which is particularly useful for separating restriction digests. In brief, application of an electric field to an agarose or polyacrylamide gel in which polymer particles are dissolved causes the smaller particles to migrate through the gel at a faster rate than the larger particles. The molecular weight of the polymer in each band is calibrated by comparing the migration rate of an unknown substance with the mobility of polymer fragments of known length. The amount of polymer in each band can be estimated based upon the width and/or color intensity (optical density) of the stained band. However, this type of estimate is usually not very accurate.
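By way of illustration only, the calibration against fragments of known length can be sketched as an interpolation between size markers. The sketch below assumes that the logarithm of fragment length varies linearly with migration distance between adjacent markers, which is a common working approximation rather than a relationship stated in the references above. The function name `estimate_size_bp` and the marker values are hypothetical.

```python
import math

def estimate_size_bp(distance_mm, markers):
    """Estimate fragment length from migration distance.

    Interpolates log10(length) linearly between the two markers whose
    migration distances bracket the unknown band (assumed approximation).
    Markers are (distance_mm, length_bp) pairs with distinct distances.
    """
    pts = sorted(markers, key=lambda m: m[0])  # order by migration distance
    for (d0, s0), (d1, s1) in zip(pts, pts[1:]):
        if d0 <= distance_mm <= d1:
            t = (distance_mm - d0) / (d1 - d0)
            log_size = math.log10(s0) + t * (math.log10(s1) - math.log10(s0))
            return 10 ** log_size
    raise ValueError("distance lies outside the calibrated marker range")

# Hypothetical marker ladder: (migration distance in mm, length in bp)
ladder = [(10.0, 10000), (20.0, 1000), (30.0, 100)]
size = estimate_size_bp(15.0, ladder)
```

With this invented ladder, a band midway between the 10,000 bp and 1,000 bp markers is assigned a length of roughly 3,160 bp. A band outside the marker range cannot be sized at all, which is why the range of available markers limits the method.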
Pulsed field electrophoresis, developed by the present inventor and described in U.S. Pat. No. 4,473,452, which is hereby entirely incorporated herein by reference, is an electrophoretic technique in which the separation of large DNA molecules in a gel is improved relative to separation using conventional electrophoresis. According to this technique, deliberately alternated electric fields are used to separate particles, rather than the continuous fields used in previously known electrophoretic methods. More particularly, particles are separated using electric fields of equal strength which are transverse to each other, which alternate between high and low intensities out of phase with each other at a frequency related to the mass of the particles. The forces move the particles in an overall direction transverse to the respective directions of the fields. It should be noted here that the term "transverse" as used herein is not limited to an angle of, or close to, 90°, but includes other substantial angles of intersection.
One of the most significant problems with determining the weight of molecules by indirect measurement techniques, such as those described above, is that the parameters which are directly measured, e.g., migration rate, are relatively insensitive to small differences in molecular size. Thus, a precise determination of particle size distribution is difficult to obtain. The lack of precision may particularly be a problem when biological polymer samples, which tend to be unstable and contain single molecules inches in length, are involved.
While some of the known methods of determining particle size distribution in a polydisperse sample provide better resolution than others, few, if any, of the previously known techniques provide resolution as high as is needed to distinguish between particles of nearly identical size. Gel permeation chromatography and sedimentation provide resolution of only about M^(1/2) (M = molecular weight). Standard agarose gel electrophoresis and polyacrylamide gel electrophoresis provide resolution varying as -log M. Pulsed electrophoretic techniques are effective for separating extraordinarily large molecules, but do not provide much better resolution than standard electrophoresis. Thus, distinguishing between particles of similar size, for example, particles differing in length by a fraction of a percent, is inaccurate and problematic using the above-described measurement techniques.
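The practical effect of these size dependences can be illustrated numerically. The sketch below assumes that "resolution" refers to how the directly measured parameter scales with M, so that a parameter varying as M raised to some exponent changes by a factor of (1 + e) raised to that exponent when M changes by a fraction e. The function names are hypothetical and the 1% size difference is chosen only as an example.

```python
import math

def power_law_response(exponent, fractional_size_diff):
    """Fractional change in a measured quantity scaling as M**exponent
    when M changes by fractional_size_diff (e.g. 0.01 for 1%)."""
    return (1.0 + fractional_size_diff) ** exponent - 1.0

def log_response(fractional_size_diff):
    """Absolute shift in a measured quantity scaling as -log10(M)
    when M changes by fractional_size_diff."""
    return -math.log10(1.0 + fractional_size_diff)

# Two molecules differing in size by 1%
half = power_law_response(0.5, 0.01)   # M^(1/2) dependence
shift = log_response(0.01)             # -log M dependence
```

Under these assumptions, a 1% difference in molecular weight changes an M^(1/2)-dependent measurement by only about 0.5%, and shifts a -log M-dependent measurement by about 0.004 log units, which shows why particles of nearly identical size are hard to distinguish by such measurements.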
Particles of higher mass (i.e., up to approximately 600 kb) can be resolved using conventional gel electrophoresis by reducing the gel (e.g., polyacrylamide) concentration to as low as 0.035% and reducing field strength. However, there are also problems with this method. Most notably, the dramatic reduction in gel concentration results in a gel which is mechanically unstable, and less sample can be loaded. An electrophoretic run to resolve very large DNA molecules using a reduced gel concentration and field strength may take a week or more to complete. Furthermore, a reduced gel concentration is not useful to separate molecules in a sample having a wide range of particle sizes, because separation of small molecules is not achieved. Thus, if a sample containing molecules having a wide range of sizes is to be separated, several electrophoretic runs may be needed, e.g., first, a separation of the larger molecules and then further separation of the smaller molecules.
Other particle measurement techniques known in the art are useful for sizing certain molecules which are present in a bulk sample (e.g., the largest molecules in the sample, or the average molecular size), but are impractical for measuring many polymers of varying length in a given sample. The viscoelastic recoil technique (see Kavenoff et al., "Chromosome-sized DNA molecules from Drosophila," Chromosoma 41:1 (1973)), which is well known in the art, involves stretching out coiled molecules in a solvent flow field (e.g., a field which is created when fluid is perturbed between two moving plates) and determining the time required for the largest molecule to return to a relaxed state. Relaxation time is measured by watching the rotation of a concentric rotor which moves during the time of relaxation.
While this technique is quite precise in that sample determinations vary as M^1.66 when applied to large DNA molecules, it is not useful for sizing molecules other than the largest molecule in the sample.
Using light scattering techniques which are known in the art (e.g., quasi-elastic light scattering), the size and shape of particles are determined by a Zimm plot, a data analysis method which is known in the art. With these techniques, size dependence varies as M^1. Light scattering requires that the solution in which the molecules to be measured are placed be pure, that is, without dust or any other contamination, and it is therefore unsuitable for sizing a DNA sample. Furthermore, it is not useful for sizing molecules as large as many DNA molecules, and is useful only for determining the average weight of particles in a sample, not the weight distribution of a sample with particles of various sizes.
Yet another particle measuring technique which is known in the art for measuring individual molecules provides measurements of particle size having limited accuracy. The average size and shape of individual, relaxed DNA molecules have been determined by observing the molecules under a fluorescence microscope and measuring the major and minor axes of molecules having a spherical or ellipsoid shape (see Yanagida et al., Cold Spring Harbor Symp. Quantit. Biol. 47:177 (1983)). This technique is performed in a free solution, without perturbation of the molecules.
The movement of small DNA molecules during electrophoresis has been observed (see Smith et al., Science 243:203 (1989)). The methods disclosed in this publication are not suitable for observation of very large DNA molecules, and techniques for measuring molecules are not discussed.
Practical weight determinations of particles such as polymer molecules depend not only upon maximizing the size dependencies of the directly measured parameters, but also upon factors such as the amount of sample needed, the time required to complete an analysis, and the accuracy of measurements. Gel permeation chromatography can be time-consuming and requires a large amount of sample. Methods such as conventional gel electrophoresis can be relatively time-consuming, require moderate amounts of sample, and cannot size very large DNA molecules.
Molecular sizing is a fundamental operation that touches virtually every aspect of genomic analysis, from DNA sequencing to size measurements of lower eucaryotic chromosomal DNAs. Molecular size, given in kilobases, can be translated into centimorgans for many organisms, and vice versa, and gel electrophoresis is generally used to determine these sizes. The basics of nucleic acid sizing technology, as practiced by the typical molecular biologist, have not changed very much in the past decade. This is understandable considering the simplicity of gel electrophoresis and its capacity for parallel processing of multiple samples. The data obtained from gels are readily interpretable. Given the size of most genes, gel electrophoresis techniques adapt well to their analysis. From characterization of restriction digests to discernment of one-base differences in sequencing ladders, gel electrophoresis is the method of choice for size analysis of DNAs. Even the outcome of PCR reactions (Mullis, Methods in Enzymol. 155:335-350 (1987)) is frequently monitored by sizing analysis. Pulsed gel electrophoresis extends this coverage even further to include chromosomal DNAs from lower eucaryotes (Schwartz, Cold Spring Harbor Symp. Quant. Biol. 47:89 (1983); Schwartz, Cell 37:67 (1984); Carle, Nucleic Acids Res. 12:5647 (1984); Chu, Science 234:1582 (1986); Clark, et al., Science 241:1203 (1988)). Because pulsed electrophoresis can resolve very large DNA molecules, its application has simplified the mapping of large genomes and provided a necessary tool for creating large YACs (yeast artificial chromosomes) (Barlow, et al., Trends in Genetics 3:167-177 (1987); Campbell, et al., Proc. Natl. Acad. Sci. 88:5744 (1991)). However, pulsed electrophoresis was developed more than 10 years ago (Schwartz, Cold Spring Harbor Symp. Quant. Biol. 47:89 (1983) and Schwartz, Cell 37:67 (1984)).
The surprising lack of significant sizing advances contrasts with the progress made in understanding the molecular mechanisms of conventional and pulsed electrophoresis (Zimm, Quart. Rev. Biophys. 25:171 (1992); Deutsch, Science 240:992 (1988)).
Although molecular size determination has not advanced significantly in this decade, another aspect of genomic analysis, DNA detection technology, has progressed to a remarkable extent. These developments have affected gel-based methodologies as well as the field of cytogenetics. A driving force has been the Human Genome Initiative and its goals to characterize the human genome and the genomes of model organisms by extensive mapping and sequencing. The new goals aimed at analyzing entire large mammalian genomes include increasing the accuracy and throughput of DNA mapping and sequencing. The first round of needed advances has come in part from a combination of sophisticated image processing methods (Glazer, Nature 359:859 (1992); Quesada, BioTechniques 10:616 (1991); Mathies, Nature 359:167 (1992)), new DNA detection techniques and new DNA labeling/imaging systems (Glazer, Proc. Natl. Acad. Sci. 87:3851 (1990); Beck, Nucleic Acids Res. 17:5115 (1989)). Automation of gel electrophoresis based technologies demands clear, relatively unambiguous detection systems for operator-free function (Lehrach, Cold Spring Harbor Press, Cold Spring Harbor, N.Y., pp. 39-81 (1989); Larin, Proc. Natl. Acad. Sci. 88:4123 (1991)). Sophisticated computational methods can extract usable data automatically from difficult conditions. A good example of a fully integrated approach to mapping comes from the Cohen laboratory, which has combined all of these technological approaches together with "mega-YACs" (Bellanne-Chantelot, et al., Cell 70:1059 (1992)) to boost output dramatically, although with problems inherent in the fidelity of these YACs (Anderson, Science 259:1684 (1993)).
Construction of physical maps for eucaryotic chromosomes is laborious and difficult, in part because many of the current methodologies for mapping and sequencing DNA were originally designed to analyze genes rather than genomes, so that at present there is a premium on automating procedures such as PCR and blot hybridizations (Chumakov, Nature 359:380 (1992)). Two techniques have played a fundamental role in the process of ordering and sizing DNA sequences from eucaryotic chromosomes. Electrophoretic methods have the advantage of good size resolution, even for long chains, but require DNA in bulk amounts. Sources include genomic DNA or YACs (Burke, Science 236:806 (1987)). Single molecule techniques, such as fluorescence in situ hybridization (FISH), utilize only a limited number of chromosomes (Manuelidis, J. Cell. Biol. 95:619-625 (1982)) but have not yet attained a sizing capability comparable to that of pulsed electrophoresis. Ideally, one would like to be able to combine the sizing power of electrophoresis with the intrinsic loci ordering capability of FISH in order to construct accurate restriction maps very rapidly.
All considered, the evolution of various physical and genetic techniques has enabled far more to be accomplished than expected toward creation of a complete physical map of whole chromosomes and the entire human genome (Bellanne-Chantelot, et al., Cell 70:1059 (1992); Chumakov, et al., Science (1992); Mandel, et al., Science 258:103 (1992)). Despite this progress, the situation can be improved in the following areas.
(i) For fingerprinting YACs, chromosomal DNA is digested with several enzymes and then blotted and sometimes hybridized with several different repetitive sequences (Bellanne-Chantelot, et al., Cell 70:1059 (1992); Stallings, et al., Proc. Natl. Acad. Sci. 87:6218 (1990); Ross, et al., Techniques for the Analysis of Complex Genomes, Academic Press, Inc., San Diego, Calif. (1992)). Here, electrophoresis is used to size restriction fragments that are specifically identified by hybridization. The data density available for such an analysis is relatively low. For example, it is difficult to discern more than 100 bands in a given lane in a typical agarose gel. Additionally, restriction fragments that are the same size cannot be resolved from each other and can only be discerned by careful, differential hybridization. Therefore, the fingerprint does not report nearly as much information as would result if an ordered restriction map were made with the same enzyme(s), or even an accurate histogram of the size population. Such a histogram can only be obtained from gels by difficult measurements of band fluorescence intensities.
(ii) Gels are time-consuming. It takes time and care to pour gels and minutes to days to run them, and Southern analysis can take several days. Although gels offer the opportunity for parallel sample analysis, and multiplexing techniques (Church, et al., Science 240:185 (1988)) probably maximize this tremendous ability, sizing results are often difficult to digitize and to tabulate automatically.
(iii) Electrophoretic size resolution for commonly run agarose gels rarely exceeds mass, although under limited conditions greater size resolution can be obtained (Calladine, Journal of Molecular Biology 221:981 (1991)). Greater size resolution would enable simpler fingerprints with a higher information content. Although pulsed electrophoresis techniques can, under certain circumstances, boost size resolution, these results can be hard to interpret except in very narrow size ranges. Ultimately, these measured sizes are dependent on size markers which are limited in range for very large DNA molecules. For pulsed electrophoresis, the determined size is frequently inadequately interpolated between several size markers.
(iv) Usable sensitivity is limited to the subpicogram range except by exotic techniques (Glazer, et al., Nature 359:859 (1992); Quesada, BioTechniques 10:616 (1991)). However, now-common phosphor imager systems have improved sensitivity somewhat and make quantitation easier. The usable sensitivity range will dictate the type of sample that can be analyzed. For example, single-copy mammalian genomic hybridizations can be challenging to a novice. Mapping of end-labeled partial digests of genomic DNAs is often not successful because of the attendant loss of sensitivity (Smith, et al., Nucleic Acids Res. 3:2387 (1976)), so that extensive analysis is difficult to do with genomic DNA samples. This necessitates reliance on cloned genomic material, despite its limitations, including problems with uncloneable regions, rearrangements and deletions. Although YACs enable cloning of such large genomic fragments and have served as the basis for many mapping approaches, they are not perfect mapping reagents and must therefore be used with great caution (Anderson, Science 259:1684 (1993); Vollrath, et al., Science 258:52 (1992); Foote, et al., Science 258:60 (1992)).
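The fingerprinting limitation noted above, namely that fragments of the same or nearly the same size appear as a single band, can be illustrated with a short sketch. It assumes, purely for illustration, that two fragments co-migrate when their sizes differ by less than a fixed relative amount. The function name `group_into_bands`, the 5% threshold and the fragment sizes are all hypothetical.

```python
def group_into_bands(fragment_sizes_kb, relative_resolution):
    """Group fragments into bands as a gel lane might show them.

    A fragment joins the current band when its size is within
    relative_resolution of the smallest fragment already in that band
    (an assumed, simplified model of co-migration).
    """
    bands = []
    for size in sorted(fragment_sizes_kb):
        if bands and size <= bands[-1][0] * (1.0 + relative_resolution):
            bands[-1].append(size)   # unresolved: merges into existing band
        else:
            bands.append([size])     # resolved: starts a new band
    return bands

# Hypothetical digest containing six restriction fragments (sizes in kb)
fragments = [2.0, 2.0, 2.05, 4.5, 9.0, 9.1]
bands = group_into_bands(fragments, relative_resolution=0.05)
fragments_per_band = [len(b) for b in bands]
```

In this invented example, six fragments yield only three bands. The number of fragments in each band, which is what a size histogram would report, is not available from band positions alone and on a gel would have to be inferred from band intensities.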
Citation of documents herein is not intended as an admission that any of the documents cited herein is pertinent prior art, or an admission that the cited documents are considered material to the patentability of the claims of the present application. All statements as to the date or representations as to the contents of these documents are based on the information available to the applicant and do not constitute any admission as to the correctness of the dates or contents of these documents.