This invention relates to methods and compositions for manipulating and characterizing individual polymer molecules, especially nucleic acid molecules, according to, for example, size and/or nucleotide sequence.
The analysis of nucleic acid molecules at the genome level is an extremely complex endeavor which requires accurate, rapid characterization of large numbers of often very large nucleic acid molecules via high throughput DNA mapping and sequencing. The construction of physical maps, and ultimately of nucleotide sequences, for eukaryotic chromosomes currently remains laborious and difficult. This is due, in part, to the fact that current procedures for mapping and sequencing DNA were originally designed to analyze nucleic acid at the gene, rather than at the genome, level (Chumakov, I. et al., 1992, Nature 359:380; Maier, E. et al., 1992, Nat. Genet. 1:273).
Traditionally, the separation of nucleic acid molecules and the determination of their molecular weight distribution have been accomplished, most commonly, via gel electrophoresis (see, for example, Freifelder, 1976, Physical Biochemistry, W. H. Freeman), which involves moving a population of molecules through an appropriate medium, such that the molecules are separated according to size. Such electrophoretic methods offer an acceptable level of size resolution, but, especially for purposes of high throughput mapping, suffer from a number of drawbacks.
First, such techniques require the preparation of DNA in bulk amounts. With respect to genome mapping, such preparative procedures may require sources such as genomic DNA or DNA from yeast artificial chromosomes (YACs; Burke, D. T. et al., 1987, Science 236:806; Barlow, et al., 1987, Trends in Genetics 3:167-177; Campbell et al., 1991, Proc. Natl. Acad. Sci. USA 88:5744). Obtaining quantities of DNA from these sources that are sufficient for detailed analyses, such as restriction mapping, is time consuming and often impractical. Second, because populations of molecules of like size migrate through the medium at the same rate, it is impossible to separate individual molecules from within a sample by utilizing such a technique. Additionally, while it is possible to resolve a wide size range of DNA molecule populations using gel electrophoresis techniques, optimal resolution can often require the use of several different gel matrix compositions and/or alternative electrophoresis procedures, depending upon the sizes of the molecules of interest. For example, the separation of large molecules of DNA may require such techniques as pulsed field electrophoresis (see, e.g., U.S. Pat. No. 4,473,452). Further, standard gel electrophoresis techniques involve the separation of populations of molecules according to size, making it impossible to separate individual molecules within a polydisperse mixture. In summary, therefore, the accurate, rapid, practical, high throughput separation of individual DNA molecules, especially those of highly disparate sizes, which would often be required for genomic mapping purposes, is impossible via gel electrophoresis.
Techniques have been reported for the visualization of single nucleic acid molecules and complexes. Such techniques include such fluorescence microscopy-based techniques as fluorescence in situ hybridization (FISH; Manuelidis, L. et al., 1982, J. Cell. Biol. 95:619; Lawrence, C. A. et al., 1988, Cell 52:51; Lichter, P. et al., 1990, Science 247:64; Heng, H. H. Q. et al., 1992, Proc. Natl. Acad. Sci. USA 89:9509; van den Engh, G. et al., 1992, Science 257:1410) and those reported by, for example, Yanagida (Yanagida, M. et al., 1983, Cold Spring Harbor Symp. Quantit. Biol. 47:177; Matsumoto, S. et al., 1981, J. Mol. Biol. 132:501-516); tethering techniques, whereby one or both ends of a nucleic acid molecule are anchored to a surface (U.S. Pat. Nos. 5,079,169; 5,380,833; Perkins, T. T. et al., 1994, Science 264:819; Bensimon, A. et al., 1994, Science 265:2096); and scanning probe microscopy-based visualization techniques, including scanning tunneling microscopy and atomic force microscopy techniques (see, e.g., Karrasch, S. et al., 1993, Biophysical J. 65:2437-2446; Hansma, H. G. et al., 1993, Nucleic Acids Research 21:505-512; Bustamante, C. et al., 1992, Biochemistry 31:22-26; Lyubchenko, Y. L. et al., 1992, J. Biomol. Struct. and Dyn. 10:589-606; Allison, D. P. et al., 1992, Proc. Natl. Acad. Sci. USA 89:10129-10133; Zenhausern, F. et al., 1992, J. Struct. Biol. 108:69-73).
While single molecule techniques offer the potential advantage of an ordering capability which gel electrophoresis lacks, none of the current single molecule techniques can be used, on a practical level, as, for example, high resolution genomic mapping tools. The molecules described by Yanagida (Yanagida, M. et al., 1983, Cold Spring Harbor Symp. Quantit. Biol. 47:177; Matsumoto, S. et al., 1981, J. Mol. Biol. 132:501-516), for example, were visualized, primarily free in solution, in a manner which would make any practical mapping impossible. Further, while the FISH technique offers the advantage of using only a limited number of immobilized fragments, usually chromosomes, it is not possible to achieve the sizing resolution available with gel electrophoresis.
Single molecule tethering techniques, as listed above, generally involve individual nucleic acid molecules which have, first, been immobilized onto a surface via one or both of their ends, and, second, have been manipulated such that the molecules are stretched out. These techniques, however, are not suited to genome analysis. First, the steps involved are time consuming and can only be accomplished with a small number of molecules per procedure. Further, in general, the tethered molecules cannot be stored and used again.
A combination of the sizing capability of gel electrophoresis and the ordering capability of certain single molecule techniques such as, for example, FISH, would, therefore, be extremely useful for genomic analyses such as genomic mapping. Such analyses would be further aided by the ability to manipulate the single molecules being analyzed. Additionally, an ability to reuse the nucleic acid samples of interest would increase the efficiency and throughput capability of the analysis. Currently, however, there exists no single technology which embodies, in a practical manner, each of these elements.
Citation of documents herein is not intended as an admission that any of the documents cited herein is pertinent prior art, or an admission that the cited documents are considered material to the patentability of the claims of the present application. All statements as to the date or representations as to the contents of these documents are based on the information available to the applicant and do not constitute any admission as to the correctness of the dates or contents of these documents.
The present invention relates to methods and compositions for characterizing and manipulating individual nucleic acid molecules, including mammalian chromosome-sized individual nucleic acid molecules. The methods and compositions described herein can be utilized for the accurate, rapid, high throughput analysis of nucleic acid molecules at the genome level, and may, for example, include the construction of high resolution physical maps, referred to herein as “optical mapping”, and the detection of specific nucleotide sequences within a genome, referred to herein as “optical sequencing.”
Specifically, methods are described whereby single nucleic acid molecules, including mammalian chromosome-sized DNA molecules, are elongated and fixed in a rapid, controlled and reproducible manner which allows for the nucleic acid molecules to retain their biological function and, further, makes rapid analysis of the molecules possible. In one embodiment of such a procedure, the molecules are elongated in a flow of a molten or unpolymerized gel composition. The elongated molecules become fixed as the gel composition becomes hardened or polymerized. In such an embodiment, the gel composition is preferably an agarose gel composition, and the elongated molecules become fixed as the agarose gels.
In a second embodiment, the single nucleic acid molecules are elongated and fixed in a controllable manner directly onto a solid, planar surface. This solid, planar surface contains a positive charge density which has been controllably modified such that the single nucleic acid molecules will exhibit an optimal balance between the critical parameters of nucleic acid elongation state, degree of relaxation stability and biological activity. Further, methods, compositions and assays are described by which such an optimal balance can precisely and reproducibly be achieved.
In a third embodiment, the single nucleic acid molecules are elongated via flow-based techniques. In such an embodiment, a single nucleic acid molecule is elongated, manipulated (via, for example, a regio-specific restriction digestion), and/or analyzed in a laminar flow elongation device. The present invention further relates to and describes such a laminar flow elongation device.
The elongated, individual nucleic acid molecules can then be utilized in a variety of ways which have applications for the analysis of nucleic acid at the genome level. For example, such nucleic acid molecules may be used to generate ordered, high resolution single nucleic acid molecule restriction maps. This method is referred to herein as “optical mapping” or “optical restriction mapping”. Additionally, methods are presented whereby specific nucleotide sequences present within the elongated nucleic acid molecules can be identified. Such methods are referred to herein as “optical sequencing”. The optical mapping and optical sequencing techniques can be used independently or in combination on the same individual nucleic acid molecules.
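By way of illustration only, the notion of an ordered restriction map can be sketched computationally. The following minimal Python example assumes that the restriction fragments of one elongated, fixed molecule have already been sized and are listed in the order in which they lie along the molecule; the function name cut_positions, the input format and the numeric values are hypothetical and are not part of the methods described herein. Because the fragments remain in place after digestion, the position of each cleavage site follows from a running sum of the fragment sizes.

```python
from itertools import accumulate


def cut_positions(ordered_fragment_sizes_kb):
    """Return cleavage-site positions, in kb from one end of the molecule.

    Hypothetical helper: the input is a list of fragment sizes (kb) given
    in the order in which the fragments lie along a single elongated
    molecule. How the sizes were measured is outside this sketch.
    """
    if any(size <= 0 for size in ordered_fragment_sizes_kb):
        raise ValueError("fragment sizes must be positive")
    # The running total gives the far edge of each fragment.
    edges = list(accumulate(ordered_fragment_sizes_kb))
    # The last edge is the end of the molecule, not a cleavage site.
    return edges[:-1]


# Illustrative values only: three fragments imply two cleavage sites.
fragments = [12.0, 30.5, 7.5]
print(cut_positions(fragments))  # [12.0, 42.5]
```

A bulk gel electrophoresis experiment would yield the same three sizes without their order, so the running sum above is meaningful only because the single molecule approach preserves fragment order.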
Still further, the elongated nucleic acid molecules of the invention can be manipulated using any standard procedure. For example, the single nucleic acid molecules may be manipulated by any enzymes which act upon nucleic acid molecules, and which may include, but are not limited to, restriction endonucleases, exonucleases, polymerases, ligases or helicases.
Additionally, methods are also presented for the imaging and sizing of the elongated single nucleic acid molecules. These imaging techniques may, for example, include the use of fluorochromes, microscopy and/or image processing computer software and hardware. Such sizing methods include both static and dynamic measuring techniques.
Still further, high throughput methods for utilizing such single nucleic acid molecules in genome analysis are presented. In one embodiment of such high throughput methods, rapid optical mapping approaches are described for the creation of high-resolution restriction maps. In such an embodiment, single nucleic acid molecules are elongated, fixed and gridded to high density onto a solid surface. These molecules can then be digested with appropriate restriction enzymes for the map construction. In an alternative embodiment, the single nucleic acid molecules can be elongated, fixed and gridded at high density onto a solid surface and utilized in a variety of optical sequencing-based diagnostic methods. In addition to offering speed, such diagnostic grids can be reused. Further, the high throughput methods can be utilized to rapidly generate information derived from procedures which combine optical mapping and optical sequencing methods.
The present invention is based on the development of techniques, including high throughput techniques, which reproducibly and rapidly generate populations of individual, elongated nucleic acid molecules that not only retain biological function but are accessible to manipulation and make possible rapid genome analysis.