A genomic DNA "library" is formed by digesting genomic DNA from a particular organism with a suitable restriction enzyme, joining the resulting genomic DNA fragments to vectors, and introducing the fragment-containing vectors into a population of host cells. Complementary DNA (cDNA) is DNA produced by the enzyme reverse transcriptase, which synthesizes a complementary strand of DNA using an mRNA strand as a template. A cDNA library is formed by joining the cDNA fragments to vectors and introducing the cDNA fragment-containing vectors into a population of host cells.
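The digestion step can be illustrated with a minimal sketch. It assumes a single-stranded string model of the genome and uses the EcoRI recognition site (GAATTC, cut after the first G) purely as an example enzyme. The function name `digest` and the toy sequence are hypothetical, and ligation into vectors is not modeled.

```python
def digest(sequence: str, site: str = "GAATTC", cut_offset: int = 1) -> list[str]:
    """Cut `sequence` at every occurrence of `site`.

    `cut_offset` is the number of bases into the site at which the cut falls
    (1 for EcoRI, which cuts between G and AATTC). Only one strand is modeled.
    """
    fragments = []
    start = 0
    pos = sequence.find(site)
    while pos != -1:
        cut = pos + cut_offset
        fragments.append(sequence[start:cut])
        start = cut
        pos = sequence.find(site, pos + 1)
    fragments.append(sequence[start:])  # trailing fragment after the last cut
    return fragments


# Toy "genome" with two EcoRI sites (illustrative only).
toy_genome = "AAAGAATTCTTTTGAATTCCC"
toy_fragments = digest(toy_genome)
```

Each element of `toy_fragments` stands in for one insert that would be joined to a vector. Concatenating the fragments in order reproduces the original sequence.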
In a DNA or cDNA library, the cloned DNA exists as an unordered collection of thousands or millions of pieces. To isolate a host cell carrying a specific DNA sequence (i.e., a specific DNA clone), the entire library must be screened. Nucleic acid probes, radioactively or otherwise labeled, are traditionally employed to screen a DNA or cDNA library. Such probes identify a specific DNA sequence by a process of in vitro hybridization between complementary DNA sequences in the probe and the DNA clone.
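As a rough computational analogy, screening by hybridization can be sketched as a search for the probe sequence, or its reverse complement, in every clone of the library. This sketch assumes exact matching, whereas real hybridization tolerates some mismatch and depends on stringency conditions. The names `reverse_complement`, `screen_library`, and the clone labels are hypothetical.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written with A, C, G, T."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def screen_library(library: dict[str, str], probe: str) -> list[str]:
    """Return names of clones whose insert contains the probe on either strand.

    Every clone is examined, mirroring the need to screen the entire
    unordered library to find one specific clone.
    """
    rc = reverse_complement(probe)
    return [name for name, insert in library.items()
            if probe in insert or rc in insert]


# Toy library: clone_a carries the probe sequence, clone_b carries its
# reverse complement, and clone_c carries neither.
toy_library = {
    "clone_a": "TTGACCGTAAGC",
    "clone_b": "GGCTTACGGTCA",
    "clone_c": "ATATATATATAT",
}
positives = screen_library(toy_library, "ACCGTAAG")
```

Here `positives` contains `clone_a` and `clone_b`, because a double-stranded insert hybridizes to the probe regardless of which strand was written down.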
A specific DNA clone that has been identified and isolated in this manner can contain DNA that is contiguous to the probe sequence. A terminus of the DNA clone, therefore, can be used as a new probe to rescreen the same or another DNA library to obtain a second DNA clone which has an overlapping sequence with the first DNA clone. By obtaining a set of overlapping DNA clones, a physical map of a genomic region on a chromosome may be constructed. This process is called "chromosome walking" because each overlapping DNA clone which is isolated is one step further along the chromosome. Each DNA clone also can be studied to determine its genetic relationship to a previously mapped genetic function and, thus, a series of overlapping DNA clones provides a physical map of a chromosome which may be correlated to a map of genetic functions.
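The walking procedure described above can be sketched as a simple simulation. This is an illustration under stated assumptions, not a laboratory protocol: clones are modeled as intervals with known chromosome coordinates (which a real walk does not have in advance), the target is assumed to lie to the right of the starting probe, and a clone "hybridizes" when it overlaps any part of the probe. The names `Clone`, `walk`, and `probe_len` are hypothetical.

```python
from typing import NamedTuple


class Clone(NamedTuple):
    name: str
    start: int  # first base covered (hypothetical chromosome coordinate)
    end: int    # one past the last base covered


def walk(library: list[Clone], start_probe: int, target: int,
         probe_len: int = 500) -> list[Clone]:
    """Greedy rightward chromosome walk; returns the clones isolated, in order."""
    path: list[Clone] = []
    probe_start = start_probe
    reach = start_probe + probe_len  # rightmost base covered so far
    while True:
        probe_end = probe_start + probe_len
        # Screen the whole library: keep clones that overlap the probe
        # and that extend beyond what has already been covered.
        hits = [c for c in library
                if c.start < probe_end and probe_start < c.end and c.end > reach]
        if not hits:
            return path  # no overlapping clone: the walk stalls at a gap
        best = max(hits, key=lambda c: c.end)
        path.append(best)
        reach = best.end
        if best.start <= target < best.end:
            return path  # the target position lies inside this clone
        probe_start = best.end - probe_len  # new probe from the far terminus


toy_clones = [
    Clone("A", 0, 40_000),
    Clone("B", 30_000, 70_000),
    Clone("C", 35_000, 60_000),
    Clone("D", 65_000, 110_000),
    Clone("E", 200_000, 240_000),
]
steps = walk(toy_clones, start_probe=1_000, target=100_000)
```

In this toy library the walk isolates clones A, B, and D in three steps. Clone C is passed over because B extends farther, and clone E is unreachable because no clone bridges the gap between D and E.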
Chromosome walking is used, for example, to identify or localize a gene of interest, such as one thought to be causative of or associated with a disease or other condition, phenotype or quantitative trait. This is done by using, as an in vitro hybridization probe, a DNA fragment which displays a restriction fragment length polymorphism (RFLP) shown to be genetically linked to (i.e., physically localized to the same chromosome region as) such a gene, or a segment of DNA contiguous to such an RFLP, or a cDNA. The probe is used to screen a DNA library and pull out larger fragments of DNA in which all or part of the probe sequence is represented.
Any DNA clone isolated in this manner is useful because it includes DNA that is contiguous to the RFLP sequence and that is incrementally closer to the position of the sought-after gene than the original RFLP. To get a step closer, a labeled molecule corresponding to an end of the newly isolated DNA clone is prepared and used to rescreen the library, with the goal of isolating DNA clones that overlap with sequences found in the first DNA clone and that are incrementally closer to the gene of interest than either the starting probe or the first DNA clone isolated. This procedure is repeated as needed, with the resulting DNA clones being used in genetic studies to assess whether they are more closely linked to the gene of interest. Walking over a distance of 10 million base pairs with presently available chromosome walking techniques could require from 100 to 2,000 steps, depending on the DNA cloning vector system used. Any approach that decreases the work required to take a single walking step, or that allows multiple walking projects to be carried out simultaneously, would be a major advance.
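The step counts quoted above follow from dividing the distance to be walked by the net advance gained per step. The short sketch below makes that arithmetic explicit. The per-step advances of 100,000 and 5,000 base pairs are assumptions back-calculated from the stated range of 100 to 2,000 steps, not figures given in the text, and `steps_required` is a hypothetical helper name.

```python
import math


def steps_required(distance_bp: int, advance_per_step_bp: int) -> int:
    """Number of walking steps needed to cover `distance_bp`,
    rounding up because a partial step still requires a full screen."""
    if advance_per_step_bp <= 0:
        raise ValueError("advance per step must be positive")
    return math.ceil(distance_bp / advance_per_step_bp)


distance = 10_000_000  # 10 million base pairs, as in the text

# Assumed net advances per step (illustrative, implied by the quoted range).
large_insert_steps = steps_required(distance, 100_000)
small_insert_steps = steps_required(distance, 5_000)
```

Under these assumptions, a vector system advancing about 100,000 base pairs per step needs 100 steps, while one advancing about 5,000 base pairs per step needs 2,000.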
The number of DNA clones required to form a complete library of genomic DNA is determined by the size of the genome and the DNA clone capacity of the vector used to clone and propagate the segments of the genomic DNA. Construction and screening of genomic DNA libraries of organisms with large genomes are labor intensive and time consuming. The development of vectors having a capacity for large DNA clones has helped to reduce the labor involved in screening genomic libraries. However, screening libraries remains time consuming and labor intensive.
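One commonly used way to estimate this number, not stated in the text above, is the Clarke and Carbon formula N = ln(1 - P) / ln(1 - f), where f is the ratio of insert size to genome size and P is the desired probability that any given sequence is represented. The sketch below applies it. The genome size of 3 billion base pairs and the insert sizes are illustrative assumptions, and `clones_needed` is a hypothetical helper name.

```python
import math


def clones_needed(genome_bp: float, insert_bp: float,
                  probability: float = 0.99) -> int:
    """Clarke-Carbon estimate of library size.

    Returns the number of independent clones needed so that a given
    sequence is present with the requested probability, assuming
    randomly distributed inserts of uniform size.
    """
    if not 0 < probability < 1:
        raise ValueError("probability must be strictly between 0 and 1")
    if not 0 < insert_bp < genome_bp:
        raise ValueError("insert must be smaller than the genome")
    f = insert_bp / genome_bp
    # log1p(-f) computes ln(1 - f) accurately when f is very small.
    return math.ceil(math.log(1 - probability) / math.log1p(-f))


genome = 3e9  # assumed genome size in base pairs (illustrative)
small_insert_library = clones_needed(genome, 20_000)
large_insert_library = clones_needed(genome, 100_000)
```

With these assumed values, raising the insert capacity from 20,000 to 100,000 base pairs cuts the estimated library size roughly fivefold, which is consistent with the observation that large-capacity vectors reduce screening labor.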