It is often desirable in molecular biology to determine the relatedness of DNA segments. Such determinations at the nucleotide sequence level have many uses in the detection and molecular analysis of DNAs from different organisms and in the construction of physical and genetic maps. The most precise method for comparing segments of DNA is to determine the entire nucleotide sequence of each segment. For large DNA segments, however, sequencing becomes prohibitively time-consuming and expensive. Thus, at the present time, it is not practical to use extensive sequencing to compare DNA segments when the segments are large or when a large number of segments are being compared.
Restriction enzymes provide a tool to rapidly analyze DNA segments to obtain a limited amount of sequence information. Each restriction enzyme recognizes a specific sequence of DNA, normally four to eight nucleotide pairs in length, and cleaves DNA at or near this recognition sequence. Digestion of a DNA segment with a particular restriction enzyme thus generates a characteristic array of fragments. Typically, these fragments are separated according to length by electrophoresis through an appropriate gel matrix. The sizes of the fragments are dependent on the exact sequence recognized by the restriction enzyme and the spatial distribution of the recognition sequence within the DNA segment. Thus, cleavage of a DNA segment with a restriction enzyme indicates that a particular short recognition sequence is present; the number of fragments produced indicates how many times the recognition sequence occurs; and the sizes of the fragments indicate the distance, in nucleotides, between adjacent recognition sites.
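The relationship between recognition sites and fragment sizes can be illustrated with a minimal Python sketch. It is only an illustration under simplifying assumptions: the DNA segment is linear, only one strand is scanned, and the function name `digest` and the toy sequence are hypothetical. The EcoRI site GAATTC, cut after the first G, is used as the example enzyme.

```python
def digest(sequence, site, cut_offset):
    """Return fragment lengths from cutting a linear sequence at every
    occurrence of `site`. `cut_offset` is the number of bases into the
    site at which the strand is cleaved (assumed, single strand only)."""
    cuts = []
    start = 0
    while True:
        idx = sequence.find(site, start)
        if idx == -1:
            break
        cuts.append(idx + cut_offset)
        start = idx + 1  # continue searching past this occurrence
    # Fragment boundaries: both ends of the segment plus every cut position.
    bounds = [0] + cuts + [len(sequence)]
    return [right - left for left, right in zip(bounds, bounds[1:]) if right > left]


# Hypothetical 20-base segment containing two EcoRI sites.
segment = "AA" + "GAATTC" + "CCCC" + "GAATTC" + "TT"
fragments = digest(segment, "GAATTC", 1)
print(fragments)  # [3, 10, 7]
```

Two occurrences of the recognition sequence in a linear segment give three fragments, and each fragment length is the distance between adjacent cut positions (or between a cut and an end of the segment).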
The relatively simple steps involved in digesting DNA with restriction enzymes and in electrophoresing DNA fragments have made restriction-fragment analysis a routine method for characterizing and comparing DNA segments. If two segments of DNA have restriction fragments of the same length, then there is an increased likelihood that the segments are similar in sequence or overlapping. The greater the number of restriction fragments in common, the higher the probability that any two DNA segments are related. Two procedures have been described that demonstrate the utility of using restriction-fragment comparisons to determine the relatedness of a large number (5000-10,000) of DNA segments. These two procedures are the global mapping method described by Olson et al. [Proc. Natl. Acad. Sci. U.S.A. 83:7826-7830 (1986)] and the fingerprint mapping method described by Coulson et al. [Proc. Natl. Acad. Sci. U.S.A. 83:7821-7825 (1986)].
The first step in the global mapping method is to digest each DNA segment with a restriction enzyme or combination of restriction enzymes to generate a collection of restriction fragments. (In the example presented by Olson et al., each DNA segment was digested with a combination of HindIII and EcoRI to generate fragments with an average size of 1200 bp.) Each restriction digest is electrophoresed in a separate lane through an agarose gel in order to separate fragments according to length. The DNA restriction fragments are visualized by staining each gel with ethidium bromide and photographing the gel under ultraviolet illumination. The size of each restriction fragment is determined by comparing its electrophoretic mobility with the mobilities of known size standards that were electrophoresed in a parallel lane of each gel. Thus, each DNA segment is characterized by a list of restriction fragment sizes. A database is constructed that contains fragment-size lists for all the DNA segments being compared. With the aid of a computer program, the fragment-size lists are compared in a pairwise manner in order to determine the number of fragments of common size. DNA segments that share a significant number of fragments of common size are considered to be related. In this manner, related DNA segments spanning regions greater than 100,000 bp can be identified.
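The pairwise comparison of fragment-size lists can be sketched as follows. This is not the program used by Olson et al.; it is a minimal illustration in which the matching rule (a relative size tolerance `tol`), the threshold `min_shared`, the function names, and the clone data are all assumptions chosen for the example.

```python
from itertools import combinations


def shared_fragments(a, b, tol=0.02):
    """Count fragments of common size between two size lists.
    Two fragments match if their sizes differ by at most `tol`
    (a fraction of the larger size); each fragment is used once."""
    a, b = sorted(a), sorted(b)
    i = j = count = 0
    while i < len(a) and j < len(b):
        if abs(a[i] - b[j]) <= tol * max(a[i], b[j]):
            count += 1
            i += 1
            j += 1
        elif a[i] < b[j]:
            i += 1
        else:
            j += 1
    return count


def related_pairs(size_lists, min_shared=3, tol=0.02):
    """Compare every pair of segments and keep those sharing at least
    `min_shared` fragments (an arbitrary threshold for illustration)."""
    pairs = []
    for (name_a, a), (name_b, b) in combinations(sorted(size_lists.items()), 2):
        n = shared_fragments(a, b, tol)
        if n >= min_shared:
            pairs.append((name_a, name_b, n))
    return pairs


# Hypothetical fragment-size lists (in bp) for three DNA segments.
clones = {
    "A": [500, 1200, 2300, 3100, 4800],
    "B": [1210, 2290, 3100, 650, 900],
    "C": [700, 1500, 2600, 5200],
}
print(related_pairs(clones))  # [('A', 'B', 3)]
```

A size tolerance is needed because fragment sizes read from a gel carry measurement error. A looser tolerance raises the number of chance matches, which is why the threshold for a significant number of shared fragments matters.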
The Olson et al. procedure is referred to as a global mapping method because almost all the fragments produced in the restriction digest are used in the construction of the fragment-size lists. The inclusion of nearly all fragments requires the use of a separation method that can resolve fairly large fragments, such as electrophoresis through an agarose gel. Although the use of an agarose gel allows analysis of large fragments, the ability to discriminate and accurately size closely spaced fragments on an agarose gel is somewhat limited. This problem is addressed in the fingerprint mapping method of Coulson et al. by reducing the size of the fragments being analyzed to approximately 1000 nucleotides or smaller. Fragments of this size can be resolved with single-base resolution on a denaturing acrylamide gel.
In the fingerprint mapping method of Coulson et al., each DNA segment is first cleaved with a restriction enzyme that leaves a 5' overhang. The ends of these fragments are labeled by incubation with a DNA polymerase in the presence of a radioactive nucleotide. These radioactively labeled fragments are then digested with a second restriction enzyme that cleaves frequently, generating short fragments (average size approximately 200 bp). Each collection of DNA fragments is then separated according to length by electrophoresis through a denaturing polyacrylamide gel. Although each sample may contain a large number of different fragments, only those fragments that have an end generated by cleavage with the first restriction enzyme are radioactively labeled. The locations of these labeled fragments on the gel are detected by autoradiography. The sizes of the detected fragments are determined by comparison to the mobilities of known size standards. As in the global mapping method, fragment-size lists are compared in order to determine which DNA segments are related. Coulson et al. were able to identify clusters of related DNA segments that spanned regions 35,000 to 350,000 bp in size.
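The selection of labeled fragments in the two-enzyme procedure can be sketched in Python. The sketch works on cut positions rather than sequences and makes several simplifying assumptions: cuts are treated as single positions on a linear segment, overhang lengths are ignored, and the ends of the segment itself are not labeled. The function name `labeled_fragments` and the positions are hypothetical.

```python
def labeled_fragments(length, first_cuts, second_cuts):
    """Return sorted lengths of fragments that would carry a label.
    A fragment is labeled only if at least one of its ends was produced
    by the first enzyme; fragments bounded solely by second-enzyme cuts
    or by the segment ends are present in the sample but not detected."""
    first = set(first_cuts)
    bounds = sorted(first | set(second_cuts) | {0, length})
    labeled = []
    for left, right in zip(bounds, bounds[1:]):
        if left in first or right in first:
            labeled.append(right - left)
    return sorted(labeled)


# Hypothetical 1000 bp segment: one site for the first enzyme,
# four sites for the frequently cutting second enzyme.
print(labeled_fragments(1000, [300], [100, 250, 420, 700]))  # [50, 120]
```

In this example the double digest yields six fragments, but only the two that flank the first-enzyme site at position 300 would appear on the autoradiogram.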
The global mapping method, the fingerprint mapping method, and other similar methods use a fragment-size list to characterize the identity of each DNA segment being examined. Each fragment in the fragment-size list represents one piece of information that can be used in comparing the relatedness of DNA segments. One disadvantage of these methods is that the amount of information about each DNA segment is limited to the number of fragments in the fragment-size list. If the fragments could be differentiated by some property in addition to size, more information would be available for making comparisons. Increasing the information content of each fragment in the fragment-size list provides better discrimination in deciding which overlaps between DNA segments are significant.
Another disadvantage of both the global and fingerprint mapping methods is that a number of steps are required after electrophoresis in order to obtain digital information that can be used in making comparisons. In the global mapping method, gels must be stained with ethidium bromide and photographed in order to record the location of each DNA fragment in the gel. In the fingerprint mapping method, gels must be exposed to X-ray film, and the film must be developed, in order to obtain a record of the mobility of each DNA fragment. In both cases, the photographs or autoradiograms must be analyzed in order to digitize the mobility information. These manual manipulations increase the time and effort required to perform the mapping procedures.