DNA fingerprinting methods have been used for detecting DNA markers in a variety of applications. Examples include detecting DNA markers linked to genetic traits, diagnostic markers for pathogen-borne diseases, forensic genotyping, parentage analysis, and molecular taxonomy. These methods apply to the entire DNA sample, with no specific focus on the functional regions of the DNA. For example, restriction fragment-length polymorphism (RFLP) and amplified fragment-length polymorphism (AFLP) methods rely on the sequences of restriction enzyme recognition sites. See, for example, Mueller & Wolfenbarger (1999) “AFLP Genotyping and Fingerprinting,” TREE, 14:389-394. These sites occur randomly throughout the human genome, in intergenic and genic regions alike, and within both the exons and the introns of genes, without discrimination. Consequently, the variations these methods detect are scattered across the entire DNA sample rather than concentrated in the functional regions of a given genome. Similarly, random amplification of polymorphic DNA (RAPD) and DNA amplification fingerprinting (DAF) methods rely on arbitrary sequence primers whose complementary sequences also occur randomly in genomic DNA. See Welsh & McClelland (1990), Nucleic Acids Research, 18:7213; and Welsh, Petersen, & McClelland (1991) Nucleic Acids Research, 19:303.
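As a minimal sketch of why RFLP-type methods depend on recognition-site sequences rather than on gene function, the Python snippet below cuts two short sequences at every EcoRI recognition site (GAATTC) and compares the fragment lengths. The two sequences are invented for illustration and are not real genomic data; the function name `digest_lengths` is likewise hypothetical.

```python
# Illustrative only: the sequences below are made up, not real genomic data.
ECORI_SITE = "GAATTC"  # EcoRI recognition sequence
CUT_OFFSET = 1         # EcoRI cuts after the first G of the site (G^AATTC)


def digest_lengths(seq, site=ECORI_SITE, cut_offset=CUT_OFFSET):
    """Return the fragment lengths produced by cutting seq at every site."""
    cuts = []
    pos = seq.find(site)
    while pos != -1:
        cuts.append(pos + cut_offset)
        pos = seq.find(site, pos + 1)
    bounds = [0] + cuts + [len(seq)]
    return [end - start for start, end in zip(bounds, bounds[1:])]


# Two hypothetical alleles that differ by a single base inside the second site.
allele_a = "ATGCCGAATTCTTAGGCATGAATTCCGTA"
allele_b = "ATGCCGAATTCTTAGGCATGAGTTCCGTA"  # GAATTC -> GAGTTC, site destroyed

print(digest_lengths(allele_a))  # three fragments
print(digest_lengths(allele_b))  # two fragments
```

The fragment pattern changes only because a recognition site was gained or lost; nothing in the procedure asks whether that site lies in an exon, an intron, or intergenic DNA.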
It is estimated that approximately 98-99% of human and other eukaryotic DNA is non-functional. Variations that occur within the non-functional regions of the genome are not useful for diagnosing or discovering gene defects or other informative variations or mutations. The functional regions of a genome, such as the exons, promoters, and poly A sites, constitute only slightly more than 1% of the human genome. However, current methods of genomic analysis do not specifically target these critically important functional regions of a genome. Thus there remains a long-felt and unmet need for a method of analyzing, on a genome-wide level, those specific portions of a genome that encode functional DNA sequences.
The human genome harbors the genetic variations for a large number of Mendelian disorders. Many of these disorders have been localized in the genome through linkage studies, and the genes for these disorders are being isolated by different methods. The techniques currently used for isolating genes include: cDNA selection (Lovett, M., et al., Proc. Natl. Acad. Sci. USA, 88:9628-32 (1991)), exon trapping (Duyk, G. M., et al., Proc. Natl. Acad. Sci. USA, 87:8995-9 (1990)), CpG island identification (Estivill, X. and Williamson, R., Nucleic Acids Res., 15:1415-25 (1987)), hybridization using genomic fragments as probes against cDNA libraries (Rommens, et al., Science, 245:1059-80 (1989)), cloning and sequencing of genomic DNA followed by computer analysis of the possible coding regions (Wilson, R., et al., Nature, 368:32-38 (1994)), Alu-splice PCR (Fuentes, J. J., et al., Hum. Genet. 101:346-50 (1997)), and Alu-promoter PCR (Jendraschak, E. and Kaminski, W. E., Genomics, 50:53-60 (1998)).
These techniques have several limitations. For example, many require analyzing large numbers of subclones to yield meaningful results. Both cDNA selection and hybridization using genomic fragments depend upon gene expression patterns, because they rely on cDNA or mRNA libraries. Exon trapping requires specialized vectors and cell culture materials, while cDNA selection enriches only the expressed sequences from a specific RNA source and requires much time and effort to determine the origin of the selected cDNAs. Alu-splice PCR also has limitations: it can identify only a few putative exons out of a larger number of true exons, even in a YAC clone. Because none of these methods permits the isolation of all the genes in a given region, several of the above methods are usually used in conjunction to complement one another, thereby achieving more complete results.
Furthermore, these methods are usually applied only to DNA regions included in vectors such as yeast artificial chromosomes (YACs), bacterial artificial chromosomes (BACs), plasmids, and cosmids. They cannot be applied directly to whole genomic DNA to isolate a majority of the exons of genes contained in the genome. A method for isolating the majority of the regions flanking a signal sequence that has a consensus sequence and is present at numerous locations in a genome, such as the 3′ or the 5′ splice junction or the promoters, would be very advantageous in a variety of genetic studies for discovering and treating major illnesses.
In essence, current methods for specifically amplifying exons present in an unknown genomic DNA are limited. The isolation of only the exon sequences of a gene would be advantageous for a variety of applications, including comparative analysis between individuals. Attempts have been made to use the above methods to accomplish this purpose using genomic DNA fragments cloned into vectors.
For example, the Alu-splice PCR method attempts to isolate exon-containing fragments from cloned genomic DNA. This method uses the consensus sequence of splice junctions, linked to a restriction enzyme recognition sequence, as one primer and the consensus sequence of Alu repeat elements as the other primer to amplify any potential exon sequence that may be present between these primer binding sites in a cloned YAC DNA. However, this method has yielded poor results. For example, in one study, from a total of 128 colonies picked, only ten contained putative exons. Further, out of the few genes present in the two YACs analyzed, none of the nine exons present in one of the genes was isolated. Further still, apart from one or two exons, most of the exons of the five new genes that possibly existed in these YACs were not isolated. From among the ten putative exon sequences isolated, six were shorter than 350 nucleotides. As the authors of this study acknowledge, not all genes in a given sample will be identified by Alu-splice PCR, and not all the exons within a given gene will be identified by Alu-splice PCR. There are at least two reasons that explain this outcome: 1) the paucity of conveniently placed Alu repetitive elements; and 2) the limiting factor of specificity of the 5′ and 3′ splice-site primers; in the best of cases, primer specificity is only eight nucleotides. These inadequate results, even with a relatively short template DNA (YAC) compared to genomic DNA, indicate that this method is not suitable for isolating, in multiplex fashion, the exons of many genes from whole genomic DNA.
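To give a rough sense of what an effective specificity of only eight nucleotides implies, the sketch below estimates how often a fixed 8-nucleotide sequence would be expected to occur by chance in a template of a given length. It is a back-of-the-envelope calculation, not a model of any actual primer: it assumes independent, equally frequent bases, counts both strands, and uses assumed template sizes (a round figure of 3 billion base pairs for the human genome and a hypothetical 500 kb YAC insert) that do not come from the study described above.

```python
def expected_chance_matches(template_length, primer_length, strands=2):
    """Expected number of exact chance matches to one fixed primer sequence.

    Simplifying assumption: every base is independent and A, C, G, T are
    equally likely, so a given k-mer matches a position with probability 4**-k.
    """
    positions = max(template_length - primer_length + 1, 0)
    return strands * positions / 4 ** primer_length


HUMAN_GENOME_BP = 3_000_000_000  # assumed round figure, not from the source
YAC_INSERT_BP = 500_000          # hypothetical YAC insert size
EFFECTIVE_PRIMER_NT = 8          # "only eight nucleotides", as stated in the text

genome_hits = expected_chance_matches(HUMAN_GENOME_BP, EFFECTIVE_PRIMER_NT)
yac_hits = expected_chance_matches(YAC_INSERT_BP, EFFECTIVE_PRIMER_NT)

print(f"Expected chance matches in the genome: {genome_hits:,.0f}")
print(f"Expected chance matches in the YAC:    {yac_hits:,.1f}")
```

Under these assumptions the expected number of chance binding sites rises by several orders of magnitude when moving from a YAC-sized template to whole genomic DNA, which is consistent with the conclusion that a primer with eight nucleotides of specificity is poorly suited to whole-genome use.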