Methods based on the polymerase chain reaction (PCR) have been described for the direct sequencing of genomic DNA [Wong, et al. (1987) and Engelke, et al. (1988)]. Genomic amplification with transcript sequencing (GAWTS) incorporates a phage promoter sequence into at least one of the PCR primers and is described in the parent application serial no. 149,312, filed Jan. 28, 1988.
In contrast to autosomal recessive mutations, deleterious X-linked mutations are eliminated within a few generations because the affected males reproduce sparingly if at all. Thus, each family in an X-linked disease such as hemophilia B represents an independent mutation. From the perspective of efforts to understand the expression, processing, and function of factor IX, this is useful since a large number of mutations are potentially available for analysis. In addition to facilitating structure-function correlations, the rapidity of GAWTS makes it practical to perform direct carrier testing and prenatal diagnosis of at-risk individuals. By amplifying and sequencing 11 regions of the hemophilic factor IX gene which total 2.8 kb, it should be possible to delineate the causative mutation in the overwhelming majority of individuals, as these regions contain the putative promoter, the 5′ untranslated region, the amino acid coding sequences, the terminal portion of the 3′ untranslated region, and the intron-exon boundaries. Once the mutation is delineated, GAWTS can be used to directly test an at-risk individual, thereby finessing the multiple problems associated with indirect linkage analysis.
Another aspect of the subject invention concerns a direct method for rapidly obtaining novel sequences from clones involving promoter ligation and transcript sequencing, and uses thereof.
The hallmark of the steroid/thyroxine/retinoic acid receptor gene superfamily is a pair of zinc binding "fingers" which determine the specificity of DNA binding [Evans, R. M., Science, 240:889-895 (1988)]. Certain amino acids in the zinc finger DNA binding domains are highly conserved, and recent members of this gene family have been found in Drosophila by analyzing sequences that cross-hybridize in a low stringency Southern blot with a human retinoic acid receptor cDNA probe [Oro, A. E., E. S. Ong, J. S. Margolis, J. W. Posakony, M. McKeown, and R. M. Evans, Nature 336:493-496 (1988)]. The inventor has used the same approach to isolate members of the superfamily in fungi since steroid-specific, high affinity binding proteins have been described in the cytosols of Saccharomyces cerevisiae, Paracoccidioides brasiliensis, and Candida albicans [Burshell, A., P. A. Stathis, Y. Do, S. C. Miller, and D. Feldman, J. Biol. Chem, 259:3450-3456 (1984); Feldman, D., Y. Do, A. Burshell, P. Stathis, and D. S. Loose, Science, 218:297-298 (1982); Loose, D. S., and D. Feldman, J. Biol. Chem, 257:4925-4930 (1982); and Loose, D. S., D. J. Schurman and D. Feldman, Nature, 293:477-479 (1981)]. In the water mold Achlya ambisexualis, the receptor for antheridiol (a steroid that regulates sexual physiology) was found to have many of the same properties as steroid receptors in higher eucaryotes [Reihl, R. M., D. O. Toft, M. D. Meyer, G. L. Carlson and T. C. McMorris, Exp. Cell Res. 153:544-549 (1984); and Reihl, R. M., D. O. Toft, J. Biol. Chem., 259:15324-15330 (1984)].
Since false positive signals commonly occur with low stringency Southern blots, the inventor has developed a method called promoter ligation and transcript sequencing (PLATS) to allow rapid analysis of cross-hybridizing segments by reducing the effort required to determine the precise sequence of the segment. In a broader sense, PLATS is a general method for obtaining novel sequence which eliminates the lambda DNA purification and subcloning steps required by conventional methods. PLATS is illustrated by sequencing a 1.1 kb segment of Achlya ambisexualis which cross-hybridizes to the DNA binding domain of the Xenopus and chicken estrogen receptor. This segment contains a transcribed open reading frame which is not a member of the steroid/thyroxine/retinoic acid receptor superfamily. However, the inventor speculates that the Achlya gene product may belong to a novel class of transcriptional regulators that bind DNA with a zinc finger containing three cysteines and one histidine.
The ability to screen populations for carriers of genetic disease in an accurate, inexpensive, and rapid manner would provide the opportunity for widespread genetic counseling and, ultimately, the possible elimination of such diseases. A successful example of protein-based carrier screening is Tay-Sachs disease (GM2 gangliosidosis type B), which is caused by a deficiency in β-hexosaminidase activity. Since non-carrier and carrier levels of enzymatic activity do not overlap, genetic status can be unequivocally assigned. [Ben-Yoseph, U., J. E. Reid, B. Shapiro, H. L. Nadler., Am. J. Hum. Genet., 37:733-748 (1985)] Screening for Tay-Sachs has markedly reduced the incidence of this disease in Ashkenazi Jews. [O'Brien, J. S., the gangliosidases, In: Stanbury J. B., J. B. Wyngaarden, D. S. Fredrickson, J. L. Goldstein, M. S. Brown, eds. Metabolic Basis of Inherited Disease. New York: McGraw-Hill, 1983:945-969]. Unfortunately, measurements of protein or metabolite levels for other genetic diseases are not usually accurate enough for this type of population screening. Population screening may eventually be possible, however, with DNA-based methods.
Phenylketonuria (PKU) is one disease amenable to DNA-based screening. Classical PKU is an autosomal recessive disease affecting one in 10,000 newborn Caucasians of northern European descent. The disease is the result of a deficiency in hepatic phenylalanine hydroxylase (PAH) activity, which causes a primary elevation of serum phenylalanine and secondary abnormalities in compounds derived from aromatic amino acids. [Blau, K. In: Yondim M B H, ed. Aromatic Amino Hydroxylases and Mental Diseases. New York: Wiley, 1979:79-139] If left untreated in infancy, severe mental retardation ensues. While treatment with a low phenylalanine diet can prevent mental retardation, the disease has not been rendered benign. Phenylketonurics still encounter problems, including: 1) failure to reach full intellectual potential due to incomplete compliance with the very stringent dietary therapy [Holtzman, N. A., R. A. Kronmal, W. Van Doorninck, C. Azen, R. Koch, New Engl. J. Med., 314:593-598 (1986)]; 2) a high frequency of birth defects in children of affected females [Scriver, C. R., C. L. Clow, Ann Rev. Genet., 14:179-202 (1980)]; and 3) a high incidence of behavioral problems. [Holtzman, et al., (1986); Realmuto, G. M., B. D. Garfinkel, M. Tuckman, M. Y. Tsai, P-N. Chang, R. O. Fisch, S. Shapiro., J. Nerv. Mental Dis., 174: 536-540 (1986)]
Subsequent to the cloning of PAH cDNA, [Kwok, S. C. M., F. D. Ledley, A. G. DiLella, K. J. H. Robson, S. L. C. Woo. Biochem., 24:556-561 (1985)] it was found that 90% of the PKU alleles in the Danish population are confined to four haplotypes. [Chakraborty, R., A. S. Lidsky, S. P. Daiger, F. Guttler, S. Sullivan, A. G. DiLella, S. L. C. Woo., Hum. Genet., 76:40-46 (1987)] The mutations in haplotypes 2 and 3 represent 20% and 40% of the PKU alleles, respectively. The mutation in haplotype 2 is a C to T transition at amino acid 408 in exon 12 of the PAH gene [DiLella, A. G., J. Marvit, K. Brayton, S. L. C. Woo., Nature, 327:333-336 (1987)] and the mutation in haplotype 3 is a G to A transition at the intron 12 donor splice junction. [DiLella, A. G., J. Marvit, A. S. Lidsky, F. Guttler, S. L. C. Woo., Nature, 322:799-803 (1986)] The mutant alleles associated with haplotypes 2 and 3 are also prevalent in the United States population. [Moore, S. D., W. M. Huang, R. Koch, S. Snyderman, S. L. C. Woo., Am. J. Hum. Genet., 43:A90 (1988)] When the mutations in haplotypes 1 and 4 are defined, 90% of all PKU carriers of northern European descent (approximately 4 million individuals in the United States alone) could be directly diagnosed by DNA methods.
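As a rough check on the carrier figures above, the quoted incidence of one in 10,000 can be converted to a carrier frequency. The following is only a sketch: it assumes Hardy-Weinberg equilibrium (random mating and no selection against carriers), an assumption that is not stated in the text, and the function and variable names are illustrative.

```python
import math


def carrier_frequency(incidence: float) -> float:
    """Heterozygote (carrier) frequency 2pq for an autosomal recessive
    disease, ASSUMING Hardy-Weinberg equilibrium (not stated in the text)."""
    q = math.sqrt(incidence)  # frequency of the disease allele
    p = 1.0 - q               # frequency of the normal allele
    return 2.0 * p * q


# Incidence of classical PKU quoted in the text: one in 10,000 newborns.
pku_carriers = carrier_frequency(1 / 10_000)

# Share of PKU alleles attributed to haplotypes 2 and 3 (20% and 40% in the text).
defined_fraction = 0.20 + 0.40

print(f"carrier frequency: {pku_carriers:.4f}")
print(f"carriers whose mutation is already defined: {pku_carriers * defined_fraction:.4f}")
```

Under that assumption roughly one person in 50 of the relevant population carries a PKU allele, and the two defined mutations would account for about 60% of those carriers.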
The current methods which can detect such point mutations include: i) direct DNA sequencing [Gyllensten, U. B., H. A. Erlich., Proc. Natl. Acad. Sci., 85:7652-7656 (1988)]; ii) denaturing gradient gel electrophoresis [Myers, R. M., N. Lumelsky, L. S. Lerman, T. Maniatis, Nature, 313:495-498 (1985)]; iii) polymerase chain reaction (PCR) followed by allele-specific oligonucleotide hybridization [DiLella, A. G., W-M. Huang, S. L. C. Woo., Lancet, 1:497-499 (1988)]; iv) allele-specific DNA ligation [Landegren, U., R. Kaiser, J. Sanders, L. Hood, Science, 241:1077-1080 (1988)]; and v) ribonuclease cleavage of mismatched heteroduplexes. [Myers, R. M., Z. Larin, T. Maniatis. Science, 230:1242-1246 (1985)] However, these techniques in their present form are unlikely to find widespread application in population screening because they lack the requisite speed, technical ease, and/or cost effectiveness.
This invention extends GAWTS by providing a method for rapid and direct access to an mRNA sequence or its protein product which is not limited by either tissue or species specificity. In addition, this application provides a direct method for rapidly obtaining novel sequences from clones involving promoter ligation and transcript sequencing.
Lastly, the subject invention provides a method for polymerase chain reaction amplification of specific alleles to reliably distinguish between alleles differing in only part.
The subject invention provides a method of amplifying a sequence of interest present within a nucleic acid molecule which comprises:
A) obtaining a sample of the nucleic acid molecule which contains the sequence of interest;
B) if the nucleic acid molecule is a single-stranded RNA molecule, treating the sample from step (A) so as to prepare a sample containing a DNA molecule which contains a sequence complementary to the sequence of interest;
C) treating the sample from step (A) if the nucleic acid molecule is a DNA molecule or the sample from step (B) if the nucleic acid molecule is a single-stranded RNA molecule so as to obtain a further sample containing a single-stranded DNA molecule which contains a sequence complementary to the sequence of interest;
D) contacting the further sample from step (C) under hybridizing conditions with one oligonucleotide primer which includes at least (a) a promoter and (b) a nucleic acid sequence present within the nucleic acid molecule which contains the sequence of interest, which primer sequence is located adjacent to, and 5′ of, the sequence of interest, so that the oligonucleotide primer hybridizes with the single-stranded DNA molecule which contains the sequence complementary to the sequence of interest;
E) treating the resulting sample containing the single-stranded DNA molecule to which the oligonucleotide primer is hybridized from step (D) with a polymerase under polymerizing conditions so that a DNA extension product of the oligonucleotide primer is synthesized, which DNA extension product contains the sequence of interest;
F) treating the sample from step (E) so as to separate the DNA extension product from the single-stranded DNA molecule on which it was synthesized and thereby obtain single-stranded DNA molecules;
G) contacting the resulting sample from step (F) containing the single-stranded DNA molecule which contains the sequence complementary to the sequence of interest under hybridizing conditions, with one oligonucleotide primer, which includes at least (a) a promoter and (b) a nucleic acid sequence located adjacent to, and 5′ of, the sequence of interest, so that the oligonucleotide primer hybridizes with the single-stranded DNA molecule present in the sample which contains the sequence complementary to the sequence of interest;
H) treating the sample containing the single-stranded DNA molecule to which the oligonucleotide primer is hybridized from step (G) with a polymerase so as to synthesize a further DNA extension product containing the sequence complementary to the sequence of interest;
I) repeating steps (F) through (H), as desired;
J) contacting the sample from step (I) with an RNA polymerase which initiates polymerization from the promoter present, under polymerizing conditions, so as to obtain multiple RNA transcripts of each DNA extension product which contains the sequence complementary to the sequence of interest, thereby amplifying the sequence of interest.
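Steps (D) and (E) of this first method can be pictured at the sequence level. The sketch below is illustrative only: the T7 promoter is used as an assumed example of a phage promoter, the toy sequence, coordinates, 10-base annealing length, and all function names are hypothetical, and hybridization is reduced to an exact string match.

```python
T7_PROMOTER = "TAATACGACTCACTATAGGG"  # assumed example of a phage promoter

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq: str) -> str:
    """Reverse complement of a DNA string written 5'->3'."""
    return seq.translate(_COMPLEMENT)[::-1]


def promoter_primer(sense: str, start: int, anneal_len: int = 10) -> str:
    """Primer = promoter + the bases lying adjacent to, and 5' of, the
    sequence of interest, which begins at index `start` of `sense`."""
    if start < anneal_len:
        raise ValueError("not enough sequence 5' of the sequence of interest")
    return T7_PROMOTER + sense[start - anneal_len:start]


def extend(primer: str, template: str, anneal_len: int = 10) -> str:
    """Extension product of `primer` on a single-stranded `template`.

    Only the 3' annealing region of the primer must pair with the
    template; the promoter remains as an unpaired 5' tail.
    """
    new_strand = revcomp(template)  # what a full-length copy would read
    site = new_strand.find(primer[-anneal_len:])
    if site < 0:
        raise ValueError("primer does not anneal to this template")
    return primer + new_strand[site + anneal_len:]


# Toy molecule; the sequence of interest occupies indices 20..39.
sense = "ACGTTGCAAGCTTGGATCCATGGCGTACCTGAAGTCGACCTGCAGTTTAAAC"
start, end = 20, 40
target = sense[start:end]

template = revcomp(sense)            # strand complementary to the target
primer = promoter_primer(sense, start)
product = extend(primer, template)   # promoter + primer site + target + 3' flank

print(product)
```

Because only one primer is present, each repetition of steps (F) through (H) copies the original template again rather than copying earlier products, so in this simplified picture the extension products accumulate linearly with cycle number.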
The subject invention provides a second method which is a method of amplifying a sequence of interest present within a nucleic acid molecule which comprises:
A) obtaining a sample of the nucleic acid molecule which contains the sequence of interest;
B) if the nucleic acid molecule is a single-stranded RNA molecule, treating the sample from step (A) so as to prepare a sample containing a DNA molecule which contains a sequence complementary to the sequence of interest;
C) treating the sample from step (A) if the nucleic acid molecule is a DNA molecule or the sample from step (B) if the nucleic acid molecule is a single-stranded RNA molecule so as to obtain a further sample containing a single-stranded DNA molecule which contains a sequence complementary to the sequence of interest;
D) contacting the further sample from step (C) under hybridizing conditions with two or more oligonucleotide primers at least one of which includes at least (a) a promoter and (b) a nucleic acid sequence present within the nucleic acid molecule which contains the sequence of interest, which primer sequence is located adjacent to, and 5′ of, the sequence of interest, and at least one other of which includes a nucleic acid sequence complementary to a sequence present within the nucleic acid molecule which contains the sequence of interest, which primer sequence is located adjacent to, and 5′ of, the nucleic acid sequence complementary to the sequence within the nucleic acid molecule which contains the sequence of interest, so that at least one of the oligonucleotide primers hybridizes with the single-stranded DNA molecule present in the sample which contains the sequence complementary to the sequence of interest, and at least one other of the oligonucleotide primers hybridizes with the single-stranded DNA molecule which contains the sequence of interest;
E) treating the resulting sample containing the single-stranded DNA molecules to which the oligonucleotide primers are hybridized from step (D) with a polymerase under polymerizing conditions so that DNA extension products of the oligonucleotide primers are synthesized, some of which DNA extension products contain the sequence of interest and some of which DNA extension products contain the sequence complementary to the sequence of interest;
F) treating the sample from step (E) so as to separate the DNA extension products from the single-stranded DNA molecules on which they were synthesized and thereby obtain single-stranded DNA molecules;
G) contacting the resulting sample from step (F) containing the single-stranded DNA molecule which contains the sequence complementary to the sequence of interest under hybridizing conditions, with two or more oligonucleotide primers at least one of which includes at least (a) a promoter and (b) a nucleic acid sequence located adjacent to, and 5′ of, the sequence of interest, and at least one other of which includes a nucleic acid sequence complementary to a sequence present within the nucleic acid molecule which contains the sequence of interest, which primer sequence is located adjacent to, and 5′ of, the nucleic acid sequence complementary to the sequence within the nucleic acid molecule which contains the sequence of interest, so that at least one of the oligonucleotide primers hybridizes with the single-stranded DNA molecule present in the sample which contains the sequence complementary to the sequence of interest, and at least one other of the oligonucleotide primers hybridizes with the single-stranded DNA molecule which contains the sequence of interest;
H) treating the sample containing the single-stranded DNA molecules to which the oligonucleotide primers are hybridized from step (G) with a polymerase so as to synthesize further DNA extension products, some of which DNA extension products contain the sequence of interest and some of which DNA extension products contain the sequence complementary to the sequence of interest;
I) repeating steps (F) through (H), as desired;
J) contacting the sample from step (I) with an RNA polymerase which initiates polymerization from the promoter present, under polymerizing conditions, so as to obtain multiple RNA transcripts of each DNA extension product which contains the sequence complementary to the sequence of interest, thereby amplifying the sequence of interest.
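The second method differs from the first in using an opposing primer, so that extension products themselves serve as templates. The sketch below illustrates the expected amplified product and its transcript under simplifying assumptions: the T7 promoter stands in for the phage promoter, transcription is taken to begin at the first G following the TATA of that promoter, annealing is an exact string match, and the toy sequence and all names are hypothetical. The copy-number function is an idealised upper bound (100% efficiency per cycle), not a measured yield.

```python
T7_PROMOTER = "TAATACGACTCACTATAGGG"  # assumed example of a phage promoter
T7_START = 17  # index within the promoter of the first transcribed base

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq: str) -> str:
    """Reverse complement of a DNA string written 5'->3'."""
    return seq.translate(_COMPLEMENT)[::-1]


def design_primers(sense: str, start: int, end: int, anneal_len: int = 10):
    """Promoter-bearing upstream primer plus a plain downstream primer
    flanking the sequence of interest sense[start:end]."""
    forward = T7_PROMOTER + sense[start - anneal_len:start]
    reverse = revcomp(sense[end:end + anneal_len])
    return forward, reverse


def amplicon(sense: str, forward: str, reverse: str, anneal_len: int = 10) -> str:
    """Sense strand of the product bounded by the two primers."""
    left = sense.find(forward[-anneal_len:])
    right = sense.find(revcomp(reverse))
    if left < 0 or right < 0:
        raise ValueError("a primer does not anneal to this molecule")
    return forward + sense[left + anneal_len:right + len(reverse)]


def transcribe(dna: str) -> str:
    """RNA read from the promoter's start site to the end of the product."""
    site = dna.find(T7_PROMOTER)
    if site < 0:
        raise ValueError("no promoter found")
    return dna[site + T7_START:].replace("T", "U")


def ideal_copies(cycles: int, two_primers: bool) -> int:
    """Idealised products per starting template: doubling with two
    opposing primers, one new product per cycle with a single primer."""
    return 2 ** cycles if two_primers else cycles


sense = "ACGTTGCAAGCTTGGATCCATGGCGTACCTGAAGTCGACCTGCAGTTTAAAC"
start, end = 20, 40
forward, reverse = design_primers(sense, start, end)
product = amplicon(sense, forward, reverse)
transcript = transcribe(product)

print(product)
print(transcript)
print(ideal_copies(30, two_primers=True), ideal_copies(30, two_primers=False))
```

The transcript carries the sequence of interest as RNA, preceded by the few promoter-derived bases that follow the start site, which is what makes it usable as a sequencing template in the transcript sequencing step.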
Further, the subject invention provides a method of determining the nucleotide sequence of a sequence of interest present within a nucleic acid molecule which comprises:
a) amplifying the amount of the sequence of interest present within a nucleic acid molecule;
b) if the sequence generated in step (a) is double-stranded, treating the molecule to generate single-stranded nucleic acid molecules;
c) determining the sequence of the single-stranded nucleic acid molecules of either step (a) or (b) thereby determining the nucleotide sequence of the sequence of interest.
The subject invention further comprises a method of determining an internal nucleotide sequence present within a nucleic acid molecule which contains promoters at both ends of the nucleic acid molecule which comprises:
a) cleaving the nucleic acid molecule under such conditions so as to generate fragments of the nucleic acid molecule;
b) if the fragments of the nucleic acid molecule do not have blunt ends, treating the fragments of the nucleic acid molecule so as to generate blunt ends;
c) ligating a promoter to the blunt end of a fragment of the nucleic acid molecule obtained in step (a) or (b);
d) amplifying a sequence of the fragment of the nucleic acid molecule containing the promoter obtained in step (c);
e) transcribing the amplified fragment of the nucleic acid molecule obtained in step (d); and
f) sequencing the transcript obtained in step (e) thereby determining an internal nucleotide sequence.
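The cleavage, promoter ligation, and transcription steps listed above can be sketched as string operations. This is a simplified illustration, not the inventor's protocol: EcoRV (which cuts GAT|ATC to leave blunt ends) and the T7 promoter are assumed examples, the clone sequence and function names are hypothetical, the blunting step (b) is unnecessary for a blunt cutter, and the amplification step (d) is omitted.

```python
T7_PROMOTER = "TAATACGACTCACTATAGGG"  # assumed example of a phage promoter
T7_START = 17  # index within the promoter of the first transcribed base


def digest_blunt(seq: str, site: str = "GATATC", cut_offset: int = 3) -> list:
    """Cut `seq` at every occurrence of a blunt-cutting site.

    The defaults model EcoRV, an assumed choice of enzyme.
    """
    fragments, start = [], 0
    pos = seq.find(site)
    while pos != -1:
        fragments.append(seq[start:pos + cut_offset])
        start = pos + cut_offset
        pos = seq.find(site, pos + 1)
    fragments.append(seq[start:])
    return fragments


def ligate_promoter(fragment: str) -> str:
    """Blunt-end ligation of the promoter to the 5' end of a fragment."""
    return T7_PROMOTER + fragment


def transcribe(dna: str) -> str:
    """RNA read from the promoter's start site to the end of the fragment.

    Assumes `dna` begins with the promoter, as produced by ligate_promoter.
    """
    return dna[T7_START:].replace("T", "U")


# Hypothetical clone insert containing two EcoRV sites.
clone = "AACCGGTTGATATCCCATGGCATTAGGATATCTTGGCCAA"

fragments = digest_blunt(clone)               # step (a)
internal = fragments[1]                       # fragment with no known ends
ligated = ligate_promoter(internal)           # step (c)
transcript = transcribe(ligated)              # step (e)

print(fragments)
print(transcript)
```

The point of the sketch is that the internal fragment acquires a transcription start site without being subcloned, so its sequence can be read directly from the transcript in step (f).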
The subject invention also provides a method of determining the nucleotide sequence of sequences present within a nucleic acid molecule which are adjacent to areas of known sequence which comprises:
a) cleaving the nucleic acid molecule adjacent to the sequences of interest under conditions so as to generate fragments of the nucleic acid molecule which contain the sequences of interest;
b) if the fragments of the nucleic acid molecule do not have blunt ends, treating the fragments of the nucleic acid molecule so as to generate blunt ends;
c) contacting the fragments containing the sequences of interest obtained in step (a) or (b) with an oligonucleotide containing two different promoter sequences adjacent to each other by blunt end ligation under conditions such that the promoter sequence binds adjacent to the sequence of interest and it is unlikely that the fragment will bind a promoter at both ends;
d) transcribing the fragments containing the sequences of interest and promoter sequence obtained in step (c) using a polymerase specific to the 5′ promoter sequence;
e) degrading or removing the fragments which were generated in steps (a) and (b);
f) synthesizing a nucleic acid sequence complementary to the first sequence to be determined using a downstream primer specific for the known sequence adjacent to the first sequence to be determined;
g) amplifying the amount of fragments containing the sequence to be determined using a downstream primer specific for the known sequence adjacent to the second sequence to be determined and an upstream primer specific for the second promoter sequence;
h) transcribing the fragments containing the sequence of interest using a polymerase specific to the second promoter sequence;
i) sequencing using a downstream primer specific for the third known sequence.