1. Field of the Invention
The present invention relates to recombinant DNA technology and in particular to a method of detecting frameshift mutations or assuring an in-frame coding sequence in a nucleic acid sequence. The present invention also provides a vector for use in the method.
2. Description of Related Art
Changes in the reading frame of a gene caused by additions or deletions of nucleotides, i.e., frameshifts (also referred to as out-of-reading-frame mutations), generally lead to termination of translation and/or formation of truncation products, often through generation of new stop codons. For example, cystic fibrosis, Duchenne muscular dystrophy, fragile X, Huntington's disease, Alzheimer's disease (Hardy and Duff, 1993), Ataxia Telangiectasia, Marfan syndrome, neurofibromatosis, familial adenomatous polyposis coli (Varesco et al., 1993) and osteogenesis imperfecta are diseases that can result from a frameshift mutation in a particular susceptibility gene.
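The effect described above can be illustrated with a minimal sketch, offered only as an aid to understanding and not as part of the described method. It assumes the standard genetic code; the function name `translate` and both example sequences are hypothetical and were chosen so that a single inserted nucleotide creates a premature stop codon.

```python
# Standard genetic code in TCAG order; "*" marks a stop codon.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(seq):
    """Translate a DNA coding strand from position 0 until the first stop codon."""
    peptide = []
    for i in range(0, len(seq) - 2, 3):
        amino_acid = CODON_TABLE[seq[i:i + 3]]
        if amino_acid == "*":
            break  # translation terminates at a stop codon
        peptide.append(amino_acid)
    return "".join(peptide)


# Hypothetical in-frame sequence: ATG GCT GAA GCT TGG AAA CTG
wild_type = "ATGGCTGAAGCTTGGAAACTG"

# The same sequence with one "C" inserted after the start codon.
# The shifted frame now reads ATG CGC TGA ..., and TGA is a stop codon.
frameshift = wild_type[:3] + "C" + wild_type[3:]

print(translate(wild_type))   # full-length product
print(translate(frameshift))  # truncated product
```

In this example the single-nucleotide insertion changes every downstream codon and yields a truncation product two residues long.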
Identification of such mutations can be undertaken utilizing RFLP, in situ hybridization, Southern blotting, single strand conformational polymorphism, PCR amplification and DNA-chip analysis using specific primers (Kawasaki, 1990; Kahn et al., 1995; Lichter et al., 1990; Marwood et al., 1995; Orita et al., 1989; Fodor et al., 1993; Pease et al., 1994; U.S. Pat. No. 5,545,531; PCT application WO98/28444). The methods now available fall into six classes: electrophoretic mobility alteration methods such as single-strand conformational polymorphism (SSCP); restriction enzyme fingerprinting (REF); mismatch cleavage methods; mismatch recognition methods; direct sequencing methods; and protein truncation tests (see Genome Analysis Volume 2, pages 288-289 for a more detailed listing).
In general, these methods require sophisticated and expensive equipment and, in several instances, require that the frameshift mutations be known so that appropriate primers or chip sequences can be prepared. Additionally, most of these methods require amplification by PCR, with the inherent problems of PCR as discussed herein below. Direct sequencing does not require that the frameshift mutation be known but does generally require automated sequencing equipment and skilled technical support. Further, detection of systemic errors is also needed (Fichant and Quentin, 1995; Claverie, 1993). These methods do not lend themselves to rapid, inexpensive screening or scanning, particularly for new out-of-reading-frame mutations. Additionally, several of these techniques, such as SSCP, tend to provide an unacceptably high false positive rate. For example, a polymorphism that causes what can be termed a neutral change in the DNA code, without pathogenic consequences, will be identified even though it is not a frameshift.
The Varesco et al., 1993 reference provides a vector system to detect specific frameshift mutations in the APC gene consisting of a promoter, an out-of-frame insert and β-galactosidase as the reporter gene. However, this method is probably limited to screening for known mutations in familial adenomatous polyposis coli (APC) due to the construction of the vector. Further, the detection of β-galactosidase as a reporter gene under the selected promoter requires subjective qualitative differentiation, so that even with a frameshift in place it is possible to obtain false results. Further, as indicated in the reference, there is no consistency between plates. Additionally, the presence of a frameshift produces an intermediate color which can be difficult to score.
Therefore, a method is needed for rapidly and reliably detecting out-of-reading-frame mutations such as deletions, missense, nonsense and stop codons in nucleic acid sequences that does not suffer from the limitations of the methods described above. It would be useful to have a rapid, simple and inexpensive method for positively selecting, from among samples, those containing coding sequences which have a correct reading frame, so that susceptibility genes can be screened for both known and de novo mutations.
Polymerase chain reaction (PCR)-based approaches are becoming increasingly important for the identification of members of extended multigene families as well as homologous gene structures present in phylogenetically divergent species (Bozdech et al., 1996; Kim et al., 1996; Rast et al., 1994; Yoshihara et al., 1994). Many of these approaches rely on the use of highly degenerate primers and/or reduced priming stringencies that can generate a broad range of products, including significant numbers of amplification products that contain frameshift(s) and/or termination codon(s), which are referred to as amplification artifacts. Recently, Applicants introduced the use of short, minimally degenerate primers complementing conserved structural motifs for PCR amplification of homologs of antigen binding receptor genes in phylogenetically diverse species (Partula et al., 1995; Rast et al., 1994, 1995, 1997). This approach is also associated with the generation of amplification artifacts that require DNA sequencing to be distinguished from products that warrant further study. PCR is routinely used to analyze DNA sequences. However, in addition to the problems associated with the techniques listed herein, amplification artifacts are sometimes found in PCR products. These artifacts usually result from errant priming of non-coding sequences which have multiple stop codons, but PCR can also change an open reading frame (in-frame) to an out-of-frame sequence or the converse. It would therefore be useful to have a method to rapidly screen PCR products to ensure open reading frame continuity, i.e., that the PCR amplification has not introduced these types of errors. Direct sequencing of the PCR products can be undertaken to determine this, but it would be useful to have a more rapid, less expensive screening method.
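What "open reading frame continuity" means for an amplification product can be sketched as follows. This is a hypothetical illustration only: it assumes the product sequence is already known, which is exactly the sequencing step a rapid screening method would seek to avoid, and it examines only the three forward frames. The function name `open_frames` and both example sequences are invented for the illustration.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def open_frames(seq):
    """Return the forward frame offsets (0, 1, 2) that contain no stop codon."""
    seq = seq.upper()
    frames = []
    for offset in range(3):
        # Split the sequence into complete codons starting at this offset.
        codons = (seq[i:i + 3] for i in range(offset, len(seq) - 2, 3))
        if not any(codon in STOP_CODONS for codon in codons):
            frames.append(offset)
    return frames


# Hypothetical product with at least one frame free of stop codons.
candidate = "GCTGAAGCTTGGAAACTG"

# Hypothetical artifact carrying a stop codon in every forward frame,
# as might arise from errant priming of non-coding sequence.
artifact = "TAAATAGATGA"

print(open_frames(candidate))
print(open_frames(artifact))
```

A product for which no frame is open would be classed as an amplification artifact under this simplified criterion, whereas a product with an open frame would warrant further study.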