1. Field of the Invention
The present invention relates to nucleic acid clamps that are capable of binding to nucleic acids with high sequence specificity and affinity. The invention also relates to methods of using nucleic acid clamps for the protection and for the selective cleavage of nucleic acids, for the regulation of gene expression and for the treatment of disorders.
2. Description of the Background
Although the structure of the nucleic acid molecule has been known for over 40 years, nucleic acid sequencing has only recently become routine. One of the most important discoveries, which made nucleic acid sequencing practical, was the development of techniques for sequence specific cleavage of nucleic acids. In fact, many molecular biology techniques, including cloning, sequencing, polymorphic loci analysis and restriction profiling, involve nucleic acid cleavage as a required part of the overall protocol.
The most common sequence specific nucleic acid cleavage technique in use today is restriction endonuclease digestion. Restriction endonucleases are one part of a multi-enzyme restriction-modification system (R-M system) used by bacteria to defend themselves against foreign nucleic acids. This system protects the organism by digesting foreign nucleic acids found within the cell. Entry of foreign nucleic acid is often the consequence of viral infection or phagocytosis. Restriction endonucleases recognize, bind and cleave a specific sequence, the recognition sequence, of four to eight nucleotide base pairs of double-stranded DNA. Bacterial genomes can be protected from digestion by the specific methylation of recognition sequences, for example, at adenine (A) or cytosine (C) residues by bacterial methylases. Recognition sequences in foreign DNA are generally not methylated and, therefore, can be efficiently cleaved.
The utility of restriction endonucleases to molecular biology has prompted a massive hunt for additional members of this enzyme family. As a result, more than 10,000 prokaryotes have been screened and around 2500 enzymes have been found exhibiting more than 200 distinct recognition sequences (Fasman, ed., CRC Practical Handbook of Biochemistry and Molecular Biology, CRC Press, Cleveland Ohio, 1990). Virtually every group of prokaryotes and at least one type of virus contains members that include a restriction endonuclease system.
The restriction-modification system, also referred to as the host-specificity (HS) system, modifies a bacterium's own DNA in a characteristic pattern and degrades or restricts those DNA molecules which lack that distinctive pattern. Degradation of unmodified foreign DNA effectively prevents infection of bacteria by episomes and viruses. Not surprisingly, all restriction-modification systems have at least two features in common, a restriction activity and a modification activity. Modification typically involves methylation of the 6-amino group of adenine residues or of the N4 or C5 position of cytosine residues. Sequences comprising the methylated bases are protected from cleavage by the restriction endonuclease. As the methylase inactivates recognition sites, but not the nuclease, protection of DNA requires methylation of every recognition sequence in a host genome. Some restriction endonucleases are not inhibited by methylation whereas others require methylation for cleavage. Such enzymes protect cells by cleaving foreign DNA which is already methylated.
Restriction activity is typically carried out by one of at least three types of restriction endonucleases, referred to as type I, type II and type III endonucleases. Type I and type III restriction endonucleases contain methylase and nuclease activities in one polypeptide whereas type II restriction endonucleases have no intrinsic associated methylation activity. For type II systems, methylation activity resides in a distinct polypeptide which generally has the same sequence specificity as the endonuclease. Type II restriction endonucleases recognize a specific and symmetrical cleavage sequence of four to eight nucleotides and generally cut within this recognition sequence. The probability that a given restriction endonuclease will cleave at any one position in a random sequence is approximately one divided by four to the power of the length of the recognition sequence. Thus, a restriction endonuclease with a recognition site of 4, 5, 6, 7 or 8 base pairs will cleave on the average once every 256, 1024, 4096, 16384 or 65536 base pairs, respectively.
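The relationship between recognition site length and average cutting frequency can be illustrated with a short calculation. The following is a minimal sketch, assuming each of the four bases is equally likely and independent at every position, which is an idealization of real genomes; the function name is hypothetical.

```python
def average_cut_spacing(site_length):
    """Average number of base pairs between cuts for a fully specified site.

    Assumption: uniform, independent base composition (an idealized model).
    """
    return 4 ** site_length


for n in range(4, 9):
    spacing = average_cut_spacing(n)
    # The per-position cleavage probability is the reciprocal of the spacing.
    print(f"{n} bp site: about one cut per {spacing} bp "
          f"(probability {1 / spacing:.2e} per position)")
```

Under this model the values for sites of 4 to 8 base pairs reproduce the figures given above.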
Restriction endonucleases generate double-stranded breaks in a DNA molecule. Typically, this will lead to cell death when repair of the break cannot take place. Unlike type I and type III endonucleases, type II restriction endonucleases are found in virtually every group of prokaryotes. Of the more than 400 type II restriction endonucleases discovered, most have recognition sites of about six base pairs in length. Only a few restriction enzymes have a recognition site of eight base pairs.
DNA methylases can be used to alter the apparent recognition specificity of restriction endonucleases. Unique cleavage specificities may be created in the laboratory by methylating DNA sequences which overlap the recognition site of a restriction endonuclease. These modified sequences are resistant to cleavage by restriction endonucleases. If the recognition sequence of a restriction endonuclease is degenerate, a methylase may be used to modify a subset of the recognition sequence. For example, Hinc II nuclease recognizes the sequences GTCGAC, GTCAAC, GTTGAC and GTTAAC. Taq I methylase methylates the sequence TCGA at the A residue. If DNA is initially methylated with Taq I methylase, those Hinc II recognition sequences containing TCGA, namely GTCGAC, will be resistant to subsequent cleavage by Hinc II.
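The Hinc II example can be checked by enumerating the degenerate site and testing each variant for an embedded Taq I methylase site. This is only an illustrative sketch: the compact pattern GTYRAC (Y = C or T, R = A or G) is shorthand introduced here for the four sequences listed above, and the helper name is hypothetical.

```python
from itertools import product

# IUPAC codes needed for this example only.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "Y": "CT", "R": "AG"}


def expand(pattern):
    """List every concrete sequence matched by a degenerate pattern."""
    return ["".join(bases) for bases in product(*(IUPAC[c] for c in pattern))]


HINC_II = "GTYRAC"             # shorthand for GTCGAC, GTCAAC, GTTGAC, GTTAAC
TAQ_I_METHYLASE_SITE = "TCGA"  # methylated at the A residue, per the text

sites = expand(HINC_II)
# Variants containing TCGA are methylated and resist Hinc II.
resistant = [s for s in sites if TAQ_I_METHYLASE_SITE in s]
# The remaining variants are still cut by Hinc II.
cleavable = [s for s in sites if TAQ_I_METHYLASE_SITE not in s]

print("resistant after Taq I methylation:", resistant)
print("still cleavable by Hinc II:", cleavable)
```

Only GTCGAC contains TCGA, so prior methylation narrows the effective Hinc II specificity to the other three sequences.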
Another type of methylation modification occurs at the boundaries of the recognition sequence of a restriction endonuclease and a methylase. For example, a Bam HI site (GGATCC) followed by GG or preceded by CC (i.e., GGATCCGG or CCGGATCC) is a Bam HI site which overlaps a Msp I methylase site (CCGG). Treatment of DNA with Msp I methylase followed by Bam HI endonuclease results in cleavage of all Bam HI sites except GGATCCGG and CCGGATCC. A third method to alter restriction cleavage sites involves the use of a restriction endonuclease which only cleaves methylated sites. For example, DNA is first methylated with Taq I methylase, which has a recognition site of TCGA. A DNA sequence comprising two or more concatenated Taq I sites (TCGATCGA) will form a Dpn I site (GATC) at the junction. Dpn I is specific only for methylated sites and, thus, yields an effective cleavage specificity of two or more Taq I sites.
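The boundary overlap described above can be sketched as a simple string search. The code below is a simplified model, not an enzymatic simulation: it assumes that any overlap between a Bam HI site and a Msp I methylase site blocks cleavage, which matches the two examples in the text but ignores which base is actually methylated. All function names and the test sequence are hypothetical.

```python
def find_sites(seq, site):
    """Return the start index of every (possibly overlapping) occurrence."""
    hits, start = [], seq.find(site)
    while start != -1:
        hits.append(start)
        start = seq.find(site, start + 1)
    return hits


def overlaps(start_a, len_a, start_b, len_b):
    """True if the two intervals share at least one position."""
    return start_a < start_b + len_b and start_b < start_a + len_a


def cleavable_bam_hi_sites(seq):
    """Bam HI sites that do not overlap any Msp I methylase site.

    Assumption: any overlap with a methylated CCGG blocks Bam HI.
    """
    bam, msp = "GGATCC", "CCGG"
    msp_hits = find_sites(seq, msp)
    return [b for b in find_sites(seq, bam)
            if not any(overlaps(b, len(bam), m, len(msp)) for m in msp_hits)]


# One isolated Bam HI site, then GGATCCGG, then CCGGATCC.
example = "AAGGATCCAATTGGATCCGGTTCCGGATCC"
print(find_sites(example, "GGATCC"))    # all Bam HI sites
print(cleavable_bam_hi_sites(example))  # sites left after Msp I methylation

# Two concatenated Taq I sites create a GATC (Dpn I) site at the junction.
print(find_sites("TCGATCGA", "GATC"))
```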
Another method capable of altering the specificity of a restriction endonuclease involves the use of PNA, also referred to as peptide nucleic acid or polyamide nucleic acid analog. PNAs are synthetic nucleic acid analogs which can hybridize to form double-stranded structures in a fashion similar to natural nucleic acids. When a PNA strand contains only thymine and cytosine, it can hybridize to homopurine DNA to form a double-stranded structure. A second PNA strand can hybridize to this double strand to form a triple-stranded structure.
Lacking charge and directionality, PNA can bind to DNA in either the sense or the antisense orientation. PNA₂-DNA or PNA₂-RNA has sufficient stability to specifically interfere with DNA-recognizing proteins such as methylases, endonucleases, polymerases and transcription factors. PNA may be used to block specific sites on DNA during restriction methylation. After methylation the PNA may be removed and the previously protected sites may be digested with restriction endonuclease. Unprotected sites, methylated by the methylase, will be resistant to attack by the nuclease.
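The order of operations in PNA-assisted cleavage can be summarized with a small bookkeeping sketch. It assumes idealized, complete reactions (every unshielded site is methylated and every unmethylated site is cut); the function name and the site positions are hypothetical.

```python
def pna_assisted_cleavage(site_positions, pna_blocked):
    """Return (cleaved, resistant) site positions after the three steps.

    1. PNA binds and shields the sites listed in `pna_blocked`.
    2. The methylase modifies every site that is not shielded.
    3. PNA is removed and the endonuclease cuts only unmethylated sites.
    Assumption: each step goes to completion.
    """
    all_sites = set(site_positions)
    shielded = all_sites & set(pna_blocked)
    methylated = all_sites - shielded
    cleaved = sorted(all_sites - methylated)
    resistant = sorted(methylated)
    return cleaved, resistant


# Three recognition sites, one of which is covered by PNA.
cleaved, resistant = pna_assisted_cleavage([100, 2500, 9000], [2500])
print("cleaved:", cleaved)
print("resistant:", resistant)
```

The endonuclease therefore cuts only where PNA was bound, so the effective specificity is set by the PNA target sequence rather than by the recognition site alone.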
Intronic nucleases, while capable of cutting DNA with a reduced frequency, are not useful because their choice of cleavage sites is difficult to predict. Some genes of mitochondrial, chloroplast and nuclear DNA and of the T-even bacteriophages contain introns that encode endonucleases. The highly degenerate recognition sites for these endonucleases have not been precisely determined. However, the effective cutting frequency is typically one cut per about one million to about 16 million base pairs. The considerable degeneracy of intronic nuclease cleavage sites reduces their general utility for many techniques in molecular biology.
Recent advances in molecular techniques, such as pulsed field electrophoresis and yeast artificial chromosomes, have made the analysis of DNA molecules of up to several million base pairs possible. The ability to physically manipulate large DNA molecules has created new opportunities for molecular biology and a new demand for restriction endonucleases with a cutting frequency in the range of about one in a million to one in a billion or more.
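To put these cutting frequencies in terms of recognition length, the spacing calculation can be inverted. This is a rough sketch under the same idealized assumption of uniform, independent base composition; the function name is hypothetical.

```python
def site_length_for_spacing(target_spacing_bp):
    """Smallest fully specified site length n with 4**n >= target spacing.

    Integer arithmetic is used to avoid floating-point rounding near
    exact powers of four.
    """
    n = 0
    while 4 ** n < target_spacing_bp:
        n += 1
    return n


for spacing in (1_000_000, 1_000_000_000):
    n = site_length_for_spacing(spacing)
    print(f"one cut per {spacing} bp needs a site of about {n} bp "
          f"(4**{n} = {4 ** n})")
```

Under this model, a cutting frequency of one in a million to one in a billion corresponds to a fully specified recognition sequence of roughly 10 to 15 base pairs, well beyond the eight base pair maximum noted above for known restriction enzymes.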