The ability to introduce genes into the germ line of organisms, for example mammals, is of great interest in biology. The propensity of mammalian cells to take up exogenously added DNA and to express genes included in the DNA has been known for many years. The results of gene manipulation are inherited by the offspring of these animals. All cells of these offspring inherit the introduced gene as part of their genetic make-up. Such animals are said to be transgenic.
Transgenic mammals have provided a means for studying gene regulation during embryogenesis and in differentiation, for studying the action of genes, and for studying the intricate interaction of cells in the immune system. The whole animal is the ultimate assay system for manipulated genes, which direct complex biological processes.
Transgenic animals can provide a general assay for functionally dissecting DNA sequences responsible for tissue-specific or developmental regulation of a variety of genes. In addition, transgenic animals provide useful vehicles for expressing recombinant proteins and for generating precise animal models of human genetic disorders.
For a general discussion of gene cloning and expression in animals and animal cells, see Sambrook et al., Molecular Cloning: A Laboratory Manual, 3rd ed., Cold Spring Harbor Laboratory Press, 2001, and Green et al., Genome Analysis: A Laboratory Manual, Cold Spring Harbor Laboratory Press, 1997.
Transgenic lines that have a predisposition to specific diseases and genetic disorders are of great value in the investigation of the events leading to these states. It is well known that the efficacy of treatment of a genetic disorder may depend on identification of the gene defect that is the primary cause of the disorder. The discovery of effective treatments can be expedited by providing an animal model that develops the disease or disorder, which enables the study of the efficacy, safety, and mode of action of treatment protocols, such as genetic recombination.
Homologous recombination (HR) between chromosomal and exogenous DNA is the basis of methods for introducing genetic changes into the genome (Capecchi, Science 244: 1288-1292, 1989; Smithies et al., Nature 317: 230-234, 1985). Parameters of the recombination mechanism have been determined by studying plasmid sequences introduced into cells (Bernstein et al., Mol. Cell. Biol. 12: 360-367, 1992; Brenner et al., Proc. Natl. Acad. Sci. USA 83: 1762-1766, 1986; Lin et al., Mol. Cell. Biol. 10: 113-119, 1990; Lin et al., Mol. Cell. Biol. 10: 103-112, 1990) and in an in vitro system (Jessberger and Berg, Mol. Cell. Biol. 11: 445-457, 1991). HR is promoted by double-strand breaks in DNA.
Among endonucleases, the Saccharomyces cerevisiae mitochondrial endonuclease I-Sce I (Jacquier and Dujon, Cell 41: 383-394, 1985) has characteristics that can be exploited as a tool for cleaving a specific chromosomal target and, therefore, for manipulating the chromosome in living organisms (U.S. Pat. No. 5,474,896). The I-Sce I protein is an endonuclease responsible for intron homing in mitochondria of yeast, a non-reciprocal mechanism by which a predetermined sequence becomes inserted at a predetermined site. It has been established that endonuclease I-Sce I can catalyze recombination in the nucleus of yeast by initiating a double-strand break (Plessis et al., Genetics 130: 451-460, 1992). The recognition site of endonuclease I-Sce I is 18 bp long; therefore, the I-Sce I protein is a very rare-cutting restriction endonuclease in genomes (Thierry et al., Nucleic Acids Res. 19: 189-190, 1991). In addition, as the I-Sce I protein is not a recombinase, its potential for chromosome engineering is larger than that of systems with target site requirements on both host and donor molecules (Kilby et al., Reviews 9: 413-421, 1993).
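The rarity of an 18 bp site can be illustrated with a back-of-the-envelope estimate. The following is a minimal sketch, not part of the cited work: it assumes a random sequence in which the four bases are independent and equally frequent, and it uses an illustrative genome size of 3 × 10^9 bp as a stand-in for a mammalian genome. The function name `expected_site_count` is hypothetical.

```python
def expected_site_count(genome_bp: int, site_bp: int, strands: int = 2) -> float:
    """Expected number of exact matches to one fixed recognition site.

    Assumption (not from the source): bases are independent and each of
    A, C, G, T occurs with probability 1/4, so a given site of length
    site_bp matches at any one position with probability 1 / 4**site_bp.
    """
    if site_bp <= 0 or genome_bp < site_bp:
        return 0.0
    # Number of windows of length site_bp in a linear sequence.
    positions = genome_bp - site_bp + 1
    # strands=2 counts the site on either strand (non-palindromic site).
    return strands * positions / 4 ** site_bp


if __name__ == "__main__":
    genome = 3_000_000_000  # illustrative genome size, an assumption
    for site_len in (6, 18):
        count = expected_site_count(genome, site_len)
        print(f"{site_len} bp site: about {count:.3g} expected occurrences")
```

Under these simplifying assumptions, a 6 bp site is expected on the order of a million times in such a genome, whereas an 18 bp site is expected less than once. Real genomes are not random sequences, so the numbers indicate scale only.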
The yeast I-Sce I endonuclease can efficiently induce double-strand breaks in a chromosomal target in mammalian cells, and the breaks can be repaired using a donor molecule that shares homology with the regions flanking the break, resulting in site-specific recombination, gene replacement, or insertion (U.S. Pat. No. 5,474,896). The enzyme catalyzes recombination at a high efficiency. This demonstrates that recombination between chromosomal DNA and exogenous DNA can occur in mammalian cells by the double-strand break repair pathway (Szostak et al., Cell 33: 25-35, 1983).
I-SceI has been used for many different applications. Such applications have involved the study of double-stranded breaks, the investigation of chromosome structure, the study of transposition, inducing gene replacement in mammalian and bacterial cells, gene targeting by homologous recombination in Drosophila, and the production of chromosomal breaks in plants. Anglana and Bacchetti, Nucl. Acids Res. 27: 4276-4281, 1999; Bellaiche et al., Genetics 152: 1037-1044, 1999; Choulika et al., CR Acad. Sci. III 317: 1013-1019, 1994; Choulika et al., Mol. Cell. Biol. 15: 1968-1973, 1994; Cohen-Tannoudji et al., Mol. Cell. Biol. 18: 1444-1448, 1998; Liang et al. and Garrard, Methods 17: 95-103, 1999; Machida et al., Proc. Natl. Acad. Sci. USA 94: 8675-8680, 1997; Melkerson-Watson et al., Infect. Immun. 69: 5933-5942, 2000; Mogila et al., Methods Mol. Biol. 113: 439-445, 1999; Monteilhet et al., Nucl. Acids Res. 18: 1407-1413; Nahon and Raveh, Adv. Exp. Med. Biol. 451: 411-414, 1998; Neuveglise et al., Gene 213: 37-46, 1998; Nicolas et al., Virology 266: 211-224, 2000; Perrin et al., Embo J. 12: 2939-2947, 1993; Posfai et al., Nucl. Acids Res. 27: 4409-4415; Puchta, Methods Mol. Biol. 113: 447-451, 1999; Rong et al., Science 288: 2013-2018; Thierry et al., Nucl. Acids Res. 19: 189-190; and A. Plessis et al., Genetics 130: 451-460, 1992.
Group I introns are widespread in many evolutionary phyla because of their efficient propagation mechanism. Some of them encode homing endonucleases, which recognize the intron insertion site in an intronless cognate DNA sequence and introduce double-strand breaks in the DNA near that site. Afterwards, the intron-containing gene acts as a template for the repair of the cleaved recipient allele in a gene conversion process, which leads to the duplication of the intervening sequence (1-4). In contrast to group I intron homing, group II intron mobility is based on a retrohoming mechanism promoted by the intron-encoded protein bound to the intron lariat, forming a ribonucleoprotein (RNP) particle. The RNP particle mediates intron integration into the DNA target site by reverse splicing and reverse transcription of the intron RNA (5). In addition, the protein component is endowed with endonucleolytic activity, which cuts the antisense strand. After the RNA is positioned on the DNA, it integrates into the sense strand before the antisense strand is cleaved by the protein component (6). Thus, both the intronic RNA and the protein component of the RNP particle are involved in the recognition of the intron target site. The protein component is also essential for DNA unwinding (7).
Homing is not restricted to group I and group II introns. Some DNA sequences encoding inteins, polypeptides that are posttranslationally removed, propagate in the same manner described for group I introns. Inteins contain endonucleases of the LAGLIDADG (SEQ ID NO: 17) family or of the H-N-H family (3, 8-12). It is likely that these enzymes have evolved by invasion of an endonuclease gene into a preexisting intein carrying the protein splicing activity (13). Structural examinations of crystals of the intein endonucleases PI-SceI and PI-PfuI strongly suggest that, in contrast to group I intron endonucleases, they use an additional DNA-binding domain to enhance their specificity. In PI-SceI, the DNA recognition region (DRR) establishes specific substrate contacts about two helical turns distant from the cleavage site (14), while in PI-PfuI the stirrup domain fulfills the same purpose (15).
LAGLIDADG (SEQ ID NO: 17) homing endonucleases produce 4 bp 3′-OH overhangs near the intron insertion site (16-18). Conditions for optimal activity depend on the enzyme. For example, I-SceII prepared from mitochondria (19) prefers temperatures around 30° C. and neutral pH, whereas I-DmoI (18,20) prefers temperatures around 70° C. and alkaline pH values. Unlike bacterial Type II restriction enzymes, homing endonucleases must have a very high recognition sequence specificity to exclude noxious effects on the host genome, because no cognate modification system exists. Therefore, their recognition sites are much longer (14-30 bp, up to 40 bp for some intein-encoded endonucleases). As has been shown for the crystallized enzymes I-CreI (21), PI-SceI (22), I-DmoI (23) and the His-Cys box homing endonuclease I-PpoI (24), intron-encoded homing endonucleases rely on β-sheets to make their contacts with the DNA major groove. Hence their profile is very flat and they cover a wide area on the DNA (23,25), whereas the globular restriction endonucleases (26) usually interact with the target sequence via side chains from their α-helices (4). Known homing endonucleases were classified into four families depending on the occurrence of consensus motifs (LAGLIDADG (SEQ ID NO: 17), GIY-YIG, H-N-H and His-Cys box). The latter two groups are now classified on a structural basis into a single group, the ββα-Me group (27). In contrast, the bacterial type II restriction enzymes are more divergent. The endonucleases belonging to the LAGLIDADG (SEQ ID NO: 17) protein family are the most common representatives. The main characteristic of this class is a dodecapeptide motif, which occurs one or two times in the protein.
Endonucleases with one motif bind their substrate as homodimers, whereas the enzymes with two LAGLIDADG (SEQ ID NO: 17) motifs tend to act as monomers. Exceptions are I-SceII, encoded by intron αI4α of the cox1 gene in S. cerevisiae, and I-SceIV from intron cox1/5a of the same organism. I-SceII possesses two dodecapeptide motifs but is active as a homodimer (19). I-SceIV acts as a heterodimer (28). It was assumed that two-domain enzymes like I-SceI (29) or I-DmoI (18) arose from one-domain homing endonucleases like I-CreI (17) and I-CeuI (30) by a gene duplication event (3, 4, 21, 23, 31).
Some proteins with two LAGLIDADG (SEQ ID NO: 17) motifs are involved in splicing of their intron RNA. They are termed maturases (32,33). Maturases act as cofactors and stabilize the catalytic core of the intronic RNA structure for the splicing event (34,35). Some dodecapeptide endonucleases also bear a latent maturase activity, which can be revealed by mutation of a few amino acids (36-38). Only a few of them display both activities simultaneously, as was reported for I-AniI (39,40) and I-ScaI (41-43).
In the mitochondrial cox1 gene of Schizosaccharomyces pombe, up to four group I introns have been found (44). Two of them contain open reading frames encoding proteins of the dodecapeptide family (45, 46).
In summary, there exists a need in the art for reagents and methods for providing transgenic animal models of human diseases and genetic disorders. The reagents can be based on a restriction enzyme, especially one with high specificity, its corresponding restriction site, and the gene encoding this enzyme. In particular, there exists a need for reagents and methods for replacing a natural gene, or a fragment thereof, with another gene or gene fragment that is capable of alleviating the disease or that, by modifying the cell or animal, offers molecular tools to study such diseases.