Gene targeting generally refers to the directed alteration of a specific DNA sequence in its genomic locus in vivo. This may involve the transfer of genetic information from a nucleic acid molecule, which may be referred to as a gene targeting substrate, to a specific target locus in the host cell genome. In current methods, the gene targeting substrate usually exists as an extrachromosomal nucleic acid molecule. The target locus may be present in the host cell's nuclear chromosomes or organellar chromosomes (e.g. mitochondria or plastids) or a cellular episome. The gene targeting substrate typically encodes sequences homologous to the target locus. However, the sequence of the gene targeting substrate is modified to encode changed genetic information, vis-a-vis the target genetic locus, through the insertion or deletion of one or more base pairs or by the substitution of one or more bases for other types of bases. As a result, the gene targeting substrate may encode, for example, a different gene product than the target locus or a nucleic acid sequence which is non-functional or that functions differently than the nucleic acid sequence encoded by the target locus.
The process of gene targeting may involve the action of host nucleic acid recombination and repair functions. The homology between the target locus and the gene targeting substrate, in combination with host cell functions, is thought to facilitate the process of the gene targeting substrate “scanning” the host genome to find and associate with the target locus. Host nucleic acid recombination and repair functions may then act to transfer genetic information from the gene targeting substrate to the target locus by the processes of homologous recombination or gene conversion. In this manner, the novel sequence of the gene targeting substrate is transferred into the host genome at the targeted locus, which may result in loss of the wild-type genetic information at this locus. The modified target locus may now be stably inherited through cell divisions and, if present in germ cells and gametes, to subsequent progeny resulting from sexual reproduction.
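The transfer of information from a gene targeting substrate to its target locus can be illustrated with a deliberately simplified string model. The sketch below is not a model of any actual recombination mechanism. It assumes the substrate consists of two flanking regions of exact homology (here called `left_arm` and `right_arm`) surrounding the changed sequence (`payload`). The function name, parameter names and sequences are hypothetical and chosen only for illustration.

```python
def apply_gene_targeting(genome: str, left_arm: str, payload: str, right_arm: str) -> str:
    """Toy model: replace the segment between two homology arms with the payload.

    Assumes exact-match homology and uses the first occurrence of each arm;
    real homology search and repair are far more complex.
    """
    start = genome.find(left_arm)
    if start == -1:
        raise ValueError("left homology arm not found in genome")
    inner_start = start + len(left_arm)

    # Look for the right arm only downstream of the left arm.
    inner_end = genome.find(right_arm, inner_start)
    if inner_end == -1:
        raise ValueError("right homology arm not found downstream of left arm")

    # Keep everything up to and including the left arm, insert the payload,
    # then keep everything from the right arm onward.
    return genome[:inner_start] + payload + genome[inner_end:]


# Hypothetical sequences: the wild-type locus carries "GGG" between the arms,
# and the substrate carries "GAG" (a single-base substitution).
LEFT, RIGHT = "ACGTAC", "CATGCA"
wild_type = "TTTT" + LEFT + "GGG" + RIGHT + "AAAA"
modified = apply_gene_targeting(wild_type, LEFT, "GAG", RIGHT)
print(modified)
```

In this toy model the wild-type information between the arms is lost and replaced by the substrate's sequence, mirroring the outcome described above, while sequences outside the homologous region are untouched.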
This ability to perform precise genetic modifications of a host cell's genome at defined loci is an extremely powerful technology for basic and applied biological research. A principal advantage of gene targeting over conventional transformation technologies, which result in integration of exogenously supplied DNA cassettes at random sites in the host genome, is the maintenance of appropriate chromosomal context for the modified gene. In contrast, transformational integration of DNA cassettes into random sites of the host genome can have large negative effects on the host cell, for example by causing insertional inactivation of the resident gene where the DNA cassette integrates. In addition, integration at random sites can affect expression of the introduced gene encoded by a cassette. Such ‘position effects’ may result from epigenetic control of gene expression relating to the regulation of chromatin conformation (Mlynarova, L, et al., 1996, Plant Cell 8, pp. 1589-1599). Thus transgenes which integrate at random sites in the genome may not be expressed in the correct fashion to accurately reflect the biological effect of the gene under basic study, or to provide the desired phenotype in a biotechnology application. Targeting of a transgene to its correct native site in the host genome may help to ensure correct epigenetic regulation of its expression.
Gene targeting may enable the accurate analysis of the phenotypic effects of modified genes by simultaneously replacing the endogenous gene copy. In contrast, placement of a transgene encoding a modified version of an endogenous gene at random sites in the genome may not enable accurate analysis of the effect of this transgene because the endogenous gene copy is still functioning. Expression of the endogenous gene copy may compensate for or impair the action of the gene product encoded by the transgene. Through gene targeting, the endogenous gene copy may be replaced by the introduced modified gene. As a result, the endogenous gene copy will not be able to interfere with the action of the introduced modified gene and an accurate interpretation of the biological effects of the modified gene may be possible. This ability is important for accurate assessment of gene function in basic studies, and is important for biotechnology applications aimed at modifying the physiological, biochemical or developmental paths and responses of cells and organisms.
A non-exclusive list of possible modifications, or combinations of modifications, that may be made to the host genome through gene targeting includes:
1. Gene replacement and gene addition: by replacing the targeted chromosomal gene or genes, or promoter or promoters, or portions of the aforementioned, with another gene or genes, or promoter or promoters, or portions of the aforementioned; or adding a gene or genes and regulatory components, or portions thereof, at a targeted chromosomal locus adjacent to resident endogenous loci.
2. Gene inactivation and gene deletion: Inactivating a targeted chromosomal gene through disruption of transcription or translation by changing the sequence composition or by inserting or deleting one or more base pairs of the gene sequence. Furthermore, the coding region or regulatory components, or portions thereof, of a targeted chromosomal gene or genes may be deleted as required.
Using gene targeting, an absolute inactivation of specified target genes may be possible by, for example, creating insertion, deletion or substitution mutations in the target genes. Thus the phenotypic effects of the gene may be assessed by studying the engineered null-mutant. This null-mutant may also be genetically stable in subsequent generations ensuring the continued propagation of this line maintaining the same engineered phenotype. The modified line may also be isogenic to the original cell line or organism from which it is derived thus enabling reliable and accurate comparisons between the modified and original lines so that the effects of the modification may be accurately determined. Targeted gene inactivation may therefore have advantages over conventional means of gene silencing, such as antisense RNA and cosuppression, which may not provide absolute inactivation of the target gene and/or may not cause a stable and consistent level of inactivation through generations.
3. Allele modification: Changing the sequence of a targeted chromosomal gene to create a new allele which encodes a protein with a changed amino acid composition (i.e. protein engineering), or which has modified translatability or stability of the transcript.
Gene targeting has been demonstrated in several species including lower eukaryotes, invertebrate animals, mammals, lower plants and higher plants. Gene targeting substrates include single-stranded DNA (ssDNA; Simon, J. R., Moore, P. D., 1987, Mol. Cell. Biol. 7, pp. 2329-2334), double-stranded DNA (dsDNA; Rothstein, R, 1991, Methods Enzymol. 194: 281-301), or hybrid molecules with RNA and DNA constituents. For some prior DNA-based gene targeting substrates, the amount of homology to the target locus present in the gene targeting substrate has varied from tens of base pairs (bp) to tens of kilobase pairs (kb; Yang, X W, et al., 1997, Nat. Biotechnol. 15, pp. 859-865), depending upon the nature of the target locus, the type of host cell or species, and the efficiency of homologous recombination functions in that host cell or species. For RNA/DNA hybrid gene targeting substrates, the homology in some cases has been tens of base pairs (for example see Zhu, T, 2000, Nat. Biotechnol. 18: 555-558; Beetham, P. R., 1999, Proc. Natl. Acad. Sci. U.S.A 96: 8774-8778).
Successful gene targeting has been achieved by treatment of cultured cells, tissues or organisms with gene targeting substrate. This has resulted in modified target loci which are stable through cell divisions. However, the frequency of these events is low. To obtain modified target loci stably transmissible through sexual reproduction in mammals, specialized procedures employing specific embryonic stem cell lines may be employed. In other animal systems, gene targeting substrates may be injected into gonads, or gene targeting substrate may be engineered to be present in the cells at early developmental stages to ensure modification of germ line cells. Conversely, in some plants the totipotency of all cells may enable nearly any modified cell line to be regenerated into intact plants capable of transmitting the modified locus to progeny.
Application of gene targeting methods, especially in plants and mammals, may be inhibited by several limitations in conventional technology, which may be technically demanding, rely on tedious and expensive in vitro procedures, or be successful only in specialized cell lines. These limitations may be compounded by a low frequency of gene targeting events which may not be easily identifiable. In some applications, only target loci which when modified result in selectable or easily screenable phenotypes may be employed, so that the rare gene targeting events may be identified.
Conventional gene targeting strategies may rely on incorporation of a selectable marker at the target locus resulting in insertional-inactivation mutants by interruption of the target gene with the selectable marker, an approach that may not enable more subtle modifications such as single base-pair changes. Current selection and enrichment procedures may also be ineffective if they select false-positives with high frequency.
A principal factor affecting the frequency of gene targeting with some conventional approaches may be the mechanism of delivering gene targeting substrate to the host cells. Current procedures typically produce a gene targeting substrate exogenously and rely on various means, including chemical treatments, physical treatments, or biological vehicles, to get the gene targeting substrate into the host cell and nucleus. Such methods require extensive screening since the frequency of modifying the target locus is low, and background levels of insertion at non-target loci are high. Methods have accordingly been proposed to address this perceived problem, such as the methods disclosed in U.S. Pat. No. 6,504,081 for transposon-mediated gene targeting, which purportedly enhance the insertion and detection of desired genes in genomic exons.
International Patent Publication WO02/062986, published 15 Aug. 2002, describes a replicative gene targeting system that renews or regenerates a gene targeting cassette using various mechanisms of DNA replication, to enable repeated cycles of gene targeting substrate production in vivo. As disclosed therein, successive rounds of gene targeting cassette replication may allow the accumulation of multiple molecules of gene targeting substrate per cell or nucleus, so that the presence of more gene targeting substrate may result in a higher frequency of gene targeting events to produce heritable changes in a target host sequence.
Retrons have been known for some time as a class of retroelement, first discovered in gram-negative bacteria such as Myxococcus xanthus, Stigmatella aurantiaca and Escherichia coli. Retrons mediate the synthesis in host cells of multicopy single-stranded DNAs (msDNA), which typically include a DNA component and an RNA component. The native msDNA molecules reportedly exist as single-stranded DNA-RNA hybrids, characterized by a structure which comprises a single-stranded DNA branching out of an internal guanosine residue of a single-stranded RNA molecule at a 2′,5′-phosphodiester linkage. Native retrons have been found to consist of the gene for reverse transcriptase (RT) and an msr-msd region under the control of a single promoter. The msd region typically codes for the DNA component of msDNA, and the msr region typically codes for the RNA component of msDNA. In some retrons, the msr and msd genes have overlapping 3′ ends, and are oriented opposite one another with a promoter located upstream of msr which transcribes through the msd-msr region. The msd-msr region generally contains two inverted repeat sequences, designated “a” and “b”, which together make up a stable stem structure in msDNAs. The single RNA transcript from the msr-msd region serves not only as a template for reverse transcription but, by virtue of its secondary structure, also serves as a primer for msDNA synthesis by a reverse transcriptase.
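The role of the inverted repeats can be illustrated with a short sketch. Two regions form a stem when one is the reverse complement of the other, so a naive search for such pairs shows where a stem could form. The code below is a minimal illustration under simplifying assumptions (exact matches, a fixed stem length `k`, DNA alphabet only). The function names and the example sequence are hypothetical and are not taken from any actual retron.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_inverted_repeats(seq: str, k: int) -> list[tuple[int, int]]:
    """Return (i, j) index pairs where seq[j:j+k] is the reverse complement
    of seq[i:i+k] and the second copy starts after the first one ends."""
    hits = []
    for i in range(len(seq) - k + 1):
        partner = reverse_complement(seq[i:i + k])
        # Search only downstream of the first repeat so the copies do not overlap.
        j = seq.find(partner, i + k)
        while j != -1:
            hits.append((i, j))
            j = seq.find(partner, j + 1)
    return hits


# Hypothetical sequence: a 6-base repeat, a 4-base loop, then the repeat's
# reverse complement, so the two ends can pair into a stem.
repeat_a = "GCGTCA"
example = repeat_a + "TTTT" + reverse_complement(repeat_a)
hits = find_inverted_repeats(example, k=6)
print(hits)
```

A real analysis would tolerate mismatches and weigh pairing stability; exact matching is used here only to keep the idea visible.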
Retrons have been suggested for use in a variety of applications, including production of polypeptides and anti-sense inhibition of target genes; see for example U.S. Pat. No. 5,849,563; U.S. Pat. No. 6,017,737; U.S. Pat. No. 5,780,269; U.S. Pat. No. 5,436,141; U.S. Pat. No. 5,405,775; U.S. Pat. No. 5,320,958; and CA 2,075,515.