The field of this invention is molecular biology, particularly the area of retrotransposons, nucleotide sequences encoding integrase therefrom, and molecular genetic methods based thereon. In particular, the present invention relates to the insertion of heterologous DNA into eukaryotic genomes at specific locations and to protein domains targeting insertion at particular genomic sites.
Retroelements, which include the retroviruses and retrotransposons, insert cDNA copies of themselves into the host genome as part of their replication cycle. The selection of integration sites is not random. While a target bias for the retroviruses is not apparent from the genomic distribution of insertions, they often show local target preferences and tend to integrate into transcriptionally active or DNase I hypersensitive regions [Sandmeyer et al. (1990)]. This suggests that integration sites are not determined by a preference for specific DNA sequences. Rather, target choice is likely mediated by higher order structural features of the target sites (e.g. chromatin) [Curcio and Morse (1997)].
Target biases are clearer for the retrotransposons, particularly those of Saccharomyces cerevisiae. The Ty3 elements preferentially integrate upstream of genes transcribed by RNA polymerase III (pol III), usually within 1-2 bases of transcription start sites [Chalker and Sandmeyer (1993); Chalker and Sandmeyer (1992)]. In vitro Ty3 transposition assays have shown that loading of transcription factors TFIIIB and TFIIIC onto tRNA gene promoters is sufficient for targeting [Kirchner et al. (1995)]. Ty1 elements typically integrate within a 1 kb window upstream of genes transcribed by pol III, and pol III transcription is required for Ty1 target choice [Devine and Boeke (1996)]. The data for both Ty1 and Ty3, therefore, indicate that targeting occurs as a consequence of interactions between a component of the retrotransposon integration complex and a host factor localized to sites of pol III transcription. HIV integrase has been found in two-hybrid assays to interact with a transcription factor homolog called Ini1 [Kalpana et al. (1994)], implying that selection of target sites through interactions with chromosomal proteins may be a general feature of retroelements. Ini1, however, is not known to target HIV integration.
The yeast Ty5 retrotransposons integrate almost exclusively into telomeric regions and near the silent mating loci HMR and HML [Zou et al. (1996a); Zou et al. (1995)]. These regions are packaged in a distinctive form of chromatin, called silent chromatin, which represses the expression of adjacent genes and plays a role in telomere maintenance [Laurenson and Rine (1992)]. A large number of factors make up silent chromatin, including proteins involved in DNA replication (ORC), transcription factors (RAP1, ABF1) and silent information regulatory proteins (SIR2-SIR4). Mutations in cis-acting sequences that disrupt the assembly of silent chromatin at HMR also abolish Ty5 integration at this locus [Zou and Voytas (1997)]. This indicates that silent chromatin directs Ty5 target choice.
Despite the evidence that retroelements select target sites through interactions with chromosome-localized proteins, neither the retroelement-encoded factors nor the specific host factors required for targeted integration have been identified heretofore. In contrast to Ty1 and Ty3, for which there are multiple genomic targets (e.g., 274 tRNA genes; http://genome-www.stanford.edu/Saccharomyces), the number of known Ty5 targets is limited to the 32 telomeres and the two silent mating loci. To identify Ty5-encoded proteins required for targeting, we took advantage of this limited number of targets to devise an assay that monitors the frequency of integration into a single plasmid-borne locus. This assay was used to screen for Ty5 mutations that disrupt targeting, and as a consequence, a region near the C-terminus of integrase was identified as the determinant of target choice.