Transposons are discrete mobile DNA segments that are common constituents of plasmids, viruses, and bacterial chromosomes. These elements are detected by their ability to transpose self-encoded phenotypic traits from one replicon to another, or to transpose to a known gene and inactivate it. There are two types of transposons, ranging in size from about 750 to greater than 50,000 nucleotide base pairs. One type, known as the small insertion sequence or IS element, does not encode any known phenotypic traits. The other type encompasses relatively large units that do encode phenotypic traits such as antibiotic resistance (Plasmids and Transposons: Environmental Effects and Maintenance Mechanisms; Edited by C. Stuttard and K. Rozee; Academic Press, New York; Pages 165-205). Transposons, or transposable elements, include a piece of nucleic acid bounded by repeat sequences. Active transposons encode enzymes that facilitate the insertion of the nucleic acid into DNA sequences.
In vertebrates, the discovery of DNA transposons, mobile elements that move via a DNA intermediate, is relatively recent (Radice, A. D., et al., 1994. Mol. Gen. Genet. 244, 606-612). Since then, inactive, highly mutated members of the Tc1/mariner and the hAT (hobo/Ac/Tam) superfamilies of eukaryotic transposons have been isolated from the genomes of different fish species, Xenopus and humans (Oosumi et al., 1995. Nature 378, 873; Ivics et al. 1995. Mol. Gen. Genet. 247, 312-322; Koga et al., 1996. Nature 383, 30; Lam et al., 1996. J. Mol. Biol. 257, 359-366 and Lam, W. L., et al. Proc. Natl. Acad. Sci. USA 93, 10870-10875).
Retrotransposons are naturally occurring DNA elements found in cells from almost all species of animals, plants and bacteria that have been examined to date. They can be expressed in cells, reverse transcribed into an extrachromosomal element, and reintegrated into another site in the same genome from which they originated.
Retrotransposons may be grouped into two classes, the retrovirus-like LTR retrotransposons, and the non-LTR elements such as human L1 elements, Neurospora TAD elements (Kinsey, 1990, Genetics 126:317-326), I factors from Drosophila (Bucheton et al., 1984, Cell 38:153-163), and R2Bm from Bombyx mori (Luan et al., 1993, Cell 72: 595-605). These two types of retrotransposon are structurally different and also retrotranspose using radically different mechanisms.
Unlike the LTR retrotransposons, non-LTR elements (also called polyA elements) lack LTRs and instead end with polyA or A-rich sequences. The LTR retrotransposition mechanism is relatively well understood; in contrast, the mechanism of retrotransposition by non-LTR retrotransposons has only begun to be elucidated (Luan and Eickbush, 1995, Mol. Cell. Biol. 15:3882-3891; Luan et al., 1993, Cell 72:595-605). Non-LTR retrotransposons can be subdivided into sequence-specific and non-sequence-specific types. L1 is of the latter type: it is found inserted in a scattered manner in all human, mouse and other mammalian chromosomes.
The L1 element (also known as a LINE) has been extremely successful at colonizing the human genome. Early estimates suggested that L1s are present at 100,000 copies in the human genome and comprise 5% of nuclear DNA (Fanning and Singer, 1987, Biochim. Biophys. Acta 910:203-212). However, more recent studies suggest that as many as 850,000 L1s may exist in the human genome (Smit et al., 1996, Current Opinion in Genetics and Development). Most of these copies are truncated at the 5′ end and are presumed to be defective. Like full-length elements, the 5′ truncated copies are often flanked by short target site duplications (TSDs).
A 6.1 kb full-length L1 consensus sequence reveals the following conserved organization: a 5′ untranslated leader region (UTR) with an internal promoter; two non-overlapping reading frames (ORF1 and ORF2); a 200 bp 3′ UTR; and a 3′ poly A tail. ORF1 encodes a 40 kDa protein and may serve a packaging function for the RNA (Martin, 1991, Mol. Cell. Biol. 11:4804-4807; Hohjoh et al., 1996, EMBO J. 15:630-639), while ORF2 encodes a reverse transcriptase (Mathias et al., 1991, Science 254:1808-1810). ORF1 and possibly ORF2 proteins associate with L1 RNA, forming a ribonucleoprotein particle. Reverse transcription by ORF2 protein may then occur, resulting in L1 cDNAs, which are integrated into the genome (Martin, 1991, Curr. Opin. Genet. Dev. 1:505-508). Additionally, L1 elements are usually flanked by TSDs ranging from 7 to 20 bp. The full L1 and other non-LTR retrotransposons lack recognizable homologs of retroviral integrase, protease and RNase H. This group of elements employs a mechanism for transposition that is fundamentally different from that of the LTR retrotransposons.
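The flanking TSDs described above can be illustrated with a minimal Python sketch. It assumes that the genomic sequence immediately upstream of the element and the sequence immediately downstream of its poly A tail are already available as strings, and that the duplication is an exact direct repeat; real TSDs may be imperfect, and this simplification is not drawn from the cited references. The function name `find_tsd` and the example flanking sequences are hypothetical.

```python
from typing import Optional


def find_tsd(upstream: str, downstream: str,
             min_len: int = 7, max_len: int = 20) -> Optional[str]:
    """Return the longest exact direct repeat that ends the upstream flank
    and begins the downstream flank, or None if no repeat is found.

    The default 7-20 bp window follows the TSD size range given in the text.
    Exact matching is a simplifying assumption.
    """
    upstream = upstream.upper()
    downstream = downstream.upper()
    longest = min(max_len, len(upstream), len(downstream))
    # Try the longest candidate first so the first hit is the longest repeat.
    for k in range(longest, min_len - 1, -1):
        if upstream[-k:] == downstream[:k]:
            return upstream[-k:]
    return None


# Hypothetical flanking sequences sharing a 10 bp duplication.
upstream_flank = "GGCTTACGAAGTCTTAGC"    # ends at the 5' end of the element
downstream_flank = "AAGTCTTAGCTTGACCGT"  # begins after the poly A tail
print(find_tsd(upstream_flank, downstream_flank))  # AAGTCTTAGC
```

Searching from the longest candidate downward means a shorter coincidental match cannot mask a longer duplication within the same window.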
Some human L1 elements can retrotranspose (express, cleave their target site, and reverse transcribe their own RNA using the cleaved target site as a primer) into new sites in the human genome, leading to genetic disorders. For example, germ line L1 insertions into the factor VIII and dystrophin genes give rise to hemophilia A and muscular dystrophy, respectively (Kazazian et al., 1988, Nature 332:164-166; Narita et al., 1993, J. Clinical Invest. 91:1862-1867; Holmes et al., 1994, Nature Genetics 7:143-148), while somatic cell L1 insertions into the c-myc gene and the APC tumor suppressor gene are implicated in rare cases of breast and colon cancer, respectively (Morse et al., Nature 333:87-90; Miki et al., 1992, Cancer Research 52:643-645). L1 retrotransposons account, directly or indirectly, for more than 30% of mammalian genomes by mass (Lander et al., 2001, Nature 409:860-921), by means of self-mobilization and trans-mobilization of Alu elements (Dewannieux et al., 2003, Nature Genet. 35:41-48). A full-length (about 6-kilobase) L1 contains two open reading frames, ORF1 and ORF2, which encode the proteins required for retrotransposition (Feng et al., 1996, Cell 87:905-916; Moran et al., 1996, Cell 87:917-927).
Thus, a highly active L1 element would be potentially useful as a tool for mammalian genetics.