Retrotransposons are highly abundant mobile elements of the human genome. They move from one site to another within the genome via an RNA intermediate and reverse transcription, and they often insert within genes. Retrotransposons are thus distinct from DNA transposons, which move directly at the level of DNA. In the human genome, retrotransposons outnumber DNA transposons, although DNA transposons also insert into genes.
The L1 element (also known as a LINE) has been extremely successful at colonizing the human genome. Early estimates placed L1 at 100,000 copies in the human genome, comprising 5% of nuclear DNA (Fanning and Singer, 1987, Biochim Biophys Acta 910:203-212). However, more recent studies suggest that as many as 520,000 L1s may exist in the human genome, comprising 17% of it (Smit, 1999, Current Opinion in Genetics and Development).
Some human L1 elements can retrotranspose (express, cleave their target site, and reverse transcribe their own RNA using the cleaved target site as a primer) into new sites in the human genome, leading to genetic disorders. Germ line L1 insertions into the factor VIII and dystrophin genes give rise to hemophilia A and muscular dystrophy, respectively (Kazazian et al., 1988, Nature 332:164-166; Narita et al., 1993, J. Clinical Invest. 91:1862-1867; Holmes et al., 1994, Nature Genetics 7:143-148), while somatic cell L1 insertions into the c-myc gene and the APC tumor suppressor gene are implicated in rare cases of breast and colon cancer, respectively (Morse et al., Nature 333:87-90; Miki et al., 1992, Cancer Research 52:643-645). Thus, L1 retrotransposition is mutagenic.
There is a profound ascertainment bias in genetic mutation analysis in general, in part because longer PCR products may amplify less well than shorter ones, so that mutations which lengthen the amplified region, such as insertions, may be more likely to go undetected. In addition, not all disease-causing mutations lie in coding regions. The vast majority of known mutations are in coding regions; this count includes a small number of mutations, such as splice junction mutations, that are not strictly within coding regions but are nevertheless easily discovered because they lie so close to exons. Mutations that do not fall in coding regions are very difficult and costly to find. There is a need in the art for a more cost-effective method to identify genetic mutations.