In addition to exogenous retroviruses such as the Human Immunodeficiency Viruses (HIV 1 and 2), which cause AIDS, and the Human T-Cell Leukaemia Viruses (ATLV/HTLV I and II), a cause of adult T-cell leukaemia and Tropical Spastic Paraparesis, there is a large group of endogenous retroviruses (ERVs). Exogenous viruses undergo all stages of the replication cycle, including the production of the next generation of infectious viral particles, and can spread horizontally. Endogenous retroviruses are in fact DNA copies of the viral genome integrated into host DNA as a result of a germ line infection which may have occurred many millions of years ago, and they are transmitted vertically. Most ERVs are apparently inactive, as documented by their identical localisation in the genomes of humans and some primates. Some are present in one or a few copies, more frequently in hundreds, and sometimes in thousands or more copies. However, they are not the only representatives of so-called repetitive sequences in mammalian genomes. As much as 10% of the mouse and human genomes appear to consist of the products of reverse transcription, such as processed pseudogenes, SINEs, LINEs, Puppys and ERVs. In the human genome, ERVs can account for 0.1-0.6%.
There is an increasing body of evidence that repetitive sequences are not simply junk DNA. Human endogenous retroviruses (HERVs) have been implicated not only in oncogenesis but also in autoimmune disorders. HERVs may interfere with normal cellular functions in several ways. Although the vast majority of the HERV coding sequences are interrupted by stop codons and thus cannot produce functional proteins, if transcribed they may recombine, eventually producing infectious virus.
A large proportion of the human genome consists of different, variable repetitive sequences. Their study is extremely difficult. To date, apart from incidental findings of HERV sequences in the process of sequencing other genes, the majority of HERVs have been identified by low-stringency hybridisation using probes derived from exogenous retroviral genomes. This method, however, can detect only closely-related sequences.
The most conserved region of retroviral genomes is the primer binding site (PBS). Specific PBS-derived oligonucleotides, or part of tRNA itself, have been used as hybridisation probes or primers in a primer extension reaction (Kroger et al (1987) J. Virol. 61:2071-5).
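The relationship between a tRNA and a PBS-derived oligonucleotide can be illustrated with a short sketch. It relies on the general observation that the PBS is complementary to the 3' terminal nucleotides of the priming tRNA (typically 18), so a plus-strand PBS sequence can be obtained as the reverse complement of the tRNA 3' end. The function name, the default length of 18 and the example input are assumptions made for illustration; the example 3' end is the sequence commonly reported for tRNA-Lys3 and is not taken from the cited reference. Modified bases are not handled.

```python
# Illustrative sketch only; names and the example input are hypothetical.
# Derives a plus-strand DNA sequence matching a retroviral PBS from the
# 3' terminal nucleotides of a priming tRNA.

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def pbs_from_trna(trna_rna: str, length: int = 18) -> str:
    """Return the DNA reverse complement of the last `length` nt of a tRNA."""
    dna = trna_rna.upper().replace("U", "T")
    if len(dna) < length:
        raise ValueError("tRNA sequence is shorter than the requested PBS length")
    tail = dna[-length:]
    # Reverse, then complement each base (raises KeyError on non-ACGT symbols).
    return "".join(COMPLEMENT[base] for base in reversed(tail))


# Assumed example input: an 18-nt tRNA 3' end written as RNA, 5' to 3'.
EXAMPLE_TRNA_3PRIME_END = "GUCCCUGUUCGGGCGCCA"

if __name__ == "__main__":
    print(pbs_from_trna(EXAMPLE_TRNA_3PRIME_END))
```

An oligonucleotide of this kind could serve as a hybridisation probe, whereas its complement (the tRNA-like strand) is the orientation that would act as a primer on the viral plus strand.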
Another highly conserved region is the coding region for reverse transcriptase (RT). Some domains are conserved, not only among retroviruses, but also among other retroelements (Xiong et al (1990) EMBO J. 9:3353-62).
The polymerase chain reaction (PCR) has been successfully used to amplify a fragment between two conserved domains of RT using primers specific for certain groups of retroviruses (Shih et al (1989) J. Virol. 63:64-75). Another conserved sequence used for PCR is that of a protease gene. These amplification methods have their limitations, in that two sets of degenerate primers have to be used, and the amount of sequence information obtained is mostly limited to short regions of the RT gene.
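The cost of using degenerate primers can be made concrete with a small sketch that counts and enumerates the individual sequences represented by one degenerate oligonucleotide written in standard IUPAC nucleotide codes. The function names and the example primer are hypothetical and are not taken from the cited references; the sketch only illustrates how quickly a primer pool grows with each ambiguous position.

```python
# Illustrative sketch only; function names and the example primer are hypothetical.
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def degeneracy(primer: str) -> int:
    """Number of distinct sequences represented by a degenerate primer."""
    count = 1
    for code in primer.upper():
        count *= len(IUPAC[code])
    return count


def expand(primer: str) -> list[str]:
    """List every concrete sequence represented by a degenerate primer."""
    choices = (IUPAC[code] for code in primer.upper())
    return ["".join(bases) for bases in product(*choices)]


if __name__ == "__main__":
    example = "ACNGGRTAY"  # hypothetical primer: N, R and Y are ambiguous positions
    print(degeneracy(example))
    for sequence in expand(example):
        print(sequence)
```

Because the degeneracy is the product of the number of alternatives at each position, a pair of degenerate primers yields a pool whose size is the product of the two individual degeneracies, which is one reason such amplifications become less specific as more ambiguous positions are introduced.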
There are some 1300 tRNA genes and pseudogenes in the human genome, coding for some 60-90 tRNAs, of which 20 have been sequenced so far. Some 11 tRNAs have so far been described as priming minus-strand DNA synthesis across all retroviruses. Of those, 7 are used by human exogenous and endogenous retroviruses (see Table 1).
WO-A-9603528 discloses certain oligonucleotides for use in screening.