The human endogenous retroviruses (HERVs) were inserted into the germ cells of primates millions of years ago and have remained as an integral part of the primate genomes during evolution. In addition to the proviruses, solo LTRs are also dispersed throughout the human genome (Wilkinson et al, 1994; Lower et al, 1996). The solo LTRs contain the U3, R and U5 regions (Temin, 1982) but no internal gag, pol and env genes. Together, the HERVs and the solo LTRs comprise approximately 5% of the human genome and belong to the category of middle repetitive DNAs characterized as retrotransposons (Smit, 1996; Henikoff et al, 1997).
The ERV-9 proviruses, containing 30-50 members, constitute one of many families of the HERVs (Wilkinson et al, 1994; Lower et al, 1996). In addition to the proviruses, solo ERV-9 LTRs with a copy number of 3000-4000 have been found in the human genome (Henthorn et al, 1986; La Mantia et al, 1991; Zucchi and Schlessinger, 1992). The ERV-9 retrotransposons were inserted into the primate genome probably as early as ten million years ago (Di Cristofano et al, 1995). The retrotransposons have been suggested to be selfish DNAs irrelevant to the cellular functions of the hosts (Doolittle and Sapienza, 1980). However, recent findings indicate that the enhancer and promoter elements in the U3 region of the LTRs (Lenz et al, 1984; Speck et al, 1990) initiate and promote the transcription of host genes located immediately downstream of the LTRs and may thus serve relevant cellular functions (Stavenhagen and Robins, 1988; Feuchter et al, 1992; Goodchild et al, 1992; Ting et al, 1992; Schulte et al, 1996).
The human β-like globin genes consist of the embryonic ε, the fetal Gγ and Aγ, and the adult δ and β genes located on Chromosome 11 in a transcriptional order of 5′ ε-Gγ-Aγ-δ-β 3′ (Efstratiadis et al, 1980). The transcription of these genes is regulated by the far upstream Locus Control Region (LCR), which is defined by four erythroid-specific, DNase I hypersensitive sites HS 1, 2, 3 and 4 (Tuan et al, 1985; Forrester et al, 1987; Grosveld et al, 1987; Dhar et al, 1990). The LCR between HS1 and HS4 is present in other mammals from mouse to galago and comprises the major functional component of the LCR (reviewed by Hardison et al, 1997). A ubiquitous HS5 site has been identified further upstream of the HS 1-4 sites (Tuan et al, 1985; Dhar et al, 1990) in the apparent 5′ boundary area of the LCR.
Enhancer elements are cis-acting and increase the level of transcription of an adjacent gene from its promoter in a fashion that is relatively independent of the position and orientation of the enhancer element. In fact, Khoury and Gruss, 1983, Cell 33:313, state that “the remarkable ability of enhancer sequences to function upstream from, within, or downstream from eukaryotic genes distinguishes them from classical promoter elements . . .” and suggest that certain experimental results indicate that “enhancers can act over considerable distances (perhaps greater than 10 kb).”
Enhancer elements have been identified in a number of viruses, including polyoma virus, papilloma virus, adenovirus, retrovirus, hepatitis virus, cytomegalovirus, herpes virus, papovaviruses, such as simian virus 40 (SV40) and BK, and in many non-viral genes, such as within mouse immunoglobulin gene introns. Enhancer elements may also be present in a wide variety of other organisms. Host cells often react differently to different enhancer elements. This cellular specificity indicates that host gene products interact with the enhancer element during gene expression.
Although gene replacement by homologous recombination could be used instead of integrating vectors, this approach is not yet technically practical because of the very low success rate of the homologous recombination events and the inability to culture the pluripotent stem cells required for this approach.
Disclosed are an enhancer, insulator, and promoter from the HS5 region in the 5′ boundary area of the locus control region of human β-like globin genes. These transcription control sequences can be used to control expression of any desired gene of interest and can be used in any vector for this purpose. The control sequences are derived from the area in and around the U3 region of a solitary endogenous retrovirus (ERV) 9 long terminal repeat (LTR).
Also disclosed are methods of expressing any gene of interest. For this purpose, the control sequences can be operably linked to the gene of interest (and operably linked to each other). The disclosed enhancers, insulators, and promoters can also be used with any other control sequences. Preferably, the control sequences are used in vectors to obtain expression of a gene of interest in a cell, including cells in animals.