Locus Control Regions (LCRs) (Grosveld et al., Cell 51:975-985, 1987), also known as Dominant Activator Sequences, Locus Activating Regions or Dominant Control Regions, are responsible for conferring tissue-specific, integration-site independent, copy number dependent expression on transgenes integrated into chromatin in host cells. The discovery and characterization of LCRs are described in co-pending U.S. Ser. No. 07/920,536, filed Jul. 28, 1992, assigned to the same assignee, the complete disclosure of which is hereby incorporated by reference. First discovered in the human globin gene system, which was prone to strong position effects when integrated into the chromatin of transgenic mice or mouse erythroleukaemia (MEL) cells (Magram et al., Nature 315:338-340, 1985; Townes et al., EMBO J. 4:1715-1723, 1985; Kollias et al., Cell 46:89-94, 1986; Antoniou et al., EMBO J. 7:377-384, 1988), LCRs have the ability to overcome such position effects when linked directly to transgenes (Grosveld et al., supra). Numerous LCRs have been defined in the art, including but not limited to the β-globin and CD2 LCRs (European Patent Application 0 332 667), the macrophage-specific lysozyme LCR (Bonifer et al., 1985), and a class II MHC LCR (Carson et al., Nucleic Acids Res. 21(9):2065-2072, 1993).
The complete β-globin LCR comprises four DNase I hypersensitive sites (HS) on a 20 kbp fragment that is too large to be incorporated into retrovirus or adeno-associated virus (AAV) vectors designed for integration into the mammalian genome. Individual hypersensitive sites, in particular the 5'HS2-associated element, have been studied for the ability to regulate transduced globin genes (Novak et al., Proc. Natl. Acad. Sci. USA 87:3386-3390, 1990; Chang et al., Proc. Natl. Acad. Sci. USA 89:3107-3110, 1992; Miller et al., Blood 82:1900-1906, 1993). However, it has proven difficult to obtain stable high-titer viruses bearing these sequences.
When referring to the DNase I hypersensitive sites of the β-globin LCR, care must be taken over the nomenclature used. Originally, the hypersensitive sites were numbered consecutively 1 to 4 working upstream from the globin genes by Tuan et al. (Proc. Natl. Acad. Sci. USA 86:2554-2558, 1985) and downstream towards the genes by Grosveld et al. (supra). In 1990, agreement was reached to use the nomenclature in which 5'HS1 is closest to the globin genes and 5'HS4 is most distant from the genes. In GenBank numbering, HS2, HS3 and HS4 correspond to positions 958-1714, 4248-5197 and 8486-8860, respectively. A number of publications dating from around or before 1990 use the inverse nomenclature.
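The numbering conventions and coordinates discussed above can be captured in a few lines of Python. This is only an illustrative sketch: it assumes that the quoted GenBank positions are 1-based and inclusive, and that the "inverse nomenclature" simply reverses the order of the four sites. The names `HS_COORDS`, `fragment_length` and `legacy_to_current` are hypothetical and do not come from the cited publications.

```python
# Positions as quoted in the text (GenBank numbering).
# Assumption: coordinates are 1-based and inclusive at both ends.
HS_COORDS = {
    "HS2": (958, 1714),
    "HS3": (4248, 5197),
    "HS4": (8486, 8860),
}


def fragment_length(start, end):
    """Length in bp of a fragment with inclusive start and end positions."""
    return end - start + 1


def legacy_to_current(site_number, total_sites=4):
    """Map a site number from the inverse (pre-1990) scheme to the current one.

    Assumption: the inverse scheme numbers the same four sites in the
    opposite direction, so site n becomes site (total_sites + 1 - n).
    """
    return total_sites + 1 - site_number


lengths = {name: fragment_length(*span) for name, span in HS_COORDS.items()}

for name, bp in lengths.items():
    print(f"{name}: {bp} bp")
```

Under these assumptions, a site called HS1 in a publication using the inverse scheme would correspond to 5'HS4 in the agreed nomenclature, and so on for the other sites.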
Previous work demonstrated that each hypersensitive site of the human β-globin locus control region confers a different developmental pattern of expression on the globin genes (Fraser et al., 1993, Genes & Development 7:106-113). HS3 was shown to be most active during the embryonic period, whereas HS4 showed the highest activity during the adult stage. Each of HS1 and HS2 drives equivalent levels of γ or β transgene expression throughout development, although the level is low for HS1 and high for HS2.
Previous work demonstrated that the 5'HS2 core region, i.e., a 215 bp fragment containing four putative transcription factor binding sites, confers position-independent expression of a linked transgene in transgenic mice when present as a concatemer of at least two copies, but not when present as a single copy (Ellis et al., EMBO J. 12:127-134, 1993). Thus, two or more 5'HS2 cores may interact and cooperate with each other to open chromatin and enhance transcription; however, a single 5'HS2 core fails to activate expression from a linked β-globin gene in single-copy founder transgenic mice. This failure of the single-copy HS2/transgene construct to activate transgene transcription was demonstrated unequivocally using fully transgenic F1 offspring. No such data exist in the prior art for the HS3 or HS4 β-globin LCR subregions.
HS3 and HS4 have been tested only in founder (F0) animal tissue, that is, in embryo tissue grown from injected eggs, which can carry different copy numbers of a transgene in different cells (Philipsen et al., 1993, EMBO J. 12:1077-1085; Pruzina et al., 1991, Nucleic Acids Res. 19:1413-1419).
Studies using founder animal tissue are inconclusive with respect to transgene copy number because copy number cannot be determined definitively in such tissue. Because a transgene integrates into the injected egg after the single-cell stage, different tissues almost always contain different copy numbers (or no copies) of the transgene. Therefore, reliable data on the expression of an HS3/transgene or HS4/transgene construct in single copy cannot be obtained using founder animals. Founder animals give no indication of the extent to which the transgene has integrated into the animal's somatic tissues. For instance, a nominal copy number of two could indicate the presence of two copies of the transgene in each cell, or four copies in half of the cells, or eight copies in one quarter of the cells, and so on. In addition, because of the minute amount of embryo tissue available, e.g., fetal liver tissue, copy number analysis in founder animals is performed on tissue other than that used for analysis of the expression level of the transgene. By contrast, true copy number can be determined reliably in F1 generation animals, in which the transgene has been passed through the germ line by breeding of the founder animal. F1 transgenic animals contain an equal number of copies of the transgene in each cell.
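The ambiguity of a nominal copy number in mosaic founder tissue can be illustrated with a short calculation. The following is a minimal sketch, assuming that a bulk measurement reports the average number of transgene copies per cell across the sample and that cells without the transgene contribute zero copies. The function name and the scenario list are illustrative only and are not taken from the cited studies.

```python
from fractions import Fraction


def nominal_copy_number(copies_per_cell, fraction_of_cells):
    """Average copies per cell over the whole sample.

    Assumption: only `fraction_of_cells` of the cells carry the transgene,
    each with `copies_per_cell` copies; all other cells carry none.
    """
    return copies_per_cell * fraction_of_cells


# (copies per transgenic cell, fraction of cells that are transgenic)
scenarios = [
    (2, Fraction(1, 1)),  # two copies in every cell
    (4, Fraction(1, 2)),  # four copies in half of the cells
    (8, Fraction(1, 4)),  # eight copies in one quarter of the cells
]

nominal_values = [nominal_copy_number(c, f) for c, f in scenarios]

for (copies, fraction), nominal in zip(scenarios, nominal_values):
    print(f"{copies} copies in {fraction} of cells -> nominal copy number {nominal}")
```

All three scenarios yield the same nominal value, which is why a bulk measurement on founder tissue cannot distinguish between them, whereas in F1 animals the fraction of transgenic cells is one and the nominal value equals the true copy number per cell.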
When experiments are conducted on single-copy transgenic animals, it is found that many of the LCR fragments previously believed to confer LCR activity are incapable of satisfying the functional requirement of an LCR, namely the conferring of integration-site independent expression on a transgene, when present in a single copy. Clearly, such DNA elements are inappropriate for protocols where the use of a single-copy gene is desirable, essential or inevitable, as in the case of many virus-based delivery systems.
It has previously been found that the full activity of an LCR appears to be obtainable only with complete LCR sequences. Thus, in the β-globin LCR, only a construct containing the DNA sequences surrounding all four of the DNase I hypersensitive sites 1 to 4 confers tissue-specific, integration-site independent, copy number dependent expression of a transgene at levels reflecting the level of expression of an equivalent endogenous gene.
It is an object of the invention to provide for reproducible integration-site independent expression of a transgene in a mammal, particularly a human, when the transgene is integrated in single copy in the mammalian genome.
Another object of the invention is to identify a sub-fragment of an LCR that reproducibly confers the chromatin opening activity of the LCR when present in single copy.
Yet another object of the invention is to provide an LCR subregion which, when operatively associated with a transgene and integrated in single copy in a host cell genome, reproducibly confers integration-site independent, tissue-specific expression on the transgene.
Another object of the invention is to provide for reproducible, tissue-specific, integration-site independent expression of a single copy transgene using gene transfer techniques that are limited with respect to the amount of DNA that is transferred to a host genome.