Genome sequencing efforts have revealed that the human genome contains between 20,000 and 25,000 genes but fewer than 2000 transcriptional regulators, so it is clear that a number of factors must interact to control gene expression in all its various temporal, developmental and tissue-specific manifestations. Expression of genes is controlled by a highly complex mixture of general and specific transcriptional regulators, and expression can also be controlled by cis-acting DNA elements. These DNA elements comprise both local DNA elements such as the core promoter and its associated transcription factor binding sites as well as distal elements such as enhancers, silencers, insulators and locus control regions (LCRs) (see Maston, et al. (2006) Ann Rev Genome Hum Genet 7: 29-50).
Enhancer elements were first identified in the SV40 viral genome, and then found in the human immunoglobulin heavy chain locus. Now known to play regulatory roles in the expression of many genes, enhancers appear to mainly influence temporal and spatial patterns of gene expression. It has also been found that enhancers function in a manner that is not dependent upon distance from the core promoter of a gene, and is not dependent on any specific sequence orientation with respect to the promoter. Enhancers can be located several hundred kilobases upstream or downstream of a core promoter region, where they can be located in an intron sequence, or even beyond the 3′ end of a gene.
Various methods and compositions for targeted cleavage of genomic DNA have been described. Such targeted cleavage events can be used, for example, to induce targeted mutagenesis, induce targeted deletions of cellular DNA sequences, and facilitate targeted recombination at a predetermined chromosomal locus. See, e.g., U.S. Pat. Nos. 9,255,250; 9,200,266; 9,045,763; 9,005,973; 9,150,847; 8,956,828; 8,945,868; 8,703,489; 8,586,526; 6,534,261; 6,599,692; 6,503,717; 6,689,558; 7,067,317; 7,262,054; 7,888,121; 7,972,854; 7,914,796; 7,951,925; 8,110,379; 8,409,861; U.S. Patent Publication Nos. 2003/0232410; 2005/0208489; 2005/0026157; 2005/0064474; 2006/0063231; 2008/0159996; 2010/0218264; 2012/0017290; 2011/0265198; 2013/0137104; 2013/0122591; 2013/0177983; 2013/0196373; 2014/0093913; 2015/0056705; 2015/0335708; and 2015/0132269, the disclosures of which are incorporated by reference in their entireties for all purposes. These methods often involve the use of engineered cleavage systems to induce a double strand break (DSB) or a nick in a target DNA sequence such that repair of the break by an error-prone process such as non-homologous end joining (NHEJ) or repair using a repair template (homology directed repair or HDR) can result in the knock out of a gene or the insertion of a sequence of interest (targeted integration). This technique can also be used to introduce site specific changes in the genome sequence through use of a donor oligonucleotide, including the introduction of specific deletions of genomic regions, or of specific point mutations or localized alterations (also known as gene correction). Cleavage can occur through the use of specific nucleases such as engineered zinc finger nucleases (ZFN), transcription-activator like effector nucleases (TALENs), or using a CRISPR/Cas system (including Cpf1) with an engineered crRNA/tracrRNA (‘single guide RNA’) to guide specific cleavage.
Further, targeted nucleases are being developed based on the Argonaute system (e.g., from T. thermophilus, known as ‘TtAgo’, see Swarts, et al. (2014) Nature 507(7491): 258-261), which also may have the potential for uses in genome editing and gene therapy.
Red blood cells (RBCs), or erythrocytes, are the major cellular component of blood. In fact, RBCs account for one quarter of the cells in a human. In humans, mature RBCs lack a nucleus and many other organelles, and are full of hemoglobin, a metalloprotein that functions to carry oxygen to the tissues as well as carry carbon dioxide out of the tissues and back to the lungs for removal. This protein makes up approximately 97% of the dry weight of RBCs and it increases the oxygen carrying ability of blood by about seventy fold. Hemoglobin is a heterotetramer comprising two alpha (α)-like globin chains, two beta (β)-like globin chains and four heme groups. In adults the α2β2 tetramer is referred to as Hemoglobin A (HbA) or adult hemoglobin. Typically, the alpha and beta globin chains are synthesized in an approximate 1:1 ratio and this ratio seems to be critical in terms of hemoglobin and RBC stabilization. In a developing fetus, a different form of hemoglobin, fetal hemoglobin (HbF), is produced which has a higher binding affinity for oxygen than Hemoglobin A such that oxygen can be delivered to the baby's system via the mother's blood stream. Fetal hemoglobin also contains two α-globin chains, but in place of the adult β-globin chains, it has two fetal gamma (γ)-globin chains (i.e., fetal hemoglobin is α2γ2). At approximately 30 weeks of gestation, the synthesis of gamma globin in the fetus starts to drop while the production of beta globin increases. By approximately 10 months of age, the newborn's hemoglobin is nearly all α2β2 although some HbF persists into adulthood (approximately 1-3% of total hemoglobin). The regulation of the switch from production of gamma- to beta-globin is quite complex, and primarily involves a down-regulation of gamma globin transcription with a simultaneous up-regulation of beta globin transcription.
Genetic defects in the sequences encoding the hemoglobin chains can be responsible for a number of diseases known as hemoglobinopathies, including sickle cell anemia and thalassemias. In the majority of patients with hemoglobinopathies, the genes encoding gamma globin remain present, but expression is relatively low due to normal gene repression occurring around parturition as described above.
It is estimated that 1 in 5000 people in the U.S. has sickle cell disease (SCD), mostly people of sub-Saharan African descent. Heterozygous carriers of the sickle cell mutation appear to be protected against malaria, so this trait may have been positively selected over time, such that it is estimated that in sub-Saharan Africa, one third of the population has the sickle cell trait. Sickle cell disease is caused by a mutation in the β globin gene as a consequence of which valine is substituted for glutamic acid at amino acid #6 (a GAG to GTG at the DNA level), where the resultant hemoglobin is referred to as “hemoglobin S” or “HbS.” Under lower oxygen conditions, a conformational shift in the deoxy form of HbS exposes a hydrophobic patch on the protein between the E and F helices. The hydrophobic side chain of the valine at position 6 of the beta chain is able to associate with the hydrophobic patch, causing HbS molecules to aggregate and form fibrous precipitates. These aggregates in turn cause the abnormality or ‘sickling’ of the RBCs, resulting in a loss of flexibility of the cells. The sickled RBCs are no longer able to squeeze into the capillary beds and can cause vaso-occlusive crisis in sickle cell patients. In addition, sickled RBCs are more fragile than normal RBCs, and tend towards hemolysis, eventually leading to anemia in the patient.
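The codon change described above can be illustrated with a minimal Python sketch. The codon table below is deliberately limited to the two codons named in the text (GAG and GTG), and the helper name `apply_point_mutation` is a hypothetical name chosen for illustration only.

```python
# Minimal codon table: only the two codons named in the text are included.
CODON_TO_AMINO_ACID = {"GAG": "Glu", "GTG": "Val"}


def apply_point_mutation(codon: str, position: int, new_base: str) -> str:
    """Return the codon with the base at a 0-based position replaced."""
    return codon[:position] + new_base + codon[position + 1:]


# Codon for amino acid #6 of beta globin, as given in the text.
wild_type_codon = "GAG"
# The sickle mutation changes the middle base (index 1) from A to T.
sickle_codon = apply_point_mutation(wild_type_codon, 1, "T")

print(wild_type_codon, "->", CODON_TO_AMINO_ACID[wild_type_codon])
print(sickle_codon, "->", CODON_TO_AMINO_ACID[sickle_codon])
```

A single base substitution in the middle position of the codon is thus sufficient to replace the charged glutamic acid with the hydrophobic valine.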
Treatment and management of sickle cell patients is a life-long proposition involving antibiotic treatment, pain management and transfusions during acute episodes. One approach is the use of hydroxyurea, which exerts its effects in part by increasing the production of gamma globin. Long term side effects of chronic hydroxyurea therapy are still unknown, however, and treatment gives unwanted side effects and can have variable efficacy from patient to patient. Despite an increase in the efficacy of sickle cell treatments, the life expectancy of patients is still only in the mid to late 50's and the associated morbidities of the disease have a profound impact on a patient's quality of life.
Thalassemias are also diseases relating to hemoglobin and typically involve a reduced expression of globin chains. This can occur through mutations in the regulatory regions of the genes or from a mutation in a globin coding sequence that results in reduced expression or reduced levels of functional globin protein. Alpha thalassemias are mainly associated with people of West African and South Asian descent, and may confer malarial resistance. Beta thalassemia is mainly associated with people of Mediterranean descent, typically from Greece and the coastal areas of Turkey and Italy. Treatment of thalassemias usually involves blood transfusions and iron chelation therapy. Bone marrow transplants are also being used for treatment of people with severe thalassemias if an appropriate donor can be identified, but this procedure can have significant risks.
One approach that has been proposed for the treatment of both SCD and beta thalassemias is to increase the expression of gamma globin with the aim to have HbF functionally replace the aberrant adult hemoglobin. As mentioned above, treatment of SCD patients with hydroxyurea is thought to be successful in part due to its effect on increasing gamma globin expression. The first group of compounds discovered to affect gamma globin reactivation activity were cytotoxic drugs. The ability to cause de novo synthesis of gamma-globin by pharmacological manipulation was first shown using 5-azacytidine in experimental animals (DeSimone (1982) Proc Nat'l Acad Sci USA 79(14):4428-31). Subsequent studies confirmed the ability of 5-azacytidine to increase HbF in patients with β-thalassemia and sickle cell disease (Ley, et al. (1982) N. Engl. J. Medicine, 307: 1469-1475, and Ley, et al. (1983) Blood 62: 370-380). In addition, short chain fatty acids (e.g. butyrate and derivatives) have been shown in experimental systems to increase HbF (Constantoulakis, et al. (1988) Blood 72(6):1961-1967). Also, there is a segment of the human population with a condition known as ‘Hereditary Persistence of Fetal Hemoglobin’ (HPFH) where elevated amounts of HbF persist in adulthood (10-40% in HPFH heterozygotes; see Thein, et al. (2009) Hum. Mol. Genet 18 (R2): R216-R223). This is a rare condition, but in the absence of any associated beta globin abnormalities, is not associated with any significant clinical manifestations, even when 100% of the individual's hemoglobin is HbF. When individuals that have a beta thalassemia also have co-incident HPFH, the expression of HbF can lessen the severity of the disease. Further, the severity of the natural course of sickle cell disease can vary significantly from patient to patient, and this variability, in part, can be traced to the fact that some individuals with milder disease express higher levels of HbF.
One approach to increase the expression of HbF involves identification of genes whose products play a role in the regulation of gamma globin expression. One such gene is BCL11A, first identified because of its role in lymphocyte development. BCL11A encodes a zinc finger protein that is thought to be involved in the developmental stage-specific regulation of gamma globin expression. BCL11A is expressed in adult erythroid precursor cells and down-regulation of its expression leads to an increase in gamma globin expression. In addition, it appears that the splicing of the BCL11A mRNA is developmentally regulated. In embryonic cells, it appears that the shorter BCL11A mRNA variants, known as BCL11A-S and BCL11A-XS, are primarily expressed, while in adult cells, the longer BCL11A-L and BCL11A-XL mRNA variants are predominantly expressed. See, Sankaran, et al. (2008) Science 322 p. 1839. The BCL11A protein appears to interact with the beta globin locus to alter its conformation and thus its expression at different developmental stages. Use of an inhibitory RNA targeted to the BCL11A gene has been proposed (see, e.g., U.S. Patent Publication No. 2011/0182867) but this technology has several potential drawbacks, namely that complete knock down may not be achieved, delivery of such RNAs may be problematic and the RNAs must be present continuously, requiring multiple treatments for life.
Targeting of BCL11A enhancer sequences provides a mechanism for increasing HbF. See, e.g., U.S. Patent Publication No. 2015/0132269. Genome wide association studies have identified a set of genetic variations at BCL11A that are associated with increased HbF levels. These variations are a collection of SNPs found in non-coding regions of BCL11A that function as a stage-specific, lineage-restricted enhancer region. Further investigation revealed that this BCL11A enhancer is required in erythroid cells for BCL11A expression, but is not required for its expression in B cells (see Bauer, et al. (2013) Science 342:253-257). The enhancer region was found within intron 2 of the BCL11A gene, and three areas of DNase I hypersensitivity (often indicative of a chromatin state that is associated with regulatory potential) in intron 2 were identified. These three areas were identified as “+62”, “+58” and “+55” in accordance with the distance in kilobases from the transcription start site of BCL11A. These enhancer regions are roughly 350 (+55); 550 (+58); and 350 (+62) nucleotides in length (Bauer 2013, ibid).
Histone modifications were also observed in this region in primary human erythroblasts that contained SNPs associated with increased HbF expression. Additionally, it was shown that the erythroid related transcription factors GATA1 and TAL1 bind within the BCL11A intron 2 region. GATA transcription factors are zinc finger DNA binding proteins that control development in many different tissues by activating or repressing gene expression. GATA factors typically bind to the element A/T GATA A/G and were initially characterized by their involvement in the expression of erythroid-specific genes. Now several members of the GATA transcription factor family have been described that play essential roles in the expression of genes in a number of cell types (Zheng and Blobel (2011) Genes & Cancer 1(12):1178-1188). TAL1 is a transcription factor of the basic helix-loop-helix class. It was initially identified as activated in T cell leukemia, but then shown (Shivdasani, et al. (1995) Nature 373:432) to be essential for hematopoiesis broadly, and specifically within the context of erythropoiesis, to be required at the myelo-erythroid stage. TAL-1 is an obligate heterodimer, and genome-wide occupancy data have shown that TAL-1 and GATA-1 commonly co-occupy and co-regulate target genes during erythropoiesis (Wu, et al. (2014) Gen Res 24:1945). GATA-1/TAL-1 heterodimers typically bind to the motif NT/CTATCT/ANNNNNNNNCAG/C (SEQ ID NO:1), termed the GATA1:TAL1 binding motif (Kassouf, et al. (2010) Gen Res 20:1064).
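As a rough illustration of how the binding motifs quoted above can be located in a sequence, the following Python sketch converts the slash notation used in the text into a regular expression and scans one strand of a DNA string. The function names and the example sequence are hypothetical and chosen only for illustration; the sketch assumes that "X/Y" means either base X or base Y and that "N" means any base.

```python
import re


def motif_to_regex(motif: str) -> str:
    """Convert slash notation (e.g. 'A/T GATA A/G') into a regex string.

    Assumed reading: 'X/Y' is either base X or base Y; 'N' is any base.
    """
    motif = motif.replace(" ", "")
    parts = []
    i = 0
    while i < len(motif):
        options = motif[i]
        i += 1
        # Collect alternatives joined by '/' (e.g. 'T/C' becomes 'TC').
        while i < len(motif) and motif[i] == "/":
            options += motif[i + 1]
            i += 2
        if "N" in options:
            parts.append("[ACGT]")
        elif len(options) > 1:
            parts.append("[" + options + "]")
        else:
            parts.append(options)
    return "".join(parts)


def find_sites(sequence: str, motif: str) -> list:
    """Return (start, matched_text) for every match on the given strand.

    A lookahead is used so that overlapping matches are also reported.
    """
    pattern = re.compile("(?=(" + motif_to_regex(motif) + "))")
    return [(m.start(), m.group(1)) for m in pattern.finditer(sequence.upper())]


# Motifs copied from the text.
GATA_MOTIF = "A/T GATA A/G"
GATA1_TAL1_MOTIF = "NT/CTATCT/ANNNNNNNNCAG/C"

# Hypothetical example sequence, made up for this sketch.
example = "GGACTATCTAAAAAAAACAGTT"
print(motif_to_regex(GATA_MOTIF))
print(find_sites(example, GATA1_TAL1_MOTIF))
```

The sketch scans only the strand it is given; a fuller search would also scan the reverse complement, since a binding site can lie on either strand.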
RREB1 (Ras Responsive Element Binding protein) is a broadly expressed transcription factor that contains a C2H2 zinc finger DNA binding domain. It was initially identified as required for the response of specific genes to Ras signaling (Thiagalingam, et al. (1996) Mol Cell Bio 16(10):5335). More recently, Chen, et al. ((2010) J Biol Chem 285(14):10189) discovered that the gene encoding the embryonic/fetal form of alpha globin (known as zeta globin) is bound and regulated by RREB1, thus pointing to its specific function as a regulator of the fetal-to-adult globin transition during erythropoiesis. The DNA motif that RREB1 appears to bind to is GTGGGTGG/T (Xie, et al. (2005) Nature 434(7031):338).
LMO2 belongs to a subset of the large LMO family of zinc finger proteins (Wadman, et al. (1997) EMBO J 16(11): 3145). These proteins comprise two LIM DNA binding proteins and are thought to be scaffolding proteins involved in multiprotein complex formation. The LMO2 complex is an essential transcriptional regulator in hematopoiesis, whose inappropriate regulation can contribute to the development of leukemia. The complex appears to interact with the DNA in two locations: the first is at an E box motif (5′CACGTG) and the second is a nearby GATA-1 binding motif.
Transcription factor chicken ovalbumin upstream promoter transcription factor II (COUP-TF2 (also known as NR2F2)) is widely expressed during embryonic development in a number of tissues and is part of the nuclear receptor (NR) superfamily of ligand-activated receptors. It is also a zinc finger protein and is involved in activating or repressing gene expression depending on direct binding to its motif and/or interaction with other transcription factors. COUP-TF2 recognizes a DNA motif with the following sequence: 5′ AGGTCA and serves as one of the master regulators to control developmental programs, including organogenesis, angiogenesis, cardiovascular development, reproduction, neuronal development and metabolic homeostasis (Qin, et al. (2014) Cell Biosci 4:58).
The transcription factor NR2F1 is closely related to COUP-TF2, sharing 97% and 99% homology in its ligand and DNA binding domains respectively. Both NR2F1 and COUP-TF2 form homodimers and bind the same DNA motif (Lin, et al. (2011) Endocr Rev 32(3):404).
C-Myb can act as a transcription factor in hematopoietic cells and can be regulated by lineage specific components (see Frampton, et al. (1993) EMBO J 12(4):1333). It interacts with the DNA binding motif 5′ C/TAACTGC/TCA/T (SEQ ID NO:2).
The AP-2 family of transcription factors consists of five different proteins in humans and mice: AP-2α, AP-2β, AP-2γ, AP-2δ and AP-2ε. The proteins have a characteristic helix-span-helix motif at the carboxyl terminus, which, together with a central basic region, mediates dimerization and DNA binding. AP-2 has been shown to bind to the palindromic consensus sequence 5′-GCCN3GGC-3′ found in various cellular and viral enhancers and seems to be involved with embryonic development (Eckert, et al. (2005) Genome Biol 6:246).
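The palindromic nature of the AP-2 consensus can be checked with a short Python sketch: a site is palindromic in this sense when it equals its own reverse complement, so it reads the same 5′ to 3′ on both strands. The helper names are hypothetical, and "N3" in the consensus is expanded to three "N" characters, each standing for any base.

```python
# Complement table; 'N' (any base) pairs with 'N'.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}


def reverse_complement(site: str) -> str:
    """Return the reverse complement of a DNA site written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(site))


def is_palindromic(site: str) -> bool:
    """True when the site reads the same 5' to 3' on both strands."""
    return reverse_complement(site) == site


# Motifs quoted in this section; 'GCCN3GGC' is written out in full.
motifs = {
    "AP-2 consensus": "GCC" + "N" * 3 + "GGC",
    "E box": "CACGTG",
    "COUP-TF2 motif": "AGGTCA",
}

for name, site in motifs.items():
    print(name, site, reverse_complement(site), is_palindromic(site))
```

Under these assumptions the AP-2 consensus and the E box motif quoted earlier are their own reverse complements, while the COUP-TF2 motif is not, so only the latter has a defined orientation on the DNA.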
There remains a need for additional methods and compositions for the alteration of BCL11A gene expression, for example, to treat hemoglobinopathies such as sickle cell disease and beta thalassemia.