1. Field of the Invention
The present invention relates to genes which play a part in the structural and functional regulation of chromatin, and their use in therapy and diagnosis.
2. Related Art
Higher-order chromatin is essential for epigenetic gene control and for the functional organization of chromosomes. Differences in higher-order chromatin structure have been linked with distinct covalent modifications of histone tails which regulate transcriptional ‘on’ or ‘off’ states and influence chromosome condensation and segregation.
Histones constitute a highly conserved family of proteins (H3, H4, H2A, H2B, H1) which are the major components of eucaryotic chromatin structure. Histones compact genomic DNA into basic repeating structural units, the nucleosomes. In addition to their DNA packaging function, histones have been proven to be integral components of the molecular machinery that regulates gene expression.
Post-translational modifications of histone N-termini, particularly of H4 and H3, are well-documented and have functionally been characterized as changes in acetylation, phosphorylation and, most recently, methylation. In contrast to the large number of described histone acetyltransferases (HATs) and histone deacetylases (HDACs), genes encoding enzymatic activities that regulate phosphorylation or methylation of histone N-termini are only beginning to be identified. Moreover, the interdependence of the different histone tail modifications for the integration of transcriptional output or higher-order chromatin organization is currently not understood.
Overall, there is increasing evidence that the regulation of normal and aberrant cellular proliferation is not only affected on the transcriptional level, but that also a higher level of regulation is involved, i.e., the organization of chromatin structure through the modification of histone molecules. The determination of the proteins and the molecular mechanisms involved in histone modification will contribute to the understanding of the cellular proliferation program and will thus shed light on the mechanisms involved in aberrant proliferation occurring in tumor formation and progression.
The functional organization of eucaryotic chromosomes in centromeres, telomeres and eu- and heterochromatic regions is a crucial mechanism for ensuring exact replication and distribution of genetic information on each cell division. By contrast, tumor cells are frequently characterized by chromosomal rearrangements, translocations and aneuploidy (Solomon, et al., Science 254:1153-1160 (1991); Pardue, Cell 66:427-431 (1991)).
Although the mechanisms which lead to increased chromosome instability in tumor cells have not yet been clarified, a number of experimental systems, beginning with telomeric positional effects in yeast (Renauld, et al., Genes and Dev. 7:1133-1145 (1993); Buck and Shore, Genes and Dev. 9:370-384 (1995); Allshire, et al., Cell 76:157-169 (1994)), via positional effect variegation (PEV) in Drosophila (Reuter and Spierer, BioEssays 14:605-612 (1992)), and up to the analysis of translocation breakpoints in human leukaemias (Solomon, et al., Science 254:1153-1160 (1991); Cleary, et al., Cell 66:619-622 (1991)), have made it possible to identify chromosomal proteins which are involved in causing deregulated proliferation.
First, it was found that the overexpression of a shortened version of the SIR4 protein leads to a longer life span in yeast (Kennedy, et al., Cell 80:485-496 (1995)). Since SIR proteins contribute to the formation of multimeric complexes at the stationary mating type loci and at the telomere, it could be that overexpressed SIR4 interferes with these heterochromatin-like complexes, finally resulting in uncontrolled proliferation. This assumption accords with the frequency of occurrence of a deregulated telomere length in most types of human cancer (Counter, et al., Embo. J. 11:1921-1928 (1992)).
Second, genetic analyses of PEV in Drosophila have identified a number of gene products which alter the structure of chromatin at heterochromatic positions and within the homeotic gene cluster (Reuter and Spierer, BioEssays 14:605-612 (1992)). Mutations of some of these genes, such as modulo (Garzino, et al., Embo J. 11:4471-4479 (1992)) and polyhomeotic (Smouse and Perrimon, Dev. Biol. 139:169-185 (1990)), can cause deregulated cell proliferation or cell death in Drosophila.
Third, mammalian homologues of both activators, e.g., trithorax or trx-group, and also repressors, e.g., polycomb or Pc-group, of the chromatin structure of homeotic Drosophila selector genes have been described. Among these, human HRX/ALL-1 (trx-group) has been shown to be involved in leukaemogenesis induced by translocation (Tkachuk, et al., Cell 71:691-700 (1992); Gu, et al., Cell 71:701-708 (1992)), and it has been shown that the overexpression of murine bmi (Pc-group) leads to the formation of lymphomas (Haupt, et al., Cell 65:753-763 (1991); Brunk, et al., Nature 353:351-355 (1991); Alkema, et al., Nature 374:724-727 (1995)). A model for the function of chromosomal proteins leads one to conclude that they form multimeric complexes which determine the degree of condensation of the surrounding chromatin region depending on the balance between activators and repressors in the complex (Locke, et al., Genetics 120:181-198 (1988)). A shift in this equilibrium, caused by overexpression of one of the components of the complex, exhibited a new distribution of eu- and heterochromatic regions (Buck and Shore, Genes and Dev. 9:370-384 (1995); Reuter and Spierer, BioEssays 14:605-612 (1992); Eissenberg, et al., Genetics 131:345-352 (1992)) which can destabilize the chromatin structure at predetermined loci, and lead to a transition from the normal to the transformed state.
In spite of the characterization of HRX/ALL-1 and bmi as protooncogenes which are capable of changing the chromatin structure, knowledge of mammalian gene products which interact with chromatin is still very limited. By contrast, by genetic analyses of PEV in Drosophila, about 120 alleles for chromatin regulators have been described (Reuter and Spierer, BioEssays 14:605-612 (1992)).
Recently, a carboxy-terminal region was identified with similarity in the sequence to a positive (trx (trx-group)) and a negative (E(z) (Pc-group)) Drosophila chromatin regulator (Jones and Gelbart, MCB 13(10):6357-6366 (1993)). Moreover, this carboxy terminus is conserved in Su(var)3-9, a member of the Su(var) group, and a dominant suppressor of chromatin distribution in Drosophila (Tschiersch, et al., Embo J. 13(16):3822-3831 (1994)).
Genetic screens for suppressors of position effect variegation (PEV) in Drosophila and S. pombe have identified a subfamily of approximately 30-40 loci which are referred to as Su(var)-group genes. Interestingly, several histone deacetylases, protein phosphatase type 1 and S-adenosyl methionine synthetase have been classified as Su(var)s. In contrast, Su(var)2-5 (which is allelic to HP1), Su(var)3-7 and Su(var)3-9 encode heterochromatin-associated proteins. Su(var) gene function thus suggests a model in which modifications at the nucleosomal level may initiate the formation of defined chromosomal subdomains that are then stabilized and propagated by heterochromatic SU(VAR) proteins. Su(var)3-9 is dominant over most PEV modifier mutations, and mutants in the corresponding S. pombe clr4 gene disrupt heterochromatin association of other modifying factors and result in chromosome segregation defects. Recently, human (SUV39H1) and murine (Suv39h1 and Suv39h2) Su(var)3-9 homologues have been isolated. It has been shown that they encode heterochromatic proteins which associate with mammalian HP1. The SU(VAR)3-9 protein family combines two of the most evolutionarily conserved domains of ‘chromatin regulators’: the chromo and the SET domain. Whereas the 60 amino acid chromo domain represents an ancient histone-like fold that directs eu- or heterochromatic localizations, the molecular role of the 130 amino acid SET domain has remained enigmatic. Overexpression studies with human SUV39H1 mutants indicated a dominant interference with higher-order chromatin organization that, surprisingly, suggested a functional relationship between the SET domain and the distribution of phosphorylated (at serine 10) H3.
The experiments of the present invention show that mammalian SUV39H1 or Suv39h proteins are SET domain-dependent, H3-specific histone methyltransferases (HMTases) which selectively methylate lysine 9 of the H3 N-terminus. Methylation of lysine 9 negatively regulates phosphorylation of serine 10 and reveals a ‘histone code’ that appears intrinsically linked to the organization of higher-order chromatin.
The Su(var)3-9 protein family combines two of the most evolutionarily conserved domains of chromatin regulators: the chromo (Aasland, R. and Stewart, A. F., Nucleic Acids Res 23:3168-74 (1995); Koonin, E. V., et al., Nucleic Acids Res 23:4229-33 (1995)) and the SET (Jenuwein, T., et al., Cell Mol Life Sci 54:80-93 (1998)) domain. Whereas the 60 amino acid chromo domain represents an ancient histone-like fold (Ball, L. J., et al., EMBO J 16:2473-2481 (1997)) that directs eu- or heterochromatic localizations (Platero, J. S., et al., Embo J 14:3977-86 (1995)), the molecular role of the 130 amino acid SET domain has remained enigmatic.
The present invention started from the premise that the protein domain referred to as “SET” (Tschiersch, et al., Embo J. 13(16):3822-3831 (1994)) defines a new genetic family of mammalian chromatin regulators which are important in terms of their developmental history on account of their evolutionary conservation and their presence in antagonistic gene products. Moreover, the characterization of other members of the group of SET domain genes, apart from HRX/ALL-1, helps to explain the mechanisms which are responsible for structural changes in chromatin possibly leading to malignant transformation.
One aspect of the present invention is therefore to identify mammalian, such as human and murine, chromatin regulator genes, clarify their function and use them for diagnosis and therapy. More specifically, the sequences of the SUV39H proteins, and variants thereof, and EZH2 proteins, and variants thereof, according to the invention, may be used to analyze the interaction of SET domain proteins with chromatin or with other members of heterochromatin complexes. Starting from the findings thus obtained regarding the mode of activity of these proteins, the detailed possibilities for targeted intervention in the mechanisms involved therein are defined and may be used for therapeutic applications as described in detail below.
In order to achieve this objective, the sequence information of the SET domain was used to obtain the human cDNA homologous to the SET domain genes of Drosophila from human cDNA banks. Two cDNAs were obtained which constitute human homologues of E(z) and Su(var)3-9. The corresponding human genes are referred to as EZH2 and SUV39H. See FIGS. 6 and 7. In addition, a variant form of EZH2 was identified which is referred to as EZH1. See FIG. 8.
The present invention thus relates to DNA molecules containing a nucleotide sequence coding for a chromatin regulator protein which has a SET-domain, or a partial sequence thereof, characterized in that the nucleotide sequence is that shown in FIG. 6 (SEQ ID NO:1), or a partial sequence thereof, or FIG. 7 (SEQ ID NO:3), or a partial sequence thereof. The DNA molecules, including variants and mutants thereof such as dominant-negative mutants, are also referred to as “genes according to the invention.” Two examples of genes according to the invention are designated EZH2 and SUV39H. They were originally referred to as “HEZ-2” and “H3-9,” respectively.
According to another aspect, the invention relates to the cDNAs derived from the genes of the invention, including the degenerate variants thereof, and mutants thereof, which code for functional chromatin regulators and which can be traced back to gene duplication. An example of this is EZH1 (SEQ ID NO:5), the partial sequence of which is shown by comparison with EZH2 (SEQ ID NO:1) in FIG. 8.
According to another aspect, the invention relates to recombinant DNA molecules containing the cDNA molecules, functionally connected to expression control sequences, for expression in procaryotic or eucaryotic host organisms. Thus, the invention further relates to procaryotic or eucaryotic host organisms transformed with the recombinant DNA.
The invention further relates to antisense (deoxy)ribonucleotides with complementarity to a partial sequence of an inventive DNA molecule.
The invention further relates to transgenic animals, such as transgenic mice, which comprise a transgene for the expression of a chromatin regulator gene which has a SET domain, or a mutated version or degenerate variant of such a gene.
The invention further relates to knock-out animals such as knock-out mice, obtainable from embryonic stem cells in which the endogenous mouse loci for EZH1 and SUV39H are interrupted by homologous recombination.
The invention further relates to a process for identifying mammalian chromatin regulator genes which have a SET domain, or mutated versions thereof, wherein mammalian cDNA or genomic DNA libraries are hybridized under non-stringent conditions with a DNA molecule coding for the SET domain or a portion thereof.
The invention further relates to antibody molecules which bind to a polypeptide which contains the amino acid sequence depicted in SEQ ID NOS:2 or 4 or degenerate variants or mutants thereof.
Other aspects of the invention are set forth in the Detailed Description of the Preferred Embodiments.