The present invention generally relates to a novel tissue-specific matrix- or scaffold-associating DNA region (MAR) binding protein. The invention additionally relates to the cloning of a human cDNA encoding the novel protein and to antibodies specifically reactive with such protein.
Eukaryotic chromosomes are thought to be organized into a higher-order structure consisting of discrete and topologically independent loop domains, which would be fastened at their bases to the intranuclear framework by non-histone proteins, as reviewed in Gasser and Laemmli, Trends Genet. 3:16-22 (1987). The loop organization of chromosomes may be important not only for compaction of the chromatin fiber, but also for the regulation of gene expression and replication. Each domain is believed to represent an independent unit of gene activity, which would be insulated from the regulatory mechanisms of neighboring domains and thus protected from chromosomal position effects.
The above model implies that specific DNA sequences exist at the bases of the DNA loops and that proteins bind to these sequences to separate one domain from another. Much effort has been devoted to identifying these sequences. One biochemical criterion used to define putative boundary sequences is their high binding affinity to the nuclear matrix or scaffold, which is defined as the residual structure left in the nucleus after removal of histones and other proteins. Specific DNA segments that strongly bind the nuclear matrix/scaffold have been identified in a number of different species including human, mouse, Drosophila, chicken, plant, and yeast, as reviewed in Phi-Van and Stratling, Prog. Mol. Subcell. Biol. 11:1-11 (1990). These sequences are called MARs or SARs for matrix- or scaffold-associating regions (collectively referred to herein as MARs). Such MARs often contain or are located in close vicinity to regulatory sequences, including enhancer sequences.
Although some MARs are found in intragenic locations, most MARs are found at the boundaries of transcription units where they may delimit the ends of an active chromatin domain. Furthermore, A-elements of the chicken lysozyme gene, which contain DNA with high affinity to the nuclear matrix as described in Phi-Van and Stratling, EMBO J. 7:655-664 (1988), augment the transcriptional activity of a linked gene in a position-independent, copy number-dependent manner in stably transfected cells, suggesting that MARs can act as boundary sequences in vivo. The locus control region of the human β-globin domain, characterized by a set of tissue-specific DNase I hypersensitive sites, also contains MARs and confers copy number-dependent high levels of erythroid-specific expression to a linked gene. The specific role of MARs in either A-element or locus control region activity remains unclear. Recently, specialized chromatin structures (scs and scs'), which are MAR-like AT-rich sequences located at the boundaries of a Drosophila heat-shock gene, were shown to insulate against the regulatory influence of adjacent domains (Kellum and Schedl, Cell 64:941-950 (1991)).
MARs are in general AT-rich (approximately 70% A+T) and are preferentially bound and cleaved by topoisomerase II. However, there is no consensus sequence known for MARs. The topoisomerase II consensus derived from Drosophila and vertebrates is only loosely defined. A specialized DNA structure formed by certain AT-rich sequences may be important for their biological function. The significance of structural characteristics of MARs, such as DNA bending and a narrow minor groove due to oligo(dA) tracts, has been previously proposed.
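The AT-richness figure above can be illustrated with a minimal sketch that computes the A+T fraction of a sequence and reports sliding windows at or above approximately 70%. The function names, the window size, and the default threshold are illustrative assumptions and are not taken from the cited work. Because no consensus sequence is known for MARs, a window flagged this way is only a candidate AT-rich region and is not thereby shown to be a MAR.

```python
def at_fraction(seq: str) -> float:
    """Return the fraction of A and T bases in a DNA sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    return sum(base in "AT" for base in seq) / len(seq)


def at_rich_windows(seq: str, window: int = 100, threshold: float = 0.70):
    """Return (start, fraction) for each window whose A+T fraction meets the threshold.

    The window size and threshold are hypothetical defaults chosen for illustration.
    """
    seq = seq.upper()
    hits = []
    for start in range(0, len(seq) - window + 1):
        frac = at_fraction(seq[start:start + window])
        if frac >= threshold:
            hits.append((start, frac))
    return hits
```

For example, `at_rich_windows("AAAATTTTGGGGCCCC", window=8)` reports only the windows starting at positions 0, 1, and 2, since later windows fall below the 70% threshold.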
By employing a probe specific for unpaired DNA, chloroacetaldehyde (CAA) (Kohwi-Shigematsu et al., Proc. Natl. Acad. Sci. (U.S.A.) 80:4389-4393 (1983); Kohwi-Shigematsu and Kohwi, Cell 43:199-206 (1985)), it has been demonstrated that naturally occurring MARs from different species are characterized by their strong potential for extensive base-unpairing, or unwinding, when subjected to superhelical strain (Kohwi-Shigematsu and Kohwi, Biochem. 29:9551-9560 (1990)). This unwinding property was shown to be important for binding to the nuclear matrix and for the augmentation of gene expression in stable transformants (Bode et al., Science 195-197 (1992)).
For example, two MARs flanking the immunoglobulin heavy chain (IgH) gene enhancer, described in Cockerill et al., J. Biol. Chem. 262:5394-5397 (1987), become continuously unpaired over a distance of more than 200 base pairs in supercoiled plasmid DNA. A short sequence motif, ATATATT, within the MAR located 3' of the IgH enhancer was delineated as a nucleation site for unwinding. Point mutations substituting three bases in this sequence completely abolished the unwinding property of the MAR. In a subsequent study (Bode et al., (1992) supra) it was shown that a concatemerized, double-stranded 25 base pair oligonucleotide containing the unwinding core sequence of the 3' MAR behaved like a typical MAR. This synthetic MAR was capable of unwinding under superhelical strain, bound strongly to the nuclear matrix with an affinity comparable to that observed with the 2 kilobase MAR from the human β-interferon (huIFN-β) gene, and enhanced transcription of a linked reporter gene in stable transformants. However, none of these features were observed with a similarly concatemerized, double-stranded oligonucleotide derived from the mutated core sequence: the unpairing property was lost, the binding affinity to the nuclear matrix was greatly reduced, and no enhancement of transcription was detected.
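The sequence-level operations described above, namely locating the ATATATT motif and building a head-to-tail concatemer of an oligonucleotide, can be sketched as follows. Only the motif itself is taken from the text. The function names and the short oligonucleotide in the usage note are hypothetical placeholders, and the sketch does not reproduce the 25 base pair oligonucleotide or the mutated core sequence of the cited study. The code scans a single strand only and says nothing about unwinding or matrix binding, which were determined experimentally.

```python
UNWINDING_CORE = "ATATATT"  # motif named in the text as the nucleation site


def find_motif(seq: str, motif: str = UNWINDING_CORE) -> list:
    """Return 0-based start positions of every motif occurrence, overlaps included."""
    seq, motif = seq.upper(), motif.upper()
    positions = []
    start = seq.find(motif)
    while start != -1:
        positions.append(start)
        # Resume one base later so overlapping occurrences are not missed.
        start = seq.find(motif, start + 1)
    return positions


def concatemerize(oligo: str, copies: int) -> str:
    """Return `copies` head-to-tail repeats of an oligonucleotide sequence."""
    if copies < 1:
        raise ValueError("copies must be at least 1")
    return oligo.upper() * copies
```

With the placeholder oligonucleotide `"GGATATATTCC"`, `find_motif(concatemerize("GGATATATTCC", 3))` locates one copy of the motif per repeat, whereas a placeholder variant with three substituted bases in the motif, such as `"GGACGCATTCC"`, yields no matches.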
The unwinding property of MARs may be important in effectively relieving negative superhelical strain that could accumulate in a looped DNA domain and in preventing its influence on neighboring domains. If certain AT-rich sequences are biologically significant due to their intrinsic structural property, it would be advantageous to identify a protein that recognizes and distinguishes AT-rich sequences that can unwind from those that cannot unwind. Such a protein could be a MAR-binding protein. Except for topoisomerase II, little is known about scaffold proteins in higher eukaryotes. Recently, a MAR-binding protein, ARBP (Attachment Region Binding Protein), that binds to MARs from different species has been purified from chicken oviduct. However, a gene for ARBP has not yet been reported.
Thus, a need exists for identifying MAR-binding proteins and nucleic acids encoding such proteins. The present invention satisfies this need and provides related advantages as well.