This invention is in the field of plant molecular biology. More specifically, this invention pertains to nucleic acid fragments encoding proteins that interact with nuclear matrix proteins and function as transcriptional activators.
The nuclear matrix hypothesis proposes a structural framework for the eukaryotic nucleus that is similar to the cytoskeleton. To date, its best characterized component is the lamina, a filamentous protein network that lines the inner membrane of the nuclear envelope. Major components of the lamina include a group of intermediate-filament (IF) proteins, collectively known as nuclear lamins, that are classified as type A, B, and C (McKeon et al., Nature 319:463-468 (1986)). Lamin B is attached to the inner nuclear membrane via a C-terminal C15 farnesyl group (Schafer et al., Annu. Rev. Genet. 30:209-237 (1992)), whereas lamins A and C bind to lamin B. Other integral membrane proteins interact with lamin B and most likely stabilize the membrane attachment of lamins (Furukawa et al., EMBO J. 14:1626-1636 (1995)). Recent studies have also demonstrated the ability of lamins A and B to bind DNA, suggesting a role for mammalian lamins in anchoring chromatin to the nuclear envelope. The interaction between nuclear envelope, lamina, and chromatin is considered to be of fundamental importance for higher order chromosome organization, as well as the assembly and disassembly of the nuclear envelope during mitosis (Furukawa et al., EMBO J. 14:1626-1636 (1995)).
The nuclear matrix is a second structural skeleton that has been biochemically defined as the insoluble component that remains after treatment of isolated nuclei with DNase I and extraction of proteins with high-salt solutions (Berezney et al., Biochem. Biophys. Res. Comm. 60:1410-1417 (1974)) or the chaotropic agent lithium diiodosalicylate (Mirkovitch et al., Cell 39:223-232 (1984)). Chromatin binds to the nuclear matrix via matrix attachment regions (MARs) in the DNA. MARs are generally AT-rich DNA sequences that are several hundred base pairs long and localized to noncoding regions of the DNA, often flanking genes (Gasser et al., Trends Genet. 3:16-22 (1987)). However, there is no consensus sequence known for MARs. The significance of structural characteristics for MARs, such as DNA bending and a narrow minor groove due to oligo(dA) tracts, has been previously proposed. MARs have been shown to increase transcriptional activity of a linked gene and to confer position-independent, copy-number dependent expression in stably transfected cells (Phi-Van et al., EMBO J. 7:655-664 (1988)).
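By way of illustration only, the AT-richness noted above can be examined with a simple sliding-window scan. The sketch below is not a MAR predictor, since no consensus sequence is known; the function names, the 200 bp window, and the 0.7 threshold are hypothetical choices made for the example and are not taken from the cited references.

```python
def at_fraction(seq: str) -> float:
    """Return the fraction of A/T bases in seq (case-insensitive)."""
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    return sum(base in "AT" for base in seq) / len(seq)


def at_rich_windows(seq: str, window: int = 200, threshold: float = 0.7):
    """Yield (start, fraction) for every window whose AT fraction
    is at least `threshold`.

    `window` and `threshold` are illustrative assumptions only.
    """
    for start in range(0, len(seq) - window + 1):
        frac = at_fraction(seq[start:start + window])
        if frac >= threshold:
            yield start, frac
```

A region flagged by such a scan is merely AT-rich; binding to the nuclear matrix would still have to be shown experimentally.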
A small number of MAR binding proteins have been identified from animal nuclei, and they are considered to be components of the nuclear matrix (von Kries et al., Cell 64:123-135 (1991); Dickinson et al., Cell 70:631-645 (1992); Romig et al., EMBO J. 11:3431-3440 (1992); Tsutsui et al., J. Biol. Chem. 268:12886-12894 (1993); Renz et al., Nucleic Acids Res. 24:843-849 (1996); U.S. Pat. No. 5,652,340). In addition, it has been shown that lamins specifically bind to MARs (Luderus et al., Mol. Cell. Biol. 14:6297-6305 (1994)). The specific interaction between DNA and the nuclear matrix/nuclear lamina is most likely an important mechanism for long-range gene regulation and higher order chromatin organization (Gasser et al., Trends Genet. 3:16-22 (1987)).
Most investigations into structural components of the nucleus have focused on proteins in vertebrates and Drosophila. Significantly less information is available for other eukaryotes, and in particular for plants. Proteins that are immunologically related to animal IF proteins and lamins have been detected in pea and carrot nuclei (Beven et al., J. Mol. Biol. 228:41-57 (1991); McNulty et al., J. Cell Sci. 103:407-414 (1992)). Plant nuclear matrix preparations that bind to animal MARs have been reported, suggesting that proteins with similar DNA binding specificities exist in plants as well (Hall et al., Proc. Natl. Acad. Sci. USA 88:9320-9324 (1991)).
Effects of MARs on gene expression in plants have been reported, but have been quite variable. In some experimental systems, no reduction of variability but an increase in expression level has been reported (Breyne et al., Plant Cell 4:463-471 (1992); Allen et al., Plant Cell 5:603-613 (1993); Allen et al., Plant Cell 8:899-913 (1996); U.S. Pat. No. 5,773,689). Other authors have found no significant increase in expression level, but a reduction of variability (van der Geest et al., Plant J. 6:413-423 (1994); Mlynarova et al., Plant Cell 6:417-426 (1994)). It is not clear what causes these observed differences, but they will most probably be due to the fact that MARs establish different molecular interactions, which might either depend on the features of the MAR itself or on the specific molecular environment of the transformed cell/tissue. The routine use of MARs for strategies to improve transgene expression will greatly depend on the characterization of the proteins involved in DNA-nuclear matrix attachment and the factors responsible for the observed increase in gene expression.
Currently, no sequence information is available for plant lamin-like proteins. However, the cloning of the cDNA for a plant MAR-binding protein, MFP1, from tomato has been reported (Meier et al., Plant Cell 8:2105-2115 (1996)). MFP1 has structural features of a filament-like protein and it preferentially binds to MAR DNA sequences from both plants and animals. In contrast to other known MAR binding proteins, MFP1 contains a hydrophobic N-terminal amino acid sequence that might function as a membrane-spanning domain. MFP1, therefore, has features of a novel anchor protein that most likely connects chromatin via MAR DNA with the nuclear envelope and nuclear filament proteins.
In order to routinely use the attachment of transgenes to the nuclear matrix to improve gene expression, it will be necessary to further characterize the elements involved in this process and to better understand the underlying mechanisms. Thus, a need exists to identify and characterize additional nuclear matrix proteins. The present invention presents six previously unknown proteins that are localized in the nuclear matrix, bind to a MAR-binding protein or to a protein that binds to a MAR-binding protein, or are able to increase gene expression.
Applicants provide a method for regulating gene expression in a stably transformed transgenic plant cell which comprises combining into the genome of the plant cell:
(a) a first chimeric gene comprising in the 5′ to 3′ direction:
(1) a promoter operably-linked to at least one DNA-binding domain sequence;
(2) a coding sequence or a complement thereof operably-linked to the promoter; and
(3) a polyadenylation signal sequence operably-linked to the coding sequence or a complement thereof;
provided that when the promoter is a minimal promoter then the DNA-binding domain sequence is located upstream of the minimal promoter; and
(b) a second chimeric gene comprising in the 5′ to 3′ direction:
(1) a promoter;
(2) a DNA sequence encoding a DNA-binding domain;
(3) a DNA sequence selected from the group consisting of SEQ ID NO:3 and SEQ ID NO:14 operably-linked to the DNA sequence of (2); and
(4) a polyadenylation signal sequence operably-linked to the DNA sequence of (3),
wherein the expression of the second chimeric gene regulates expression of the first chimeric gene.
Applicants also provide a further method for regulating gene expression in a stably transformed transgenic plant cell which comprises (a) transforming the genome of the plant cell with:
(1) a chimeric gene comprising in the 5′ to 3′ direction:
(i) a promoter operably-linked to at least one DNA-binding domain sequence;
(ii) a coding sequence or a complement thereof operably-linked to the promoter; and
(iii) a polyadenylation signal sequence operably-linked to the coding sequence or a complement thereof;
provided that when the promoter is a minimal promoter then the DNA-binding domain sequence is located upstream of the minimal promoter, and
(b) infecting the plant produced in (a) with a viral vector comprising:
(1) a promoter;
(2) a DNA sequence encoding a DNA-binding domain;
(3) a DNA sequence selected from the group consisting of SEQ ID NO:3 and SEQ ID NO:14 operably-linked to the DNA sequence of (2); and
(4) a polyadenylation signal sequence operably-linked to the DNA sequence of (3);
wherein the expression of the viral vector regulates expression of the chimeric gene of (a). In this method, the preferred DNA-binding domain of (a)(1)(i) is a GAL4 binding domain. Also part of these two method inventions are transformed plants having at least one gene whose expression is regulated using either of these two methods. In the non-viral method, the invention additionally includes seeds obtained from the plants so transformed.
Applicants also provide as part of the invention certain isolated nucleic acid molecules. The isolated nucleic acid molecules encompassed in the invention are those encoding plant MFP1-binding proteins and those encoding plant MAF1-binding proteins.
The invention more specifically encompasses an isolated nucleic acid molecule encoding a plant MFP1-binding protein selected from the group consisting of:
(a) an isolated nucleic acid molecule encoding the amino acid sequence selected from the group consisting of SEQ ID NO:2, SEQ ID NO:4, SEQ ID NO:19, SEQ ID NO:21, SEQ ID NO:23, SEQ ID NO:25, SEQ ID NO:27, SEQ ID NO:29, SEQ ID NO:31, SEQ ID NO:33, SEQ ID NO:35 and SEQ ID NO:37;
(b) an isolated nucleic acid molecule that hybridizes with (a) under the following hybridization conditions: 0.1×SSC, 0.1% SDS at 65° C.; and
(c) an isolated nucleic acid molecule that is completely complementary to (a) or (b).
The invention also encompasses the isolated nucleic acid molecule encoding a plant MAF1-binding protein selected from the group consisting of:
(a) an isolated nucleic acid molecule encoding the amino acid sequence selected from the group consisting of SEQ ID NO:9, SEQ ID NO:11, SEQ ID NO:13, SEQ ID NO:15 and SEQ ID NO:17;
(b) an isolated nucleic acid molecule that hybridizes with (a) under the following hybridization conditions: 0.1×SSC, 0.1% SDS at 65° C.; and
(c) an isolated nucleic acid molecule that is completely complementary to (a) or (b).
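By way of illustration only, the relationship described in item (c), a molecule that is completely complementary to another, can be expressed computationally as follows. The function names are hypothetical, and the sketch assumes unambiguous DNA bases (A, C, G, T) written 5′ to 3′; it makes no statement about hybridization behavior under the conditions of item (b).

```python
# Translation table mapping each DNA base to its Watson-Crick partner.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(_COMPLEMENT)[::-1]


def is_completely_complementary(strand_a: str, strand_b: str) -> bool:
    """True if strand_b pairs with strand_a at every position with no
    mismatches or overhangs (both strands written 5' to 3')."""
    return strand_a.upper() == reverse_complement(strand_b).upper()
```

For example, `reverse_complement("ATGC")` returns `"GCAT"`, and the two strands are completely complementary to each other under this definition.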
The invention further encompasses the polypeptides respectively encoded by the isolated nucleic acid molecule described above for MFP1-binding protein or by the isolated nucleic acid molecule described above for MAF1-binding protein. The preferred polypeptides are those having at least 50% identity with the amino acid sequences identified by the SEQ ID NOs 2 and 4 for the MFP1-binding protein and having at least 95% identity with the amino acid sequences identified by the SEQ ID NOs specified above for the MAF1-binding protein, respectively.
The invention also encompasses chimeric genes comprising (1) the isolated nucleic acid molecule described above encoding the MFP1-binding protein or the isolated nucleic acid molecule described above encoding the MAF1-binding protein operably-linked to (2) suitable regulatory sequences. The invention also encompasses host cells transformed with each of the chimeric genes described above. In both cases the host cell is preferably a plant cell or E. coli.
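By way of illustration only, a percent identity figure of the kind recited above can be computed from two amino acid sequences that have already been aligned. The sketch below assumes the alignment is supplied as two equal-length strings with "-" marking gaps, and it counts identical residues over the full alignment length; that denominator is an assumption made for the example, since alignment programs differ in how they report identity. The function name is hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of alignment columns holding the same residue in both
    sequences. Inputs must be pre-aligned, equal-length strings that
    use '-' for gaps. Gap columns never count as matches."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("alignment must not be empty")
    matches = sum(
        x == y and x != "-" for x, y in zip(aligned_a, aligned_b)
    )
    return 100.0 * matches / len(aligned_a)
```

Under this definition, a candidate polypeptide would meet a 50% threshold when `percent_identity(...) >= 50.0` for its alignment against the reference sequence.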
Applicants also provide a method of altering the level of expression of binding protein in a host cell comprising:
(a) transforming a host cell with a chimeric gene comprising the isolated nucleic acid molecule described above for either MFP1-binding protein or for MAF1-binding protein, respectively; and
(b) growing the transformed host cell of step (a) under conditions that are suitable for expression of the particular chimeric gene,
resulting in production of altered levels of the particular binding protein in the transformed host cell relative to expression levels of an untransformed host cell.
Applicants further provide a method of obtaining a nucleic acid molecule encoding all or a substantial portion of an amino acid sequence encoding either a MFP1-binding protein or a MAF1-binding protein comprising:
(a) probing a cDNA or genomic library with the nucleic acid molecule described above corresponding to either the MFP1-binding protein or the MAF1-binding protein;
(b) identifying a DNA clone that hybridizes with the nucleic acid molecule used as a probe in (a); and
(c) sequencing the cDNA or genomic fragment that comprises the clone identified in step (b),
wherein the sequenced cDNA or genomic fragment encodes all or substantially all of the amino acid sequence encoding the particular binding protein. The invention further encompasses the products of this method.
Applicants further provide a method of obtaining a nucleic acid molecule encoding all or a substantial portion of the amino acid sequence encoding either a MFP1-binding protein or a MAF1-binding protein comprising:
(a) synthesizing an oligonucleotide primer corresponding to a portion of (1) the sequence selected from the group consisting of SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:18, SEQ ID NO:20, SEQ ID NO:22, SEQ ID NO:24, SEQ ID NO:26, SEQ ID NO:28, SEQ ID NO:30, SEQ ID NO:32, SEQ ID NO:34 and SEQ ID NO:36 or (2) the sequence selected from the group consisting of SEQ ID NO:8, SEQ ID NO:10, SEQ ID NO:12, SEQ ID NO:14 and SEQ ID NO:16; and
(b) amplifying a cDNA insert present in a cloning vector using the oligonucleotide primer of step (a) and a primer representing sequences of the cloning vector,
wherein the amplified cDNA insert encodes a portion of an amino acid sequence encoding a plant MFP1-binding protein or encodes a portion of an amino acid sequence encoding a plant MAF1-binding protein. The invention further includes the products obtained by this method.
Applicants also provide a method for evaluating at least one chemical compound for its ability to inhibit the activity of a plant MFP1-binding protein, comprising the steps of:
(a) contacting at least one chemical compound with a host cell, to form a test system, the host cell comprising:
(i) a first hybrid protein comprising a first protein fused to a DNA binding domain of a transcriptional activator;
(ii) a second hybrid protein comprising a second protein fused to an activation domain of a transcriptional activator, and
(iii) a reporter gene,
wherein the first or second protein is encoded by MFP1, wherein the remaining first or second protein is encoded by the nucleic acid fragment described above encoding a plant MFP1-binding protein and wherein the second hybrid protein binds to the first hybrid protein which allows activation of the reporter gene;
(b) incubating the test system for a suitable time to permit inhibition of the reporter gene;
(c) monitoring the expression of the reporter gene of step (b); and
(d) evaluating at least one compound for its ability to inhibit the activity of a plant MFP1-binding protein on the basis of the level of reporter gene expression of step (c).
Furthermore, this evaluation method also encompasses a method for evaluating at least one compound for its ability to inhibit the activity of a plant MAF1-binding protein, comprising the steps of:
(a) contacting at least one chemical compound with a host cell, to form a test system, the host cell comprising:
(i) a first hybrid protein comprising a first protein fused to a DNA binding domain of a transcriptional activator;
(ii) a second hybrid protein comprising a second protein fused to an activation domain of a transcriptional activator, and
(iii) a reporter gene,
wherein the first or second protein is encoded by the nucleic acid molecule encoding a plant MAF1-binding protein as described above, and wherein the second hybrid protein binds to the first hybrid protein which allows activation of the reporter gene;
(b) incubating the test system for a suitable time to permit inhibition of the reporter gene;
(c) monitoring the expression of the reporter gene of step (b); and
(d) evaluating at least one compound for its ability to inhibit the activity of a plant MAF1-binding protein on the basis of the level of reporter gene expression of step (c).
With regard to the plant MFP1-binding protein in the evaluation method, the preferred nucleic acid molecule is selected from the group consisting of SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:18, SEQ ID NO:20, SEQ ID NO:22, SEQ ID NO:24, SEQ ID NO:26, SEQ ID NO:28, SEQ ID NO:30, SEQ ID NO:32, SEQ ID NO:34 and SEQ ID NO:36 and the MFP1-binding protein is selected from the group consisting of SEQ ID NO:2, SEQ ID NO:4, SEQ ID NO:19, SEQ ID NO:21, SEQ ID NO:23, SEQ ID NO:25, SEQ ID NO:27, SEQ ID NO:29, SEQ ID NO:31, SEQ ID NO:33, SEQ ID NO:35 and SEQ ID NO:37. With regard to the plant MAF1-binding protein in the evaluation method, the preferred nucleic acid fragment is selected from the group consisting of SEQ ID NO:8, SEQ ID NO:10, SEQ ID NO:12, SEQ ID NO:14 and SEQ ID NO:16 and the MAF1-binding protein is selected from the group consisting of SEQ ID NO:9, SEQ ID NO:11, SEQ ID NO:13, SEQ ID NO:15 and SEQ ID NO:17.
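By way of illustration only, the evaluation of step (d) in the methods above can be reduced to a comparison of reporter gene expression with and without the test compound. The sketch below assumes that reporter expression has been measured in arbitrary units for a compound-treated test system, for an untreated control, and optionally for a background control lacking the hybrid protein interaction; these controls, the function name, and the formula are assumptions made for the example and are not prescribed by the methods themselves.

```python
def percent_inhibition(treated: float, untreated: float,
                       background: float = 0.0) -> float:
    """Percent reduction in reporter signal caused by a test compound.

    treated    -- reporter level in the compound-treated test system
    untreated  -- reporter level in the same system without compound
    background -- reporter level with no interaction (assumed control)
    """
    signal = untreated - background
    if signal <= 0:
        raise ValueError("untreated signal must exceed background")
    return 100.0 * (1.0 - (treated - background) / signal)
```

For example, a treated reading of 25 units against an untreated reading of 100 units with no background correction corresponds to 75% inhibition.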
The following sequence descriptions and sequence listings attached hereto comply with the rules governing nucleotide and/or amino acid sequence disclosures in patent applications as set forth in 37 C.F.R. §1.821-1.825. The Sequence Descriptions contain the one letter code for nucleotide sequence characters and the three letter codes for amino acids as defined in conformity with the IUPAC-IUB standards described in Nucleic Acids Research 13:3021-3030 (1985) and in the Biochemical Journal 219(2):345-373 (1984) which are herein incorporated by reference. The symbols and format used for nucleotide and amino acid sequence data comply with the rules set forth in 37 C.F.R. §1.822. The present invention utilized Wisconsin Package Version 9.0 software from Genetics Computer Group (GCG), Madison, Wis.
SEQ ID NO:1 is the nucleotide sequence of MAF1.
SEQ ID NO:2 is the deduced amino acid sequence of MAF1.
SEQ ID NO:3 is the nucleotide sequence of NMP1.
SEQ ID NO:4 is the deduced amino acid sequence of NMP1.
SEQ ID NO:5 is the consensus sequence for a GAL4 binding site.
SEQ ID NO:6 and SEQ ID NO:7 are the oligonucleotides used to form the GAL4 binding site cassette described in Example 2.
SEQ ID NO:8 is the nucleotide sequence of FLIP1.
SEQ ID NO:9 is the deduced amino acid sequence of FLIP1.
SEQ ID NO:10 is the nucleotide sequence of FLIP2.
SEQ ID NO:11 is the deduced amino acid sequence of FLIP2.
SEQ ID NO:12 is the nucleotide sequence of FLIP3.
SEQ ID NO:13 is the deduced amino acid sequence of FLIP3.
SEQ ID NO:14 is the nucleotide sequence of FLIP4.
SEQ ID NO:15 is the deduced amino acid sequence of FLIP4.
SEQ ID NO:16 is the nucleotide sequence of pD1.
SEQ ID NO:17 is the deduced amino acid sequence of pD1.
SEQ ID NO:18 is the full cDNA sequence in clone cta1n.pk0074.f12 encoding MAF1.
SEQ ID NO:19 is the deduced amino acid sequence of a corn MAF1 derived from the nucleotide sequence of SEQ ID NO:18.
SEQ ID NO:20 is the full cDNA sequence in clone ss1.pk0021.e2 encoding MAF1.
SEQ ID NO:21 is the deduced amino acid sequence of a soybean MAF1 derived from the nucleotide sequence of SEQ ID NO:20.
SEQ ID NO:22 is the full cDNA sequence in clone se1.pk0050.g5 encoding MAF1.
SEQ ID NO:23 is the deduced amino acid sequence of a soybean MAF1 derived from the nucleotide sequence of SEQ ID NO:22.
SEQ ID NO:24 is the nucleotide sequence comprising a portion of the cDNA insert in clone wle1n.pk0104.e10 encoding MAF1.
SEQ ID NO:25 is the deduced amino acid sequence of a wheat MAF1 derived from the nucleotide sequence of SEQ ID NO:24.
SEQ ID NO:26 is the nucleotide sequence comprising a portion of the cDNA insert in clone ect1c.pk001.11 encoding MAF1.
SEQ ID NO:27 is the deduced amino acid sequence of a Canna edulis MAF1 derived from the nucleotide sequence of SEQ ID NO:26.
SEQ ID NO:28 is the nucleotide sequence comprising a portion of the cDNA insert in clone pps.pk0009.b7 encoding MAF1.
SEQ ID NO:29 is the deduced amino acid sequence of a Picramnia pentandra MAF1 derived from the nucleotide sequence of SEQ ID NO:28.
SEQ ID NO:30 is the full cDNA sequence in clone cbn2.pk0003.a12 encoding NMP1.
SEQ ID NO:31 is the deduced amino acid sequence of a corn NMP1 derived from the nucleotide sequence of SEQ ID NO:30.
SEQ ID NO:32 is the nucleotide sequence comprising a portion of the cDNA insert in clone wr1.pk0025.c2 encoding NMP1.
SEQ ID NO:33 is the deduced amino acid sequence of a wheat NMP1 derived from the nucleotide sequence of SEQ ID NO:32.
SEQ ID NO:34 is the nucleotide sequence comprising a portion of the cDNA insert in clone ph1t.pk0024.h5 encoding NMP1.
SEQ ID NO:35 is the deduced amino acid sequence of a Phaseolus lunatus NMP1 derived from the nucleotide sequence of SEQ ID NO:34.
SEQ ID NO:36 is the nucleotide sequence comprising a portion of the cDNA insert in clone bsh1.pk0011.e4 encoding NMP1.
SEQ ID NO:37 is the deduced amino acid sequence of a barley NMP1 derived from the nucleotide sequence of SEQ ID NO:36.
SEQ ID NO:38 is a primer used for the PCR amplification of the NMP1 open reading frame from the plasmid pAD 6-6.
SEQ ID NO:39 is a primer used for the PCR amplification of the NMP1 open reading frame from the plasmid pAD 6-6.
Applicants made the following biological deposits under the terms of the Budapest Treaty on the International Recognition of the Deposit of Micro-organisms for the Purposes of Patent Procedure:
As used herein, “ATCC” refers to the American Type Culture Collection international depository located at 10801 University Boulevard, Manassas, Va., 20110-2209, U.S.A. The “ATCC No.” is the accession number to cultures on deposit with the ATCC.