The present invention relates to genes encoding enzymes involved in the biosynthesis of biodegradable thermoplastics known as polyhydroxyalkanoates.
The production of intracellular polyesters belonging to the class of polymers known as polyhydroxyalkanoates (PHAs) has been observed in a wide array of prokaryotic organisms. These bacterial polyesters are biodegradable and are an attractive source of nonpolluting plastics and elastomers. The monomers of the polyesters range in length from C4 to C12. PHAs are broadly characterized according to the monomers that constitute their backbone.
PHA synthase genes have been characterized from about 30 bacteria. The genes can be divided into two classes based upon the substrate specificity towards 3-hydroxyalkanoate-CoA. Class I accepts short-chain-length (SCL) 3-hydroxyalkanoate-CoA from about C4 to about C6 and class II accepts medium-chain-length (MCL) 3-hydroxyalkanoate-CoA from about C6 to about C14. Only a few exceptions exist. For example, a PHA synthase from Thiocapsa pfennigii can produce PHA from C4 to C8 (Liebergesell et al. 1993; WO 96/08566) and a PHA synthase from Pseudomonas sp. 61-3 can synthesize PHA from C4 to C12 (Matsusaki et al. (1998) J. Bacteriol. 180:6459-6467).
Lee et al. ((1995) Appl. Microbiol. Biotechnol. 42:901) compared PHA production of several Pseudomonas strains. Pseudomonas fluorescens strain GK13 and Pseudomonas sp. A33 showed an unusual poly(3HBcoX) composition pattern. With 1,3-butanediol as a carbon source, this strain produced PHA with a composition of 15.1 mol % 3HB (3-hydroxybutyric acid), 3.5 mol % 3HHx (3-hydroxyhexanoate), 15.7 mol % 3HO (3-hydroxyoctanoate) and 65.7 mol % 3HD (3-hydroxydecanoate). P. fluorescens strain GK13 and Pseudomonas sp. A33 showed identical hybridization patterns when restricted DNA was hybridized with a labeled oligonucleotide probe highly specific for PHA synthases. A 12.5-kbp genomic EcoRI fragment from Pseudomonas sp. A33 conferred the ability for poly(3HBcoX) synthesis to a PHA-negative mutant of Ralstonia eutropha. With gluconate as a carbon source, the transformed Ralstonia strain produced PHA with a composition of 89.9 mol % 3HB and 10.1 mol % 3HD. Based upon the similarities of strains A33 and GK13 concerning PHA synthesis and hybridization pattern, a 12.5-kbp genomic EcoRI fragment from strain GK13 most probably encodes a PHA synthase that is able to synthesize poly(3HBcoX). The only other example of a poly(3HBcoX)-synthesizing PHA synthase was reported by Matsusaki et al. ((1998) J. Bacteriol. 180:6459-6467).
The polymerization of the hydroxyacyl-CoA substrates is carried out by PHA synthases. The substrate specificity of this class of enzymes varies across the spectrum of PHA producing organisms. The variation in substrate specificity of PHA synthases is supported by indirect evidence observed in heterologous expression studies (Lee et al. (1995) Appl. Microbiol. Biotechnol. 42:901 and Timm et al. (1990) Appl. Microbiol. Biotech. 33:296). Hence, the structure of the backbone of the polymer is strongly influenced by the PHA synthase responsible for its formation.
Compositions and methods for the production of PHA in plants and host cells are provided. Particularly, isolated nucleotide molecules comprising nucleotide sequences encoding PHA synthases with broad substrate specificity are disclosed from Pseudomonas fluorescens strain GK13 (DSM7139). Additionally provided are isolated polypeptides comprising the amino acid sequences of such PHA synthases. The nucleotide molecules of the invention can be used to produce, in plants and other organisms, poly(3HBcoX), where X has an acyl chain length of greater than or equal to C8. The PHA synthases of the invention can be targeted to the peroxisomes in plants by operably linking peroxisomal targeting sequences to the nucleotide sequences encoding the PHA synthases. In this manner, the invention provides for the production of PHA copolymers in plant peroxisomes. The nucleotide sequences of the invention can be used in combination with other sequences for the production of novel biodegradable polyesters in plants.
Transformed host cells, plants, plant tissues, plant cells and seeds are provided.
Compositions and methods for the production of biodegradable polyesters in plants and other organisms are provided. In particular, isolated nucleotide molecules comprising nucleotide sequences for PHA synthase genes, particularly, phaC1 and phaC2 from Pseudomonas fluorescens GK13, are provided (SEQ ID NOs: 1 and 3, respectively). The sequences find use in plants and other organisms for the production of PHA, particularly PHA copolymers, more particularly poly(3HBcoX). By "poly(3HBcoX)" is intended a PHA copolymer comprised of 3-hydroxybutyrate (3HB) and any other hydroxyalkanoate, designated herein as X.
The nucleotide sequences of the invention can be used in combination with other sequences including, but not limited to, nucleotide sequences encoding β-ketothiolase, acetoacetyl-CoA reductase, the R-specific enoyl-CoA hydratase domain of the yeast multifunctional protein (MFP), enoyl-CoA hydratase, and 3-hydroxyacyl-ACP CoA-transferase (phaG). The sequences can be provided with peroxisome-targeting sequences for targeting to the peroxisomes. Also provided are isolated polypeptides encoded by such nucleotide sequences (SEQ ID NOs: 2 and 4).
Methods are provided for producing PHA in host cells. The methods involve transforming a host cell with a nucleotide molecule of the invention encoding a PHA synthase. Such host cells find use in the production of biodegradable thermoplastics. The methods additionally comprise growing the host cells for a sufficient length of time in conditions favorable for the production of PHA. The methods further involve extracting the PHA from the host cells or from the vicinity of the host cells, such as, for example, a culture broth or solid medium. Preferred host cells include plant cells, bacterial cells, yeast cells, cells of non-yeast fungi, algal cells and animal cells such as, for example, insect cells and nematode cells. The host cells of the invention can be single cells, colonies or clumps of cells, or cells within a multicellular structure or within an organism.
Methods for producing PHB in the cytosol or plastids of plants and for producing PHA in plant peroxisomes are known in the art. While the nucleotide sequences of the present invention can be used in such methods for producing PHA in plants, such methods are not known to achieve the synthesis of high levels of PHA in plants. In particular, the nucleotide sequences of the present invention find use in improved methods for producing PHA in plants, particularly in plant peroxisomes, as described in U.S. Provisional Application Serial No. 60/156,807 filed Sep. 29, 1999; herein incorporated by reference.
Methods for producing PHA in plants are provided. The methods involve genetically manipulating the genome of a plant to produce PHA. The invention encompasses plants and seeds thereof, that have been genetically manipulated to produce enzymes involved in PHA synthesis and expression cassettes containing coding sequences for such enzymes. The invention further encompasses genetically manipulated plant cells and plant tissues.
The methods for producing PHA in plants involve genetically manipulating the plant to produce at least one enzyme in the PHA biosynthetic pathway. The plants of the invention each comprise in their genomes at least one stably incorporated DNA construct, each DNA construct comprising a coding sequence for an enzyme involved in PHA synthesis operably linked to a promoter that drives the expression of a gene in a plant. Plants of the invention are genetically manipulated to produce a PHA synthase of the invention. Such PHA synthases can catalyze the synthesis of copolymers.
DNA constructs of the invention comprise a coding sequence for an enzyme involved in PHA synthesis. For expression in plants, the DNA construct further comprises an operably linked promoter that drives expression in a plant cell. Preferably, the promoters are selected from seed-preferred promoters, chemical-regulatable promoters, germination-preferred promoters and leaf-preferred promoters. If necessary for directing the encoded proteins to the peroxisome, the DNA construct can also include an operably linked peroxisome-targeting signal sequence.
It is recognized that for producing high levels of PHA copolymers in certain plants, particularly in their peroxisomes, it may be necessary to genetically manipulate plants to produce additional enzymes involved in PHA synthesis. Generally, the additional enzymes are directed to the peroxisome to increase the synthesis of at least one intermediate molecule. For example, such an intermediate molecule can be the substrate for a PHA synthase including, but not limited to, an R-(−)-3-hydroxyacyl-CoA. The methods of the invention comprise genetically modifying plants to produce, in addition to the PHA synthase described supra, one, two, three, four, five or more additional enzymes involved in PHA synthesis. In one embodiment of the invention, each DNA construct comprising the coding sequence of one of these additional enzymes is operably linked to both a promoter that drives expression in a plant and a nucleotide sequence encoding a peroxisome-targeting signal sequence. Depending on the plant, the addition of one or more of these enzymes may be necessary to achieve high-level PHA synthesis in the plant. The additional enzymes include, but are not limited to, an enzyme that catalyzes the synthesis of R-(−)-3-hydroxyacyl-CoA, a 3-ketoacyl-CoA reductase and an acetyl-CoA:acetyl transferase.
Additionally, the plant of the invention can comprise in its genome a DNA construct comprising a coding sequence for a second PHA synthase. Preferably, the second PHA synthase is capable of synthesizing PHB. Preferred second PHA synthases include those encoded by nucleotide sequences isolatable from Ralstonia eutropha (GenBank Accession No. J05003), Acinetobacter sp. (GenBank Accession No. U04848), Alcaligenes latus (GenBank Accession No. AF078795), Azorhizobium caulinodans (EMBL Accession No. AJ006237), Comamonas acidovorans (DDBJ Accession No. AB009237), Methylobacterium extorquens (GenBank Accession No. L07893), Paracoccus denitrificans (DDBJ Accession No. D43764) and Zoogloea ramigera (GenBank Accession No. U66242).
The methods of the invention additionally comprise growing the plant under conditions favorable for PHA production, harvesting the plant, or one or more parts thereof which contain PHA therein, and isolating the PHA from the plant or part thereof. Such parts include, but are not limited to, seeds, leaves, stems, roots, fruits and tubers. The PHA can be isolated or extracted from the plant or part thereof by methods known in the art. See, U.S. Pat. Nos. 5,942,597; 5,918,747; 5,899,339; 5,849,854 and 5,821,299; herein incorporated by reference. See also, EP 859858A1, WO 97/07229, WO 97/07230 and WO 97/15681; herein incorporated by reference.
Preferred 3-ketoacyl-CoA reductases of the invention are those that utilize NADH and include, but are not limited to, at least a portion of one of the multifunctional proteins from yeast (GenBank Accession No. M86456, SEQ ID NO: 9) and rat (GenBank Accession No. U37486, SEQ ID NO: 10) wherein such a portion comprises a 3-ketoacyl-CoA reductase domain. However, in the methods of the invention, NADPH-dependent 3-ketoacyl-CoA reductases can also be employed including, but not limited to, the 3-ketoacyl-CoA reductase encoded by GenBank Accession No. J04987 (SEQ ID NO: 11).
Acetyl-CoA:acetyl transferases of the invention include, but are not limited to a radish acetyl-CoA:acetyl transferase encoded by the nucleotide sequence having EMBL Accession No. X78116 (SEQ ID NO: 12).
If necessary to increase the level of NADPH in the peroxisome, the methods of the invention can additionally involve stably integrating into the genome of a plant a DNA construct comprising a nucleotide sequence encoding an NADH kinase or an NAD+ kinase and an operably linked promoter that drives expression in a plant cell. Such NADH and NAD+ kinases catalyze the synthesis of NADPH and NADP+, respectively. Nucleotide sequences encoding such kinases include, but are not limited to, DDBJ Accession No. E13102 (SEQ ID NO: 13) and EMBL Accession Nos. Z73544 (SEQ ID NO: 14) and X84260 (SEQ ID NO: 15). This construct can additionally comprise an operably linked peroxisome-targeting signal sequence. By targeting such NADH and NAD+ kinases to the peroxisome, the level of NADPH and NADP+ can be increased in the plant peroxisome for use by enzymes, such as, for example, an NADPH-dependent 3-ketoacyl-CoA reductase.
In one embodiment of the invention, acetyl-CoA from the β-oxidation pathway can be converted to 3-hydroxybutyryl-CoA by coexpression in a plant of a bacterial β-ketothiolase and acetoacetyl-CoA reductase (e.g. from Ralstonia eutropha). The precursor of the X component from poly(3HBcoX), R-3-hydroxyacyl-CoA, can be converted from 2-enoyl-CoA by expression of, for example, the R-specific enoyl-CoA hydratase domain of the yeast multifunctional protein (MFP) or a related protein from maize (see U.S. Provisional Application Serial No. 60/156,807 filed Sep. 29, 1999), or by expression of the enoyl-CoA hydratase from Aeromonas caviae (DDBJ Accession No. E15860, SEQ ID NO: 16). Alternatively, 3-hydroxybutyryl-CoA as well as 3-hydroxyacyl-CoA can be provided as precursor for poly(3HBcoX) synthesis in a plant by expression of 3-hydroxyacyl-ACP CoA-transferase (phaG, e.g. from Pseudomonas putida). The unusually broad substrate specificity of the Pseudomonas fluorescens strain GK13 PHA synthase allows synthesis of PHBcoX in peroxisomes.
The transformed plants and host cells of the invention produce PHA, preferably PHA copolymers, more preferably poly(3HBcoX), most preferably poly(3HBcoX) wherein X has an acyl chain length of greater than or equal to C8.
Compositions of the invention include isolated nucleotide molecules encoding PHA synthases that are involved in PHA synthesis. In particular, the present invention provides for isolated nucleic acid molecules comprising nucleotide sequences encoding the amino acid sequences shown in SEQ ID NOs: 2 and 4. Further provided are polypeptides having an amino acid sequence encoded by a nucleic acid molecule described herein, for example those set forth in SEQ ID NOs: 1 and 3, and fragments and variants thereof.
The invention encompasses isolated or substantially purified nucleic acid or protein compositions. An "isolated" or "purified" nucleic acid molecule or protein, or biologically active portion thereof, is substantially free of other cellular material, or culture medium when produced by recombinant techniques, or substantially free of chemical precursors or other chemicals when chemically synthesized. Preferably, an "isolated" nucleic acid is free of sequences (preferably protein encoding sequences) that naturally flank the nucleic acid (i.e., sequences located at the 5′ and 3′ ends of the nucleic acid) in the genomic DNA of the organism from which the nucleic acid is derived. For example, in various embodiments, the isolated nucleic acid molecule can contain less than about 5 kb, 4 kb, 3 kb, 2 kb, 1 kb, 0.5 kb, or 0.1 kb of nucleotide sequences that naturally flank the nucleic acid molecule in genomic DNA of the cell from which the nucleic acid is derived. A protein that is substantially free of cellular material includes preparations of protein having less than about 30%, 20%, 10%, or 5% (by dry weight) of contaminating protein. When the protein of the invention or biologically active portion thereof is recombinantly produced, preferably culture medium represents less than about 30%, 20%, 10%, or 5% (by dry weight) of chemical precursors or non-protein-of-interest chemicals.
Fragments and variants of the disclosed nucleotide sequences and proteins encoded thereby are also encompassed by the present invention. By "fragment" is intended a portion of the nucleotide sequence or a portion of the amino acid sequence and hence protein encoded thereby. Fragments of a nucleotide sequence may encode protein fragments that retain the biological activity of the native protein. Alternatively, fragments of a nucleotide sequence that are useful as hybridization probes generally do not encode fragment proteins retaining biological activity. Thus, fragments of a nucleotide sequence may range from at least about 20 nucleotides, about 50 nucleotides, about 100 nucleotides, and up to the full-length nucleotide sequence encoding the proteins of the invention.
A fragment of a PHA synthase nucleotide sequence that encodes a biologically active portion of a PHA synthase protein of the invention will encode at least 15, 25, 30, 50, 100, 150, 200, 250, 300, 350, 400, 450, 500 or 550 contiguous amino acids, or up to the total number of amino acids present in a full-length PHA synthase protein of the invention (for example, 559 and 560 amino acids for SEQ ID NOs: 2 and 4, respectively). Fragments of a PHA synthase nucleotide sequence that are useful as hybridization probes or PCR primers generally need not encode a biologically active portion of a PHA synthase.
Thus, a fragment of a PHA synthase nucleotide sequence may encode a biologically active portion of a PHA synthase, or it may be a fragment that can be used as a hybridization probe or PCR primer using methods disclosed below. A biologically active portion of a PHA synthase can be prepared by isolating a portion of one of the PHA synthase nucleotide sequences of the invention, expressing the encoded portion of the PHA synthase (e.g., by recombinant expression in vitro), and assessing the activity of the encoded portion of the PHA synthase. Nucleic acid molecules that are fragments of a PHA synthase nucleotide sequence comprise at least 16, 20, 50, 75, 100, 150, 200, 250, 300, 350, 400, 450, 500, 550, 600, 650, 700, 800, 900, 1,000, 1,100, 1,200, 1,300, 1,400, 1,500 or 1,600 nucleotides, or up to the number of nucleotides present in a full-length PHA synthase nucleotide sequence disclosed herein (for example, 1680 and 1683 nucleotides for SEQ ID NOs: 1 and 3, respectively).
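The nucleotide and amino acid lengths quoted above can be cross-checked with simple codon arithmetic. The following Python sketch is illustrative only; it assumes that each full-length nucleotide sequence is an open reading frame whose length includes one terminal stop codon, which the text does not state explicitly. The function names are hypothetical.

```python
def encoded_residues(cds_length_nt, includes_stop=True):
    """Number of amino acids encoded by a coding sequence of the given length.

    Assumption: the sequence is a complete open reading frame read in frame.
    """
    if cds_length_nt % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    codons = cds_length_nt // 3
    # The stop codon occupies three nucleotides but adds no residue.
    return codons - 1 if includes_stop else codons


def min_fragment_nt(residues):
    """Minimum number of nucleotides needed to encode a peptide of this length."""
    return 3 * residues


for seq_id, length_nt in (("SEQ ID NO: 1", 1680), ("SEQ ID NO: 3", 1683)):
    print(seq_id, length_nt, "nt encodes", encoded_residues(length_nt), "amino acids")
```

Under the stated assumption, the 1680- and 1683-nucleotide sequences correspond to proteins of 559 and 560 amino acids, in agreement with the lengths given for SEQ ID NOs: 2 and 4.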
By "variants" is intended substantially similar sequences. For nucleotide sequences, conservative variants include those sequences that, because of the degeneracy of the genetic code, encode the amino acid sequence of one of the PHA synthase polypeptides of the invention. Naturally occurring allelic variants such as these can be identified with the use of well-known molecular biology techniques, as, for example, with polymerase chain reaction (PCR) and hybridization techniques as outlined below. Variant nucleotide sequences also include synthetically derived nucleotide sequences, such as those generated, for example, by using site-directed mutagenesis but which still encode a PHA synthase protein of the invention. Generally, variants of a particular nucleotide sequence of the invention will have at least about 40%, 50%, 60%, 65%, 70%, generally at least about 75%, 80%, 85%, preferably at least about 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, and more preferably at least about 98%, 99% or more sequence identity to that particular nucleotide sequence as determined by sequence alignment programs described elsewhere herein using default parameters.
By "variant" protein is intended a protein derived from the native protein by deletion (so-called truncation) or addition of one or more amino acids to the N-terminal and/or C-terminal end of the native protein; deletion or addition of one or more amino acids at one or more sites in the native protein; or substitution of one or more amino acids at one or more sites in the native protein. Variant proteins encompassed by the present invention are biologically active, that is they continue to possess the desired biological activity of the native protein. Such variants may result from, for example, genetic polymorphism or from human manipulation. Biologically active variants of a native PHA synthase protein of the invention will have at least 40%, 50%, 60%, 65%, 70%, generally at least about 75%, 80%, 85%, preferably at least about 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, and more preferably at least about 98%, 99% or more sequence identity to the amino acid sequence for the native protein as determined by sequence alignment programs described elsewhere herein using default parameters. A biologically active variant of a protein of the invention may differ from that protein by as few as 1-15 amino acid residues, as few as 1-10, such as 6-10, as few as 5, as few as 4, 3, 2, or even 1 amino acid residue.
The proteins of the invention may be altered in various ways including amino acid substitutions, deletions, truncations, and insertions. Methods for such manipulations are generally known in the art. For example, amino acid sequence variants of PHA synthase proteins can be prepared by mutations in the DNA. Methods for mutagenesis and nucleotide sequence alterations are well known in the art. See, for example, Kunkel (1985) Proc. Natl. Acad. Sci. USA 82:488-492; Kunkel et al. (1987) Methods in Enzymol. 154:367-382; U.S. Pat. No. 4,873,192; Walker and Gaastra, eds. (1983) Techniques in Molecular Biology (MacMillan Publishing Company, New York) and the references cited therein. Guidance as to appropriate amino acid substitutions that do not affect biological activity of the protein of interest may be found in the model of Dayhoff et al. (1978) Atlas of Protein Sequence and Structure (Natl. Biomed. Res. Found., Washington, D.C.), herein incorporated by reference. Conservative substitutions, such as exchanging one amino acid with another having similar properties, may be preferred.
Thus, the genes and nucleotide sequences of the invention include both the naturally occurring sequences as well as mutant forms. Likewise, the proteins of the invention encompass both naturally occurring proteins as well as variations and modified forms thereof. Such variants will continue to possess the desired activity. Obviously, the mutations that will be made in the DNA encoding the variant must not place the sequence out of reading frame and preferably will not create complementary regions that could produce secondary mRNA structure. See, EP Patent Application Publication No. 75,444.
The deletions, insertions, and substitutions of the protein sequences encompassed herein are not expected to produce radical changes in the characteristics of the protein. However, when it is difficult to predict the exact effect of the substitution, deletion, or insertion in advance of doing so, one skilled in the art will appreciate that the effect will be evaluated by routine screening assays. That is, the activity can be evaluated by PHA synthase activity assays. See, for example, Schubert et al. (1988) J. Bacteriol. 170:5837-5847, and Valentin and Steinbuechel (1994) Appl. Microbiol. Biotechnol. 40:699-709; herein incorporated by reference.
Variant nucleotide sequences and proteins also encompass sequences and proteins derived from a mutagenic and recombinogenic procedure such as DNA shuffling. With such a procedure, one or more different PHA synthase coding sequences can be manipulated to create a new PHA synthase possessing the desired properties. In this manner, libraries of recombinant polynucleotides are generated from a population of related sequence polynucleotides comprising sequence regions that have substantial sequence identity and can be homologously recombined in vitro or in vivo. For example, using this approach, sequence motifs encoding a domain of interest may be shuffled between the PHA synthase gene of the invention and other known PHA synthase genes to obtain a new gene coding for a protein with an improved property of interest, such as an increased Km in the case of an enzyme. Strategies for such DNA shuffling are known in the art. See, for example, Stemmer (1994) Proc. Nat. Acad. Sci. USA 91:10747-10751; Stemmer (1994) Nature 370:389-391; Crameri et al. (1997) Nature Biotech. 15:436-438; Moore et al. (1997) J. Mol. Biol. 272:336-347; Zhang et al. (1997) Proc. Natl. Acad. Sci. USA 94:4504-4509; Crameri et al. (1998) Nature 391:288-291; and U.S. Pat. Nos. 5,605,793 and 5,837,458.
The nucleotide sequences of the invention can be used to isolate corresponding sequences from other organisms, particularly other bacteria. In this manner, methods such as PCR, hybridization, and the like can be used to identify such sequences based on their sequence homology to the sequences set forth herein. Sequences isolated based on their sequence identity to the entire PHA synthase sequences set forth herein or to fragments thereof are encompassed by the present invention. Such sequences include sequences that are orthologs of the disclosed sequences. By "orthologs" is intended genes derived from a common ancestral gene and which are found in different species as a result of speciation. Genes found in different species are considered orthologs when their nucleotide sequences and/or their encoded protein sequences share substantial identity as defined elsewhere herein. Functions of orthologs are often highly conserved among species.
In a PCR approach, oligonucleotide primers can be designed for use in PCR reactions to amplify corresponding DNA sequences from cDNA or genomic DNA extracted from any organism of interest. Methods for designing PCR primers and PCR cloning are generally known in the art and are disclosed in Sambrook et al. (1989) Molecular Cloning: A Laboratory Manual (2d ed., Cold Spring Harbor Laboratory Press, Plainview, N.Y.). See also Innis et al., eds. (1990) PCR Protocols: A Guide to Methods and Applications (Academic Press, New York); Innis and Gelfand, eds. (1995) PCR Strategies (Academic Press, New York); and Innis and Gelfand, eds. (1999) PCR Methods Manual (Academic Press, New York). Known methods of PCR include, but are not limited to, methods using paired primers, nested primers, single specific primers, degenerate primers, gene-specific primers, vector-specific primers, partially-mismatched primers, and the like.
In hybridization techniques, all or part of a known nucleotide sequence is used as a probe that selectively hybridizes to other corresponding nucleotide sequences present in a population of cloned genomic DNA fragments or cDNA fragments (i.e., genomic or cDNA libraries) from a chosen organism. The hybridization probes may be genomic DNA fragments, cDNA fragments, RNA fragments, or other oligonucleotides, and may be labeled with a detectable group such as 32P, or any other detectable marker. Thus, for example, probes for hybridization can be made by labeling synthetic oligonucleotides based on the PHA synthase sequences of the invention. Methods for preparation of probes for hybridization and for construction of cDNA and genomic libraries are generally known in the art and are disclosed in Sambrook et al. (1989) Molecular Cloning: A Laboratory Manual (2d ed., Cold Spring Harbor Laboratory Press, Plainview, N.Y.).
For example, an entire PHA synthase sequence disclosed herein, or one or more portions thereof, may be used as a probe capable of specifically hybridizing to corresponding PHA synthase sequences and messenger RNAs. To achieve specific hybridization under a variety of conditions, such probes include sequences that are unique among PHA synthase sequences and are preferably at least about 10 nucleotides in length, and most preferably at least about 20 nucleotides in length. Such probes may be used to amplify corresponding PHA synthase sequences from a chosen organism by PCR. This technique may be used to isolate additional coding sequences from a desired organism or as a diagnostic assay to determine the presence of coding sequences in an organism. Hybridization techniques include hybridization screening of plated DNA libraries (either plaques or colonies; see, for example, Sambrook et al. (1989) Molecular Cloning: A Laboratory Manual (2d ed., Cold Spring Harbor Laboratory Press, Plainview, N.Y.)).
Hybridization of such sequences may be carried out under stringent conditions. By "stringent conditions" or "stringent hybridization conditions" is intended conditions under which a probe will hybridize to its target sequence to a detectably greater degree than to other sequences (e.g., at least 2-fold over background). Stringent conditions are sequence-dependent and will be different in different circumstances. By controlling the stringency of the hybridization and/or washing conditions, target sequences that are 100% complementary to the probe can be identified (homologous probing). Alternatively, stringency conditions can be adjusted to allow some mismatching in sequences so that lower degrees of similarity are detected (heterologous probing). Generally, a probe is less than about 1000 nucleotides in length, preferably less than 500 nucleotides in length.
Typically, stringent conditions will be those in which the salt concentration is less than about 1.5 M Na ion, typically about 0.01 to 1.0 M Na ion concentration (or other salts) at pH 7.0 to 8.3 and the temperature is at least about 30° C. for short probes (e.g., 10 to 50 nucleotides) and at least about 60° C. for long probes (e.g., greater than 50 nucleotides). Stringent conditions may also be achieved with the addition of destabilizing agents such as formamide. Exemplary low stringency conditions include hybridization with a buffer solution of 30 to 35% formamide, 1 M NaCl, 1% SDS (sodium dodecyl sulphate) at 37° C., and a wash in 1× to 2×SSC (20×SSC=3.0 M NaCl/0.3 M trisodium citrate) at 50 to 55° C. Exemplary moderate stringency conditions include hybridization in 40 to 45% formamide, 1.0 M NaCl, 1% SDS at 37° C., and a wash in 0.5× to 1×SSC at 55 to 60° C. Exemplary high stringency conditions include hybridization in 50% formamide, 1 M NaCl, 1% SDS at 37° C., and a wash in 0.1×SSC at 60 to 65° C. The duration of hybridization is generally less than about 24 hours, usually about 4 to about 12 hours.
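The exemplary wash conditions above can be tabulated and the SSC dilutions converted to molar concentrations. The following Python sketch is offered as an illustration only; the dictionary layout and function names are hypothetical, and the sodium ion calculation assumes full dissociation, with one Na+ per NaCl and three Na+ per trisodium citrate.

```python
import math

# Composition of the 20x SSC stock as given in the text (mol/L).
SSC_20X = {"NaCl": 3.0, "trisodium citrate": 0.3}


def ssc_molarity(fold):
    """Solute molarities of an SSC solution diluted to the given fold strength."""
    return {solute: conc * fold / 20.0 for solute, conc in SSC_20X.items()}


def sodium_molarity(fold):
    """Total Na+ molarity (assumes complete dissociation of both salts)."""
    solutes = ssc_molarity(fold)
    return solutes["NaCl"] + 3.0 * solutes["trisodium citrate"]


# Exemplary conditions from the text, stored as (minimum, maximum) ranges.
STRINGENCY = {
    "low": {"formamide_pct": (30, 35), "wash_ssc_fold": (1.0, 2.0), "wash_temp_c": (50, 55)},
    "moderate": {"formamide_pct": (40, 45), "wash_ssc_fold": (0.5, 1.0), "wash_temp_c": (55, 60)},
    "high": {"formamide_pct": (50, 50), "wash_ssc_fold": (0.1, 0.1), "wash_temp_c": (60, 65)},
}

for level, cond in STRINGENCY.items():
    low_fold, high_fold = cond["wash_ssc_fold"]
    print(level, "wash Na+ range (M):",
          round(sodium_molarity(low_fold), 4), "to", round(sodium_molarity(high_fold), 4))
```

Under these assumptions a 1×SSC wash contains about 0.195 M Na+, which falls within the typical 0.01 to 1.0 M range stated above.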
Specificity is typically the function of post-hybridization washes, the critical factors being the ionic strength and temperature of the final wash solution. For DNA-DNA hybrids, the Tm can be approximated from the equation of Meinkoth and Wahl (1984) Anal. Biochem. 138:267-284: Tm = 81.5° C. + 16.6 (log M) + 0.41 (%GC) − 0.61 (% form) − 500/L; where M is the molarity of monovalent cations, %GC is the percentage of guanosine and cytosine nucleotides in the DNA, % form is the percentage of formamide in the hybridization solution, and L is the length of the hybrid in base pairs. The Tm is the temperature (under defined ionic strength and pH) at which 50% of a complementary target sequence hybridizes to a perfectly matched probe. Tm is reduced by about 1° C. for each 1% of mismatching; thus, Tm, hybridization, and/or wash conditions can be adjusted to hybridize to sequences of the desired identity. For example, if sequences with ≥90% identity are sought, the Tm can be decreased 10° C. Generally, stringent conditions are selected to be about 5° C. lower than the thermal melting point (Tm) for the specific sequence and its complement at a defined ionic strength and pH. However, severely stringent conditions can utilize a hybridization and/or wash at 1, 2, 3, or 4° C. lower than the thermal melting point (Tm); moderately stringent conditions can utilize a hybridization and/or wash at 6, 7, 8, 9, or 10° C. lower than the thermal melting point (Tm); low stringency conditions can utilize a hybridization and/or wash at 11, 12, 13, 14, 15, or 20° C. lower than the thermal melting point (Tm). Using the equation, hybridization and wash compositions, and desired Tm, those of ordinary skill will understand that variations in the stringency of hybridization and/or wash solutions are inherently described. If the desired degree of mismatching results in a Tm of less than 45° C. (aqueous solution) or 32° C. (formamide solution), it is preferred to increase the SSC concentration so that a higher temperature can be used. An extensive guide to the hybridization of nucleic acids is found in Tijssen (1993) Laboratory Techniques in Biochemistry and Molecular Biology: Hybridization with Nucleic Acid Probes, Part I, Chapter 2 (Elsevier, N.Y.); and Ausubel et al., eds. (1995) Current Protocols in Molecular Biology, Chapter 2 (Greene Publishing and Wiley-Interscience, New York). See Sambrook et al. (1989) Molecular Cloning: A Laboratory Manual (2d ed., Cold Spring Harbor Laboratory Press, Plainview, N.Y.).
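By way of illustration only, the Tm approximation and the 1° C. per 1% mismatch adjustment described above can be sketched in Python. The function and parameter names are hypothetical and are not part of the disclosure, and "log M" is assumed here to be the base-10 logarithm.

```python
import math


def melting_temperature(na_molar, percent_gc, percent_formamide, hybrid_length_bp):
    """Approximate Tm in degrees C for a DNA-DNA hybrid.

    Implements Tm = 81.5 + 16.6 (log M) + 0.41 (%GC) - 0.61 (% form) - 500/L,
    assuming log is base 10 (an assumption of this sketch).
    """
    if na_molar <= 0 or hybrid_length_bp <= 0:
        raise ValueError("molarity and hybrid length must be positive")
    return (81.5
            + 16.6 * math.log10(na_molar)
            + 0.41 * percent_gc
            - 0.61 * percent_formamide
            - 500.0 / hybrid_length_bp)


def adjusted_tm(tm, percent_identity):
    """Lower Tm by about 1 degree C for each 1% of mismatching."""
    return tm - (100.0 - percent_identity)


def wash_temperature(tm, degrees_below_tm=5.0):
    """Temperature a chosen number of degrees below Tm (about 5 is typical)."""
    return tm - degrees_below_tm
```

For example, a 500 bp hybrid of 50% GC content in 1 M monovalent cation and 50% formamide gives a Tm of about 70.5° C. under this approximation; if sequences of at least 90% identity are sought, the adjusted value is about 10° C. lower.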
Thus, isolated sequences that encode for a PHA synthase gene and which hybridize under stringent conditions to the PHA synthase sequences disclosed herein, or to fragments thereof, are encompassed by the present invention. Such sequences will be at least about 75% to 80% homologous, about 80% or 90% homologous, and even at least about 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or more homologous with the disclosed sequences. That is, the sequence identity of sequences may range, sharing at least 75% to 80%, about 85% to 90%, and even at least about 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or more sequence identity.
The following terms are used to describe the sequence relationships between two or more nucleic acids or polynucleotides: (a) “reference sequence”, (b) “comparison window”, (c) “sequence identity”, (d) “percentage of sequence identity”, and (e) “substantial identity”.
(a) As used herein, “reference sequence” is a defined sequence used as a basis for sequence comparison. A reference sequence may be a subset or the entirety of a specified sequence; for example, as a segment of a full-length cDNA or gene sequence, or the complete cDNA or gene sequence.
(b) As used herein, “comparison window” makes reference to a contiguous and specified segment of a polynucleotide sequence, wherein the polynucleotide sequence in the comparison window may comprise additions or deletions (i.e., gaps) compared to the reference sequence (which does not comprise additions or deletions) for optimal alignment of the two sequences. Generally, the comparison window is at least 20 contiguous nucleotides in length, and optionally can be 30, 40, 50, 100, or longer. Those of skill in the art understand that to avoid a high similarity to a reference sequence due to inclusion of gaps in the polynucleotide sequence a gap penalty is typically introduced and is subtracted from the number of matches.
Methods of alignment of sequences for comparison are well known in the art. Thus, the determination of percent identity between any two sequences can be accomplished using a mathematical algorithm. Non-limiting examples of such mathematical algorithms are the algorithm of Myers and Miller (1988) CABIOS 4:11-17; the local homology algorithm of Smith et al. (1981) Adv. Appl. Math. 2:482; the homology alignment algorithm of Needleman and Wunsch (1970) J. Mol. Biol. 48:443-453; the search-for-similarity method of Pearson and Lipman (1988) Proc. Natl. Acad. Sci. 85:2444-2448; and the algorithm of Karlin and Altschul (1990) Proc. Natl. Acad. Sci. USA 87:2264, modified as in Karlin and Altschul (1993) Proc. Natl. Acad. Sci. USA 90:5873-5877.
Computer implementations of these mathematical algorithms can be utilized for comparison of sequences to determine sequence identity. Such implementations include, but are not limited to: CLUSTAL in the PC/Gene program (available from Intelligenetics, Mountain View, Calif.); the ALIGN program (Version 2.0) and GAP, BESTFIT, BLAST, FASTA, and TFASTA in the Wisconsin Genetics Software Package, Version 8 (available from Genetics Computer Group (GCG), 575 Science Drive, Madison, Wis. USA). Alignments using these programs can be performed using the default parameters. The CLUSTAL program is well described by Higgins et al. (1988) Gene 73:237-244; Higgins et al. (1989) CABIOS 5:151-153; Corpet et al. (1988) Nucleic Acids Res. 16:10881-90; Huang et al. (1992) CABIOS 8:155-65; and Pearson et al. (1994) Meth. Mol. Biol. 24:307-331. The ALIGN program is based on the algorithm of Myers and Miller (1988) supra. A PAM120 weight residue table, a gap length penalty of 12, and a gap penalty of 4 can be used with the ALIGN program when comparing amino acid sequences. The BLAST programs of Altschul et al. (1990) J. Mol. Biol. 215:403 are based on the algorithm of Karlin and Altschul (1990) supra. BLAST nucleotide searches can be performed with the BLASTN program, score=100, wordlength=12, to obtain nucleotide sequences homologous to a nucleotide sequence encoding a protein of the invention. BLAST protein searches can be performed with the BLASTX program, score=50, wordlength=3, to obtain amino acid sequences homologous to a protein or polypeptide of the invention. To obtain gapped alignments for comparison purposes, Gapped BLAST (in BLAST 2.0) can be utilized as described in Altschul et al. (1997) Nucleic Acids Res. 25:3389. Alternatively, PSI-BLAST (in BLAST 2.0) can be used to perform an iterated search that detects distant relationships between molecules. See Altschul et al. (1997) supra.
When utilizing BLAST, Gapped BLAST, or PSI-BLAST, the default parameters of the respective programs (e.g., BLASTN for nucleotide sequences, BLASTX for proteins) can be used. See http://www.ncbi.nlm.nih.gov. Alignment may also be performed manually by inspection.
Unless otherwise stated, sequence identity/similarity values provided herein refer to the value obtained for an entire sequence of the invention using GAP Version 10 using the following parameters: % identity using GAP Weight of 50 and Length Weight of 3; % similarity using GAP Weight of 12 and Length Weight of 4, or any equivalent program. By “equivalent program” is intended any sequence comparison program that, for any two sequences in question, generates an alignment having identical nucleotide or amino acid residue matches and an identical percent sequence identity when compared to the corresponding alignment generated by the preferred program.
GAP uses the algorithm of Needleman and Wunsch (1970) J. Mol. Biol. 48:443-453, to find the alignment of two complete sequences that maximizes the number of matches and minimizes the number of gaps. GAP considers all possible alignments and gap positions and creates the alignment with the largest number of matched bases and the fewest gaps. It allows for the provision of a gap creation penalty and a gap extension penalty in units of matched bases. GAP must make a profit of gap creation penalty number of matches for each gap it inserts. If a gap extension penalty greater than zero is chosen, GAP must, in addition, make a profit for each gap inserted of the length of the gap times the gap extension penalty. Default gap creation penalty values and gap extension penalty values in Version 10 of the Wisconsin Genetics Software Package for protein sequences are 8 and 2, respectively. For nucleotide sequences the default gap creation penalty is 50 while the default gap extension penalty is 3. The gap creation and gap extension penalties can be expressed as an integer selected from the group of integers consisting of from 0 to 200. Thus, for example, the gap creation and gap extension penalties can be 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65 or greater.
GAP presents one member of the family of best alignments. There may be many members of this family, but no other member has a better quality. GAP displays four figures of merit for alignments: Quality, Ratio, Identity, and Similarity. The Quality is the metric maximized in order to align the sequences. Ratio is the quality divided by the number of bases in the shorter segment. Percent Identity is the percent of the symbols that actually match. Percent Similarity is the percent of the symbols that are similar. Symbols that are across from gaps are ignored. A similarity is scored when the scoring matrix value for a pair of symbols is greater than or equal to 0.50, the similarity threshold. The scoring matrix used in Version 10 of the Wisconsin Genetics Software Package is BLOSUM62 (see Henikoff and Henikoff (1992) Proc. Natl. Acad. Sci. USA 89:10915).
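As a non-limiting sketch of how a gap creation penalty and a gap extension penalty enter a Needleman-Wunsch style global alignment, the following Python function computes only the optimal score (the "Quality") of two complete sequences. It is not the GCG implementation of GAP; the match and mismatch scores, the default penalty values, and all names are hypothetical choices made for illustration. Each gap of length k is charged the creation penalty plus k times the extension penalty, as described above.

```python
def global_alignment_score(a, b, match=1, mismatch=0, gap_create=2, gap_extend=1):
    """Best global alignment score of a and b with affine gap costs.

    A gap of length k costs gap_create + k * gap_extend (illustrative values).
    """
    neg_inf = float("-inf")
    n, m = len(a), len(b)
    # best[i][j]:  a[:i] vs b[:j], last column pairs a[i-1] with b[j-1]
    # gap_b[i][j]: last column pairs a[i-1] with a gap in b
    # gap_a[i][j]: last column pairs b[j-1] with a gap in a
    best = [[neg_inf] * (m + 1) for _ in range(n + 1)]
    gap_b = [[neg_inf] * (m + 1) for _ in range(n + 1)]
    gap_a = [[neg_inf] * (m + 1) for _ in range(n + 1)]
    best[0][0] = 0
    for i in range(1, n + 1):
        gap_b[i][0] = -(gap_create + i * gap_extend)
    for j in range(1, m + 1):
        gap_a[0][j] = -(gap_create + j * gap_extend)
    open_cost = gap_create + gap_extend  # cost of the first position of a new gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            s = match if a[i - 1] == b[j - 1] else mismatch
            best[i][j] = s + max(best[i - 1][j - 1],
                                 gap_b[i - 1][j - 1],
                                 gap_a[i - 1][j - 1])
            gap_b[i][j] = max(best[i - 1][j] - open_cost,
                              gap_b[i - 1][j] - gap_extend,
                              gap_a[i - 1][j] - open_cost)
            gap_a[i][j] = max(best[i][j - 1] - open_cost,
                              gap_a[i][j - 1] - gap_extend,
                              gap_b[i][j - 1] - open_cost)
    return max(best[n][m], gap_b[n][m], gap_a[n][m])
```

With these illustrative values, aligning ACGT against AGT requires one gap of length one, so the three matches are offset by a gap cost of three. A production comparison would instead use a scoring matrix such as BLOSUM62 for proteins and the default penalties of the program in question.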
(c) As used herein, “sequence identity” or “identity” in the context of two nucleic acid or polypeptide sequences makes reference to the residues in the two sequences that are the same when aligned for maximum correspondence over a specified comparison window. When percentage of sequence identity is used in reference to proteins it is recognized that residue positions which are not identical often differ by conservative amino acid substitutions, where amino acid residues are substituted for other amino acid residues with similar chemical properties (e.g., charge or hydrophobicity) and therefore do not change the functional properties of the molecule. When sequences differ in conservative substitutions, the percent sequence identity may be adjusted upwards to correct for the conservative nature of the substitution. Sequences that differ by such conservative substitutions are said to have “sequence similarity” or “similarity”. Means for making this adjustment are well known to those of skill in the art. Typically this involves scoring a conservative substitution as a partial rather than a full mismatch, thereby increasing the percentage sequence identity. Thus, for example, where an identical amino acid is given a score of 1 and a non-conservative substitution is given a score of zero, a conservative substitution is given a score between zero and 1. The scoring of conservative substitutions is calculated, e.g., as implemented in the program PC/GENE (Intelligenetics, Mountain View, Calif.).
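The partial-credit scoring of conservative substitutions described above may be sketched as follows. This is not the PC/GENE calculation; the grouping of amino acids, the partial score of 0.5, and the function names are hypothetical assumptions chosen only to illustrate a score between zero and 1.

```python
# Hypothetical grouping of residues with similar side-chain chemistry.
CONSERVATIVE_GROUPS = [
    set("ILVM"),  # aliphatic / hydrophobic
    set("FYW"),   # aromatic
    set("KR"),    # basic
    set("DE"),    # acidic
    set("ST"),    # hydroxyl
    set("NQ"),    # amide
]


def is_conservative(x, y):
    """True when two different residues fall in the same illustrative group."""
    return any(x in group and y in group for group in CONSERVATIVE_GROUPS)


def percent_similarity(aligned_a, aligned_b, partial_credit=0.5, gap="-"):
    """Score identical pairs 1, conservative pairs partial_credit, others 0."""
    if len(aligned_a) != len(aligned_b) or not aligned_a:
        raise ValueError("aligned sequences must be non-empty and equal in length")
    score = 0.0
    for x, y in zip(aligned_a, aligned_b):
        if x == gap or y == gap:
            continue  # positions across from gaps earn no credit
        if x == y:
            score += 1.0
        elif is_conservative(x, y):
            score += partial_credit
    return 100.0 * score / len(aligned_a)
```

Under these assumptions, the aligned peptides LKDS and LRDA share two identical positions, one conservative substitution (K to R), and one non-conservative substitution (S to A).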
(d) As used herein, “percentage of sequence identity” means the value determined by comparing two optimally aligned sequences over a comparison window, wherein the portion of the polynucleotide sequence in the comparison window may comprise additions or deletions (i.e., gaps) as compared to the reference sequence (which does not comprise additions or deletions) for optimal alignment of the two sequences. The percentage is calculated by determining the number of positions at which the identical nucleic acid base or amino acid residue occurs in both sequences to yield the number of matched positions, dividing the number of matched positions by the total number of positions in the window of comparison, and multiplying the result by 100 to yield the percentage of sequence identity.
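The calculation just described can be illustrated with a short Python sketch. It assumes, for illustration only, that the two sequences have already been optimally aligned and that gaps are written as hyphens; the function name is hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Matched positions divided by total window positions, times 100."""
    if len(aligned_a) != len(aligned_b) or not aligned_a:
        raise ValueError("aligned sequences must be non-empty and equal in length")
    matched = sum(
        1 for x, y in zip(aligned_a, aligned_b)
        if x == y and x != gap
    )
    # Gapped positions remain in the denominator: they are positions
    # in the window of comparison that do not match.
    return 100.0 * matched / len(aligned_a)
```

For instance, a ten-position window in which eight positions carry the identical base, one position is a gap, and one is a mismatch yields 80% sequence identity.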
(e)(i) The term “substantial identity” of polynucleotide sequences means that a polynucleotide comprises a sequence that has at least 70% sequence identity, preferably at least 80%, more preferably at least 90%, and most preferably at least 95%, compared to a reference sequence using one of the alignment programs described using standard parameters. One of skill in the art will recognize that these values can be appropriately adjusted to determine corresponding identity of proteins encoded by two nucleotide sequences by taking into account codon degeneracy, amino acid similarity, reading frame positioning, and the like. Substantial identity of amino acid sequences for these purposes normally means sequence identity of at least 60%, more preferably at least 70%, 80%, 90%, and most preferably at least 95%.
Another indication that nucleotide sequences are substantially identical is if two molecules hybridize to each other under stringent conditions. Generally, stringent conditions are selected to be about 5° C. lower than the thermal melting point (Tm) for the specific sequence at a defined ionic strength and pH. However, stringent conditions encompass temperatures in the range of about 1° C. to about 20° C. lower than the Tm, depending upon the desired degree of stringency as otherwise qualified herein. Nucleic acids that do not hybridize to each other under stringent conditions are still substantially identical if the polypeptides they encode are substantially identical. This may occur, e.g., when a copy of a nucleic acid is created using the maximum codon degeneracy permitted by the genetic code. One indication that two nucleic acid sequences are substantially identical is when the polypeptide encoded by the first nucleic acid is immunologically cross reactive with the polypeptide encoded by the second nucleic acid.
(e)(ii) The term “substantial identity” in the context of a peptide indicates that a peptide comprises a sequence with at least 70% sequence identity to a reference sequence, preferably 80%, more preferably 85%, most preferably at least 90% or 95% sequence identity to the reference sequence over a specified comparison window. Preferably, optimal alignment is conducted using the homology alignment algorithm of Needleman and Wunsch (1970) J. Mol. Biol. 48:443-453. An indication that two peptide sequences are substantially identical is that one peptide is immunologically reactive with antibodies raised against the second peptide. Thus, a peptide is substantially identical to a second peptide, for example, where the two peptides differ only by a conservative substitution. Peptides that are “substantially similar” share sequences as noted above except that residue positions that are not identical may differ by conservative amino acid changes.
The use of the term “DNA constructs” herein is not intended to limit the present invention to nucleotide constructs comprising DNA. Those of ordinary skill in the art will recognize that nucleotide constructs, particularly polynucleotides and oligonucleotides, comprised of ribonucleotides and combinations of ribonucleotides and deoxyribonucleotides may also be employed in the methods disclosed herein. Thus, the DNA constructs of the present invention encompass all nucleotide constructs that can be employed in the methods of the present invention including, but not limited to, those comprised of deoxyribonucleotides, ribonucleotides, and combinations thereof. Such deoxyribonucleotides and ribonucleotides include both naturally occurring molecules and synthetic analogues. The nucleotide constructs of the invention also encompass all forms of nucleotide constructs including, but not limited to, single-stranded forms, double-stranded forms, hairpins, stem-and-loop structures, and the like.
The PHA synthase sequences of the invention are provided in expression cassettes for expression in the plant of interest. The cassette will include 5′ and 3′ regulatory sequences operably linked to a PHA synthase sequence of the invention. By “operably linked” is intended a functional linkage between a promoter and a second sequence, wherein the promoter sequence initiates and mediates transcription of the DNA sequence corresponding to the second sequence. Generally, operably linked means that the nucleic acid sequences being linked are contiguous and, where necessary to join two protein coding regions, contiguous and in the same reading frame. In the case of protein coding sequences, “operably linked” includes joining two protein coding sequences in such a manner that both sequences are in the same reading frame for translation. For example, a nucleotide sequence encoding a peroxisome-targeting signal may be joined to the 3′ end of a coding sequence of a protein of the invention in such a manner that both sequences are in the same reading frame for translation to yield the protein of the invention with a C-terminal addition of the peroxisome-targeting signal.
The cassette may additionally contain at least one additional gene to be cotransformed into the organism. Alternatively, the additional gene(s) can be provided on multiple expression cassettes.
Such an expression cassette is provided with a plurality of restriction sites for insertion of a PHA synthase sequence to be under the transcriptional regulation of the regulatory regions. The expression cassette may additionally contain selectable marker genes.
The expression cassette will include in the 5′-3′ direction of transcription, a transcriptional and translational initiation region, a DNA sequence of the invention, and a transcriptional and translational termination region functional in plants. The transcriptional initiation region, the promoter, may be native or analogous or foreign or heterologous to the plant host. Additionally, the promoter may be the natural sequence or alternatively a synthetic sequence. By “foreign” is intended that the transcriptional initiation region is not found in the native plant into which the transcriptional initiation region is introduced. As used herein, a chimeric gene comprises a coding sequence operably linked to a transcription initiation region that is heterologous to the coding sequence.
The termination region may be native with the transcriptional initiation region, may be native with the operably linked DNA sequence of interest, or may be derived from another source. Convenient termination regions are available from the Ti-plasmid of A. tumefaciens, such as the octopine synthase and nopaline synthase termination regions. See also Guerineau et al. (1991) Mol. Gen. Genet. 262:141-144; Proudfoot (1991) Cell 64:671-674; Sanfacon et al. (1991) Genes Dev. 5:141-149; Mogen et al. (1990) Plant Cell 2:1261-1272; Munroe et al. (1990) Gene 91:151-158; Ballas et al. (1989) Nucleic Acids Res. 17:7891-7903; and Joshi et al. (1987) Nucleic Acid Res. 15:9627-9639.
Where appropriate, the gene(s) may be optimized for increased expression in the transformed plant. That is, the genes can be synthesized using plant-preferred codons for improved expression. See, for example, Campbell and Gowri (1990) Plant Physiol. 92:1-11 for a discussion of host-preferred codon usage. Methods are available in the art for synthesizing plant-preferred genes. See, for example, U.S. Pat. Nos. 5,380,831, and 5,436,391, and Murray et al. (1989) Nucleic Acids Res. 17:477-498, herein incorporated by reference.
Additional sequence modifications are known to enhance gene expression in a cellular host. These include elimination of sequences encoding spurious polyadenylation signals, exon-intron splice site signals, transposon-like repeats, and other such well-characterized sequences that may be deleterious to gene expression. The G-C content of the sequence may be adjusted to levels average for a given cellular host, as calculated by reference to known genes expressed in the host cell. When possible, the sequence is modified to avoid predicted hairpin secondary mRNA structures.
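As a minimal sketch of the kind of screening that may precede such modifications, the following Python functions compute the G-C content of a coding sequence and locate occurrences of a motif of interest. The names are hypothetical, and the default motif AATAAA is used here only as an illustrative example of a commonly cited polyadenylation signal hexamer; the disclosure does not prescribe any particular motif or tool.

```python
def gc_content(seq):
    """Percentage of G and C nucleotides in a sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


def find_motif(seq, motif="AATAAA"):
    """Zero-based start positions of every (possibly overlapping) motif hit."""
    seq, motif = seq.upper(), motif.upper()
    positions = []
    start = seq.find(motif)
    while start != -1:
        positions.append(start)
        start = seq.find(motif, start + 1)
    return positions
```

The G-C percentage obtained for a candidate sequence could then be compared with the average calculated from known genes expressed in the host cell, and any motif hits within the coding region could be removed by synonymous codon changes.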
The expression cassettes may additionally contain 5′ leader sequences in the expression cassette construct. Such leader sequences can act to enhance translation. Translation leaders are known in the art and include: picornavirus leaders, for example, EMCV leader (Encephalomyocarditis 5′ noncoding region) (Elroy-Stein et al. (1989) Proc. Natl. Acad. Sci. USA 86:6126-6130); potyvirus leaders, for example, TEV leader (Tobacco Etch Virus) (Gallie et al. (1995) Gene 165(2):233-238), MDMV leader (Maize Dwarf Mosaic Virus) (Virology 154:9-20), and human immunoglobulin heavy-chain binding protein (BiP) (Macejak et al. (1991) Nature 353:90-94); untranslated leader from the coat protein mRNA of alfalfa mosaic virus (AMV RNA 4) (Jobling et al. (1987) Nature 325:622-625); tobacco mosaic virus leader (TMV) (Gallie et al. (1989) in Molecular Biology of RNA, ed. Cech (Liss, New York), pp. 237-256); and maize chlorotic mottle virus leader (MCMV) (Lommel et al. (1991) Virology 81:382-385). See also, Della-Cioppa et al. (1987) Plant Physiol. 84:965-968. Other methods known to enhance translation can also be utilized, for example, introns, and the like.
In preparing the expression cassette, the various DNA fragments may be manipulated, so as to provide for the DNA sequences in the proper orientation and, as appropriate, in the proper reading frame. Toward this end, adapters or linkers may be employed to join the DNA fragments or other manipulations may be involved to provide for convenient restriction sites, removal of superfluous DNA, removal of restriction sites, or the like. For this purpose, in vitro mutagenesis, primer repair, restriction, annealing, resubstitutions, e.g., transitions and transversions, may be involved.
A number of promoters can be used in the practice of the invention. The promoters may be selected based on the desired timing, localization and level of expression of genes encoding enzymes in a plant. Constitutive, seed-preferred, germination-preferred, tissue-preferred and chemical-regulatable promoters can be used in the practice of the invention. Such constitutive promoters include, for example, the core promoter of the Rsyn7 promoter and other constitutive promoters disclosed in WO 99/43838 and U.S. Pat. No. 6,072,050; the core CaMV 35S promoter (Odell et al. (1985) Nature 313:810-812); rice actin (McElroy et al. (1990) Plant Cell 2:163-171); ubiquitin (Christensen et al. (1989) Plant Mol. Biol. 12:619-632 and Christensen et al. (1992) Plant Mol. Biol. 18:675-689); pEMU (Last et al. (1991) Theor. Appl. Genet. 81:581-588); MAS (Velten et al. (1984) EMBO J. 3:2723-2730); ALS promoter (U.S. Pat. No. 5,659,026), and the like. Other constitutive promoters include, for example, U.S. Pat. Nos. 5,608,149; 5,608,144; 5,604,121; 5,569,597; 5,466,785; 5,399,680; 5,268,463; and 5,608,142.
The methods of the invention are useful for producing PHA in seeds. Toward this end, the coding sequences for the enzymes of the invention may be utilized in expression cassettes or DNA constructs with seed-preferred promoters, seed-development promoters (those promoters active during seed development), as well as seed-germination promoters (those promoters active during seed germination). Such seed-preferred promoters include, but are not limited to, Cim1 (cytokinin-induced message); cZ19B1 (maize 19 kDa zein); and celA (cellulose synthase) (see the copending application entitled “Seed-Preferred Promoters,” U.S. application Ser. No. 09/377,648, filed Aug. 19, 1999, herein incorporated by reference). For dicots, particular promoters include those from the following genes: phaseolin, napin, β-conglycinin, soybean lectin, and the like. For monocots, particular promoters include those from the following genes: maize 15 kD zein, 22 kD zein, 27 kD zein, waxy, shrunken 1, shrunken 2, and globulin 1.
For tissue-preferred expression, the coding sequences of the invention can be operably linked to tissue-preferred promoters. For example, leaf-preferred promoters may be utilized if expression in leaves is desired. Leaf-preferred promoters are known in the art. See, for example, Yamamoto et al. (1997) Plant J. 12(2):255-265; Kwon et al. (1994) Plant Physiol. 105:357-67; Yamamoto et al. (1994) Plant Cell Physiol. 35(5):773-778; Gotor et al. (1993) Plant J. 3:509-18; Orozco et al. (1993) Plant Mol. Biol. 23(6):1129-1138; and Matsuoka et al. (1993) Proc. Natl. Acad. Sci. USA 90(20):9586-9590.
Other tissue-preferred promoters include, for example, Kawamata et al. (1997) Plant Cell Physiol. 38(7):792-803; Hansen et al. (1997) Mol Gen Genet. 254(3):337-343; Russell et al. (1997) Transgenic Res. 6(2):157-168; Rinehart et al. (1996) Plant Physiol. 112(3):1331-1341; Van Camp et al. (1996) Plant Physiol. 112(2):525-535; Canevascini et al. (1996) Plant Physiol. 112(2):513-524; Lam (1994) Results Probl Cell Differ. 20:181-196; Orozco et al. (1993) Plant Mol Biol. 23(6):1129-1138; and Guevara-Garcia et al. (1993) Plant J. 4(3):495-505.
In the practice of the invention, it may be desirable to use chemical-regulatable promoters to control the expression of a gene in a plant. Depending upon the objective, the promoter may be a chemical-inducible promoter, where application of the chemical induces gene expression, or a chemical-repressible promoter, where application of the chemical represses gene expression. Chemical-inducible promoters are known in the art and include, but are not limited to, the maize In2-2 promoter, which is activated by benzenesulfonamide herbicide safeners, the maize GST promoter, which is activated by hydrophobic electrophilic compounds that are used as pre-emergent herbicides, and the tobacco PR-1a promoter, which is activated by salicylic acid. Other chemical-regulatable promoters of interest include steroid-responsive promoters (see, for example, the glucocorticoid-inducible promoter in Schena et al. (1991) Proc. Natl. Acad. Sci. USA 88:10421-10425 and McNellis et al. (1998) Plant J. 14(2):247-257) and tetracycline-inducible and tetracycline-repressible promoters (see, for example, Gatz et al. (1991) Mol. Gen. Genet. 227:229-237, and U.S. Pat. Nos. 5,814,618 and 5,789,156), herein incorporated by reference.
In embodiments of the invention, it may be necessary to direct a PHA synthase to the peroxisomes of a plant. Thus, the expression cassette may additionally comprise a nucleotide sequence encoding a peroxisome-targeting signal. Methods for directing an enzyme to the peroxisome are well known in the art. Typically, such methods involve operably linking a nucleotide sequence encoding a peroxisome-targeting signal to the coding sequence of a protein or modifying the coding sequence to additionally encode the peroxisome-targeting signal without substantially affecting the intended function of the encoded protein. See, for example, Olsen et al. (1993) Plant Cell 5:941-952; Mullen et al. (1997) Plant Physiol. 115:881-889; Gould et al. (1990) EMBO J. 9:85-90; Flynn et al. (1998) Plant J. 16:709-720; Preisig-Muller and Kindl (1993) Plant Mol. Biol. 22:59-66; and Kato et al. (1996) Plant Cell 8:1601-1611; herein incorporated by reference.
It is recognized that a PHA synthase of the invention may be directed to the peroxisome by operably linking a peroxisome-targeting signal to the C-terminus or the N-terminus of the enzyme. It is further recognized that an enzyme which is synthesized with a peroxisome-targeting signal may be processed proteolytically in vivo resulting in the removal of the peroxisome-targeting signal from the amino acid sequence of the mature, peroxisome-localized enzyme.
It is further recognized that the components of the expression cassette may be modified to increase expression. For example, truncated sequences, nucleotide substitutions or other modifications may be employed. See, for example, Perlak et al. (1991) Proc. Natl. Acad. Sci. USA 88:3324-3328; Murray et al. (1989) Nucleic Acids Res. 17:477-498; and WO 91/16432.
Transformation protocols as well as protocols for introducing nucleotide sequences into plants may vary depending on the type of plant or plant cell, i.e., monocot or dicot, targeted for transformation. Suitable methods of introducing nucleotide sequences into plant cells and subsequent insertion into the plant genome include microinjection (Crossway et al. (1986) Biotechniques 4:320-334), electroporation (Riggs et al. (1986) Proc. Natl. Acad. Sci. USA 83:5602-5606), Agrobacterium-mediated transformation (Townsend et al., U.S. Pat. No. 5,563,055; Zhao et al., U.S. Pat. No. 5,981,840), direct gene transfer (Paszkowski et al. (1984) EMBO J. 3:2717-2722), and ballistic particle acceleration (see, for example, Sanford et al., U.S. Pat. No. 4,945,050; Tomes et al., U.S. Pat. No. 5,879,918; Tomes et al., U.S. Pat. No. 5,886,244; Bidney et al., U.S. Pat. No. 5,932,782; Tomes et al. (1995) “Direct DNA Transfer into Intact Plant Cells via Microprojectile Bombardment,” in Plant Cell, Tissue, and Organ Culture: Fundamental Methods, ed. Gamborg and Phillips (Springer-Verlag, Berlin); and McCabe et al. (1988) Biotechnology 6:923-926). Also see Weissinger et al. (1988) Ann. Rev. Genet. 22:421-477; Sanford et al. (1987) Particulate Science and Technology 5:27-37 (onion); Christou et al. (1988) Plant Physiol. 87:671-674 (soybean); McCabe et al. (1988) Bio/Technology 6:923-926 (soybean); Finer and McMullen (1991) In Vitro Cell Dev. Biol. 27P:175-182 (soybean); Singh et al. (1998) Theor. Appl. Genet. 96:319-324 (soybean); Datta et al. (1990) Biotechnology 8:736-740 (rice); Klein et al. (1988) Proc. Natl. Acad. Sci. USA 85:4305-4309 (maize); Klein et al. (1988) Biotechnology 6:559-563 (maize); Tomes, U.S. Pat. No. 5,240,855; Buising et al., U.S. Pat. Nos. 5,322,783 and 5,324,646; Tomes et al. (1995) “Direct DNA Transfer into Intact Plant Cells via Microprojectile Bombardment,” in Plant Cell, Tissue, and Organ Culture: Fundamental Methods, ed. Gamborg (Springer-Verlag, Berlin) (maize); Klein et al. (1988) Plant Physiol. 91:440-444 (maize); Fromm et al. (1990) Biotechnology 8:833-839 (maize); Hooykaas-Van Slogteren et al. (1984) Nature (London) 311:763-764; Bowen et al., U.S. Pat. No. 5,736,369 (cereals); Bytebier et al. (1987) Proc. Natl. Acad. Sci. USA 84:5345-5349 (Liliaceae); De Wet et al. (1985) in The Experimental Manipulation of Ovule Tissues, ed. Chapman et al. (Longman, New York), pp. 197-209 (pollen); Kaeppler et al. (1990) Plant Cell Reports 9:415-418 and Kaeppler et al. (1992) Theor. Appl. Genet. 84:560-566 (whisker-mediated transformation); D'Halluin et al. (1992) Plant Cell 4:1495-1505 (electroporation); Li et al. (1993) Plant Cell Reports 12:250-255 and Christou and Ford (1995) Annals of Botany 75:407-413 (rice); Ishida et al. (1996) Nature Biotechnology 14:745-750 (maize via Agrobacterium tumefaciens); all of which are herein incorporated by reference.
The cells that have been transformed may be grown into plants in accordance with conventional ways. See, for example, McCormick et al. (1986) Plant Cell Reports 5:81-84. These plants may then be grown, and either pollinated with the same transformed strain or different strains, and the resulting hybrid having constitutive expression of the desired phenotypic characteristic identified. Two or more generations may be grown to ensure that expression of the desired phenotypic characteristic is stably maintained and inherited and then seeds harvested to ensure expression of the desired phenotypic characteristic has been achieved.
In the invention, plants genetically manipulated to produce PHA are utilized. By “genetically manipulated” is intended modifying the genome of an organism, preferably a plant, including cells and tissue thereof, by any means known to those skilled in the art. Modifications to a genome include both losses and additions of genetic material as well as any sorts of rearrangements in the organization of the genome. Such modifications can be accomplished by, for example, transforming a plant's genome with a DNA construct containing nucleotide sequences which are native to the recipient plant, non-native or a combination of both, conducting a directed sexual mating or cross pollination within a single species or between related species, fusing or transferring nuclei, inducing mutagenesis and the like.
In the practice of certain embodiments of the present invention, a plant is genetically manipulated to produce more than one heterologous enzyme involved in PHA synthesis. Those of ordinary skill in the art realize that this can be accomplished in any one of a number of ways. For example, each of the respective coding sequences for such enzymes can be operably linked to a promoter and then joined together in a single continuous fragment of DNA comprising a multigenic expression cassette. Such a multigenic expression cassette can be used to transform a plant to produce the desired outcome. Alternatively, separate plants can be transformed with expression cassettes containing one or a subset of the desired set of coding sequences. Transformed plants that express the desired activity can be selected by standard methods available in the art such as, for example, assaying enzyme activities, immunoblotting using antibodies which bind to the enzymes of interest, assaying for the products of a reporter or marker gene, and the like. Then, all of the desired coding sequences can be brought together into a single plant through one or more rounds of cross pollination utilizing the previously selected transformed plants as parents.
Methods for cross pollinating plants are well known to those skilled in the art. Cross pollination is generally accomplished by allowing the pollen of one plant, the pollen donor, to pollinate a flower of a second plant, the pollen recipient, and then allowing the fertilized eggs in the pollinated flower to mature into seeds. Progeny containing the entire complement of heterologous coding sequences of the two parental plants can be selected from all of the progeny by standard methods available in the art as described supra for selecting transformed plants. If necessary, the selected progeny can be used as either the pollen donor or pollen recipient in a subsequent cross pollination.
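By way of illustration only, the screening effort implied by the crossing scheme described above can be estimated with a short calculation. The following Python sketch is not part of the disclosed methods. It assumes, purely for the example, that each heterologous coding sequence is present at a single hemizygous locus, that the loci are unlinked, and that each is transmitted to a given progeny plant with probability 1/2; real transformants may carry multiple or linked insertions. The function names are hypothetical.

```python
import math
from fractions import Fraction


def fraction_with_all_transgenes(n_donor: int, n_recipient: int) -> Fraction:
    """Expected fraction of progeny inheriting every transgene of both parents.

    Assumption (not from the specification): each transgene is a single
    hemizygous, unlinked locus transmitted with probability 1/2.
    """
    if n_donor < 0 or n_recipient < 0:
        raise ValueError("transgene counts must be non-negative")
    return Fraction(1, 2) ** (n_donor + n_recipient)


def progeny_to_screen(p_success: Fraction, confidence: float = 0.95) -> int:
    """Smallest number of progeny giving at least `confidence` probability
    that one or more of them carries the full complement of transgenes."""
    if not 0 < confidence < 1:
        raise ValueError("confidence must lie strictly between 0 and 1")
    if p_success <= 0:
        raise ValueError("p_success must be positive")
    if p_success >= 1:
        return 1
    # Solve (1 - p)^n <= 1 - confidence for the smallest integer n.
    return math.ceil(math.log(1 - confidence) / math.log(1 - float(p_success)))


if __name__ == "__main__":
    # One transgene in the pollen donor, one in the pollen recipient.
    p = fraction_with_all_transgenes(1, 1)
    print(p, progeny_to_screen(p))
```

Under these assumptions, a cross between two parents each carrying one hemizygous transgene is expected to yield one quarter of progeny with both sequences, which is why selection of progeny after each round of cross pollination is needed.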
The present invention may be used for transformation of any plant species, including, but not limited to, monocots and dicots. Examples of plants of interest include, but are not limited to, corn (Zea mays), Brassica sp. (e.g., B. napus, B. rapa, B. juncea), particularly those Brassica species useful as sources of seed oil, alfalfa (Medicago sativa), rice (Oryza sativa), rye (Secale cereale), sorghum (Sorghum bicolor, Sorghum vulgare), millet (e.g., pearl millet (Pennisetum glaucum), proso millet (Panicum miliaceum), foxtail millet (Setaria italica), finger millet (Eleusine coracana)), sunflower (Helianthus annuus), safflower (Carthamus tinctorius), wheat (Triticum aestivum), soybean (Glycine max), tobacco (Nicotiana tabacum), potato (Solanum tuberosum), peanuts (Arachis hypogaea), cotton (Gossypium barbadense, Gossypium hirsutum), sweet potato (Ipomoea batatas), cassava (Manihot esculenta), coffee (Coffea spp.), coconut (Cocos nucifera), pineapple (Ananas comosus), citrus trees (Citrus spp.), cocoa (Theobroma cacao), tea (Camellia sinensis), banana (Musa spp.), avocado (Persea americana), fig (Ficus carica), guava (Psidium guajava), mango (Mangifera indica), olive (Olea europaea), papaya (Carica papaya), cashew (Anacardium occidentale), macadamia (Macadamia integrifolia), almond (Prunus amygdalus), sugar beets (Beta vulgaris), sugarcane (Saccharum spp.), oats, barley, vegetables, ornamentals, and conifers. Preferred plants are oilseed plants which include, but are not limited to, corn, Brassica sp., sunflower, safflower, soybean, peanut, cotton, flax, coconut and oil palm.
Additionally, the PHA synthase nucleotide sequences of the invention can be used in methods for producing PHA in host organisms other than plants, including but not limited to bacteria, yeasts and other fungi. Useful host organisms for PHA production include Actinomycetes (e.g., Streptomyces sp. and Nocardia sp.); bacteria (e.g., Alcaligenes (e.g., A. eutrophus), Bacillus cereus, B. subtilis, B. licheniformis, B. megaterium, Escherichia coli, Klebsiella (e.g., K. aerogenes and K. oxytoca), Lactobacillus, Methylomonas, and Pseudomonas (e.g., P. putida and P. fluorescens)); fungi (e.g., Aspergillus, Cephalosporium, and Penicillium); and yeast (e.g., Saccharomyces, Rhodotorula, Candida, Hansenula, and Pichia).