This invention relates to enzymes involved in very long chain fatty acid (VLCFA) synthesis, and more particularly to chimeras and mutants of nucleic acid sequences encoding fatty acid elongase 3-ketoacyl CoA synthase polypeptides.
Plant seeds accumulate primarily 16- and 18-carbon fatty acids (FA). Plants also synthesize very long chain fatty acids (VLCFA). VLCFAs are saturated or unsaturated monocarboxylic acids with an unbranched even-numbered carbon chain that is greater than 18 carbons in length. Very long chain fatty acids are key components of many biologically important compounds in animals, plants, and microorganisms. For example, in animals, the VLCFA arachidonic acid is a precursor to many prostaglandins. In plants, VLCFAs are major constituents of triacylglycerols in many seed oils, are essential precursors for cuticular wax production, and are utilized in the synthesis of glycosylceramides, a component of the plasma membrane. Important VLCFAs include arachidic acid (C20:0; i.e., a 20 carbon chain with no double bonds), behenic acid (C22:0), erucic acid (C22:1), and lignoceric acid (C24:0).
VLCFAs are not desirable in edible oils. Oilseeds of the Cruciferae (e.g., rapeseed) and a few other plants, however, accumulate C20 and C22 fatty acids. Although plant breeders have developed rapeseed lines that have low levels of VLCFAs for edible oil purposes, even lower levels would be desirable. On the other hand, vegetable oils having elevated levels of VLCFAs are desirable for certain industrial uses, including uses as lubricants, fuels and as a feedstock for plastics, pharmaceuticals and cosmetics.
The biosynthesis in plants of saturated fatty acids up to an 18-carbon chain occurs in the chloroplast. C2 units from acyl thioesters are linked sequentially, beginning with the condensation of acetyl coenzyme A (CoA) and malonyl-acyl carrier protein (malonyl-ACP) to form a C4 acyl fatty acid. This condensation reaction is catalyzed by 3-ketoacyl synthase III (KASIII). The enzyme 3-ketoacyl synthase I (KASI) catalyzes the stepwise condensation of a fatty acyl moiety (C4 to C14) with C2 groups from malonyl-ACP to produce a 3-ketoacyl-ACP product that is 2 carbons longer than the original substrate (C6 to C16). The last condensation reaction in the chloroplast, converting C16 to C18, is catalyzed by 3-ketoacyl synthase II (KASII). 3-ketoacyl moieties are also referred to as β-ketoacyl moieties.
Each elongation cycle involves three enzymatic steps in addition to the condensation reaction discussed above. Briefly, the 3-ketoacyl condensation product is reduced to 3-hydroxyacyl-ACP, dehydrated to the enoyl-ACP, and reduced to an acyl-ACP. The fully reduced fatty acyl-ACP reaction product then serves as the substrate for the next cycle of elongation.
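The chloroplast elongation scheme described above can be illustrated with a minimal Python sketch. It only tracks carbon chain length and the condensing enzyme named in the text for each cycle; the function and variable names are hypothetical and introduced solely for illustration, and no enzyme kinetics are modeled.

```python
# Illustrative bookkeeping only: chain length in carbons and the condensing
# enzyme assigned to each cycle by the description above.

# The four reactions of one elongation cycle, in order.
CYCLE_STEPS = (
    "condensation (3-ketoacyl-ACP)",
    "reduction (3-hydroxyacyl-ACP)",
    "dehydration (enoyl-ACP)",
    "reduction (acyl-ACP)",
)


def condensing_enzyme(substrate_carbons: int) -> str:
    """Return the synthase that condenses a substrate of the given length."""
    if substrate_carbons == 2:
        return "KASIII"  # acetyl CoA + malonyl-ACP -> C4
    if 4 <= substrate_carbons <= 14 and substrate_carbons % 2 == 0:
        return "KASI"    # C4..C14 -> C6..C16
    if substrate_carbons == 16:
        return "KASII"   # C16 -> C18
    raise ValueError(f"no chloroplast condensation listed for C{substrate_carbons}")


def chloroplast_elongation() -> list:
    """List (substrate carbons, product carbons, enzyme) for each cycle up to C18."""
    cycles = []
    carbons = 2
    while carbons < 18:
        cycles.append((carbons, carbons + 2, condensing_enzyme(carbons)))
        carbons += 2  # each cycle adds one C2 unit
    return cycles


if __name__ == "__main__":
    for substrate, product, enzyme in chloroplast_elongation():
        print(f"C{substrate} -> C{product} via {enzyme}")
```

Each tuple returned by `chloroplast_elongation()` stands for one full pass through the four reactions in `CYCLE_STEPS`.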
The C18:0 saturated fatty acid (stearic acid) can be desaturated to produce a C18:1 fatty acid (oleic acid), which can be transported out of the chloroplast and converted to a C18:2 fatty acid (linoleic acid) or a C18:3 fatty acid (α-linolenic acid). Stearic acid and oleic acid can also be elongated outside the chloroplast to form VLCFAs. The formation of fatty acids longer than 18 carbons depends on the activity of a fatty acid elongase complex to carry out four reactions similar to those described above for fatty acid synthesis in the chloroplast. The initial reaction is catalyzed by an elongase 3-ketoacyl CoA synthase (elongase KCS) and involves the condensation of a two carbon group from malonyl CoA with a C18:0 or C18:1 fatty acyl CoA substrate. A gene encoding an elongase KCS from Arabidopsis thaliana has been identified and designated FAE1. See, e.g., U.S. Pat. No. 6,124,524. The gene product catalyzes the condensation of oleoyl CoA and malonyl CoA, leading to the conversion of the C18 substrate to a C20:1 product, eicosenoyl CoA. Mutations have been identified in the A. thaliana FAE1 gene (see WO 96/13582). A. thaliana plants carrying a mutation in FAE1 have significant decreases in the levels of VLCFAs in seeds.
Despite 85% sequence identity at the amino acid level between the Arabidopsis thaliana FAE1 polypeptide and the Brassica napus polypeptide of GenBank Accession No. AAB72178, the composition of the oil from A. thaliana and B. napus seeds suggests that the enzymes may have different substrate specificities and/or catalytic activity. VLCFAs constitute about 22% of the seed oil of A. thaliana, whereas VLCFAs constitute about 62% of the seed oil in rape. A. thaliana seed oil is primarily eicosenoic acid (about 18%), with a small amount of erucic acid and longer-chain monounsaturated fatty acids (about 2%). In contrast, rapeseed oil has a relatively small amount of eicosenoic acid (about 10%) and relatively larger amounts of erucic acid and longer-chain monounsaturates (about 52%).
The present invention provides novel polypeptides with altered elongase KCS substrate specificity and/or catalytic activity. One such novel polypeptide comprises three polypeptide segments. The amino-terminal first polypeptide segment has membrane-anchoring properties. It is joined to a second polypeptide segment whose amino acid sequence is residues 75-114 of SEQ ID NO:12 or residues 75-114 of SEQ ID NO:14, followed by a third polypeptide segment having at least 40% sequence identity to the C-terminal 392 amino acids of SEQ ID NO:4. Examples of such polypeptides have the amino acid sequences shown in SEQ ID NOS:12 and 14. The third polypeptide segment can have an aspartic acid residue at the position corresponding to amino acid 307 of SEQ ID NO:4. Examples of such polypeptides have the amino acid sequences shown in SEQ ID NOS:20, 22, 34 and 36.
Such polypeptides can catalyze the condensation of a C18 fatty acyl substrate and malonyl CoA, leading to the synthesis of a C20 fatty acyl CoA. The fatty acid substrate can be oleic acid (C18:1), in which case the product formed is eicosenoic acid (C20:1). In some instances, the fatty acid substrate is stearic acid (C18:0) and the product formed therefrom is arachidic acid (C20:0). Such polypeptides often can further catalyze the condensation of malonyl CoA and a C20 fatty acyl substrate, leading to the synthesis of a C22 fatty acyl CoA. The substrate often is eicosenoic acid (C20:1) and the product is erucic acid (C22:1). The ratio of the C22 fatty acid product to the C20 fatty acid product (C22:1/C20:1) resulting from the activity of such polypeptides can be about 0.20 or greater, about 0.30 or greater, about 0.40 or greater, or about 0.50 or greater as measured in a yeast microsome assay.
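The C22:1/C20:1 product ratio described above can be sketched as a simple calculation. The following Python snippet is a minimal illustration, assuming the two product amounts have already been measured in the same units; the function names and the example amounts are hypothetical, and the word "about" in the stated thresholds is not modeled.

```python
# Thresholds taken from the text: ratio of C22:1 product to C20:1 product.
THRESHOLDS = (0.20, 0.30, 0.40, 0.50)


def product_ratio(c22_1_amount: float, c20_1_amount: float) -> float:
    """Return C22:1/C20:1 for two amounts expressed in the same units."""
    if c20_1_amount <= 0:
        raise ValueError("C20:1 amount must be positive")
    if c22_1_amount < 0:
        raise ValueError("C22:1 amount must not be negative")
    return c22_1_amount / c20_1_amount


def thresholds_met(ratio: float) -> list:
    """Return every listed threshold that the ratio equals or exceeds."""
    return [t for t in THRESHOLDS if ratio >= t]


if __name__ == "__main__":
    # Hypothetical amounts, for illustration only (not data from the source).
    ratio = product_ratio(c22_1_amount=3.5, c20_1_amount=10.0)
    print(f"C22:1/C20:1 = {ratio:.2f}; thresholds met: {thresholds_met(ratio)}")
```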
The invention also features a polypeptide comprising in the amino-terminal to carboxy-terminal direction: a first polypeptide segment that has membrane-anchoring properties, joined to a second polypeptide segment that has residues 75-114 of SEQ ID NO:2, which is in turn joined to a third polypeptide segment that has at least 90% sequence identity to residues 115-506 of SEQ ID NO:4. An example of such a polypeptide has the amino acid sequence of SEQ ID NO:8. Also featured is a polypeptide comprising in the amino-terminal to carboxy-terminal direction: a first polypeptide segment having at least 80% sequence identity to residues 1-74 of SEQ ID NO:2, joined to a second polypeptide segment having residues 75-114 of SEQ ID NO:4, joined to a third polypeptide segment having at least 40% sequence identity to residues 115-506 of SEQ ID NO:4. An example of such a polypeptide has the amino acid sequence of SEQ ID NO:10. In some embodiments of these polypeptides, the third segment has an aspartic acid at the position corresponding to amino acid 307 of said SEQ ID NO:4. Examples of such polypeptides have the amino acid sequences of SEQ ID NO:16 and SEQ ID NO:18.
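The three-segment arrangement described above can be sketched in Python as string slicing with 1-based, inclusive residue numbering. This is a minimal illustration under stated assumptions: the sequences are placeholder strings rather than the actual SEQ ID NOs, the helper names are hypothetical, the default segment boundaries (74 and 114) are taken from the residue ranges recited above, and percent identity is computed as a plain position-by-position comparison of equal-length segments, which may differ from the alignment method actually used.

```python
def residues(seq: str, start: int, end: int) -> str:
    """Return residues start..end of seq, using 1-based inclusive numbering."""
    if not (1 <= start <= end <= len(seq)):
        raise ValueError(f"range {start}-{end} is outside a {len(seq)}-residue sequence")
    return seq[start - 1:end]


def assemble_chimera(first_src: str, second_src: str, third_src: str,
                     first_end: int = 74, second_end: int = 114) -> str:
    """Join residues 1..first_end, first_end+1..second_end, and the remainder."""
    first = residues(first_src, 1, first_end)
    second = residues(second_src, first_end + 1, second_end)
    third = residues(third_src, second_end + 1, len(third_src))
    return first + second + third


def percent_identity(a: str, b: str) -> float:
    """Ungapped identity between two equal-length segments (an assumption)."""
    if not a or len(a) != len(b):
        raise ValueError("segments must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


if __name__ == "__main__":
    # Placeholder 506-residue sequences standing in for two parent polypeptides.
    parent_a = "A" * 506
    parent_b = "B" * 506
    # First two segments from parent_a, third segment from parent_b.
    chimera = assemble_chimera(parent_a, parent_a, parent_b)
    third = residues(chimera, 115, 506)
    print(len(chimera), percent_identity(third, residues(parent_b, 115, 506)))
```

Keeping the residue arithmetic in one helper makes the 1-based ranges recited in the text map directly onto the code, with the off-by-one conversion handled in a single place.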
A plant is also disclosed, comprising at least one exogenous nucleic acid encoding one or more of the novel polypeptides disclosed herein, as well as seeds having such nucleic acids.
Nucleic acid constructs of the invention comprise at least one regulatory element operably linked to the nucleic acid coding sequence for a novel polypeptide. Host cells containing such nucleic acid constructs are disclosed. Such host cells include bacterial cells, fungal cells, insect cells, plant cells and animal cells.
A method of altering very long chain fatty acids in an organism is disclosed. The method comprises introducing an exogenous nucleic acid into the organism. The nucleic acid encodes one or more of the polypeptides described herein. The nucleic acid is expressed in the organism to produce the polypeptide(s), and the very long chain fatty acid content of the organism is increased compared to the very long chain fatty acid content of a corresponding organism that lacks the exogenous nucleic acid or does not express the exogenous nucleic acid. Suitable organisms include fungi (e.g., yeast), plants, animals, insects and bacteria. Such organisms can produce a higher level of erucic acid than a corresponding organism that lacks or does not express the exogenous nucleic acid.
Unless otherwise defined, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. For example, the one letter and three letter abbreviations for amino acids and the one-letter abbreviations for nucleotides are commonly understood. Although methods and materials similar or equivalent to those described herein can be used in the practice or testing of the present invention, suitable methods and materials are described below. In addition, the materials, methods and examples are illustrative only and not intended to be limiting. All publications, patent applications, patents, and other references mentioned herein are incorporated by reference in their entirety. In case of conflict, the present specification, including definitions, will control.
The details of one or more embodiments of the invention are set forth in the accompanying drawings and the description below. Other features, objects, and advantages of the invention will be apparent from the drawings and detailed description, and from the claims.
SEQ ID NO:1 is the nucleotide sequence of the Arabidopsis thaliana FAE1 gene (GenBank Accession No. U29142).
SEQ ID NO:2 is the amino acid sequence of the polypeptide encoded by SEQ ID NO:1 (GenBank Accession No. AAA70154).
SEQ ID NO:3 is the nucleotide sequence of a Brassica napus fatty acid elongase KCS (GenBank Accession No. AF009563).
SEQ ID NO:4 is the amino acid sequence of the B. napus polypeptide encoded by SEQ ID NO:3 (GenBank Accession No. AAB72178).
SEQ ID NO:5 is the nucleotide sequence of a B. napus fatty acid elongase KCS (GenBank Accession No. U50771).
SEQ ID NO:6 is the amino acid sequence of the B. napus polypeptide encoded by SEQ ID NO:5 (GenBank Accession No. AAA96054).
SEQ ID NO:7 is a nucleotide sequence encoding a polypeptide designated At114.
SEQ ID NO:8 is the amino acid sequence of the polypeptide encoded by SEQ ID NO:7.
SEQ ID NO:9 is a nucleotide sequence encoding a polypeptide designated At74.
SEQ ID NO:10 is the amino acid sequence of the polypeptide encoded by SEQ ID NO:9.
SEQ ID NO:11 is a nucleotide sequence encoding a polypeptide designated At114 L91C K92R.
SEQ ID NO:12 is the amino acid sequence of the polypeptide encoded by SEQ ID NO:11.
SEQ ID NO:13 is a nucleotide sequence encoding a polypeptide designated At114 K92R.
SEQ ID NO:14 is the amino acid sequence of the polypeptide encoded by SEQ ID NO:13.
SEQ ID NO:15 is a nucleotide sequence encoding a polypeptide designated At114 G307D.
SEQ ID NO:16 is the amino acid sequence of the polypeptide encoded by SEQ ID NO:15.
SEQ ID NO:17 is a nucleotide sequence encoding a polypeptide designated At74 G307D.
SEQ ID NO:18 is the amino acid sequence of the polypeptide encoded by SEQ ID NO:17.
SEQ ID NO:19 is a nucleotide sequence encoding a polypeptide designated At114 L91C K92R G307D.
SEQ ID NO:20 is the amino acid sequence of the polypeptide encoded by SEQ ID NO:19.
SEQ ID NO:21 is a nucleotide sequence encoding a polypeptide designated At114 K92R G307D.
SEQ ID NO:22 is the amino acid sequence of the polypeptide encoded by SEQ ID NO:21.
SEQ ID NO:23 is a nucleotide sequence encoding a polypeptide designated At254.
SEQ ID NO:24 is the amino acid sequence of the polypeptide encoded by SEQ ID NO:23.
SEQ ID NO:25 is a nucleotide sequence encoding a polypeptide designated At173.
SEQ ID NO:26 is the amino acid sequence of the polypeptide encoded by SEQ ID NO:25.
SEQ ID NO:27 is a nucleotide sequence encoding a polypeptide designated Bn176.
SEQ ID NO:28 is the amino acid sequence of the polypeptide encoded by SEQ ID NO:27.
SEQ ID NO:29 is a nucleotide sequence encoding a polypeptide designated At399.
SEQ ID NO:30 is the amino acid sequence of the polypeptide encoded by SEQ ID NO:29.
SEQ ID NO:31 is a nucleotide sequence encoding a polypeptide designated Bn399.
SEQ ID NO:32 is the amino acid sequence of the polypeptide encoded by SEQ ID NO:31.
SEQ ID NO:33 is a nucleotide sequence encoding a polypeptide designated Bn G307D.
SEQ ID NO:34 is the amino acid sequence of the polypeptide encoded by SEQ ID NO:33.
SEQ ID NO:35 is a nucleotide sequence encoding a polypeptide designated At K92R.
SEQ ID NO:36 is the amino acid sequence of the polypeptide encoded by SEQ ID NO:35.
SEQ ID NO:37 is a nucleotide sequence encoding a polypeptide designated At254 G307D.
SEQ ID NO:38 is the amino acid sequence of the polypeptide encoded by SEQ ID NO:37.
SEQ ID NO:39 is a nucleotide sequence encoding a polypeptide designated At173 G307D.
SEQ ID NO:40 is the amino acid sequence of the polypeptide encoded by SEQ ID NO:39.
SEQ ID NO:41 is a nucleotide sequence encoding a polypeptide designated Bn399 G307D.
SEQ ID NO:42 is the amino acid sequence of the polypeptide encoded by SEQ ID NO:41.
SEQ ID NO:43 is the 3′ chimera-specific primer used in the generation of At173.
SEQ ID NO:44 is the 5′ chimera-specific primer used in the generation of At173.
SEQ ID NO:45 is the 3′ chimera-specific primer used in the generation of At114.
SEQ ID NO:46 is the 5′ chimera-specific primer used in the generation of At114.
SEQ ID NO:47 is the 3′ chimera-specific primer used in the generation of At74.
SEQ ID NO:48 is the 5′ chimera-specific primer used in the generation of At74.
SEQ ID NO:49 is the 3′ chimera-specific primer used in the generation of At114 L91C K92R.
SEQ ID NO:50 is the 5′ chimera-specific primer used in the generation of At114 L91C K92R.
SEQ ID NO:51 is the 3′ chimera-specific primer used in the generation of At114 K92R.
SEQ ID NO:52 is the 5′ chimera-specific primer used in the generation of At114 K92R.
SEQ ID NO:53 is the 5′ universal primer used in the generation of At-Bn chimeras.
SEQ ID NO:54 is the 3′ universal primer used in the generation of At-Bn chimeras.
SEQ ID NO:55 is the 5′ universal primer used in the generation of Bn-At chimeras.
SEQ ID NO:56 is the 3′ universal primer used in the generation of Bn-At chimeras.
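The polypeptide designations in the sequence descriptions above combine a base name (e.g., At114) with zero or more amino acid substitutions written in the common one-letter form (e.g., L91C, meaning leucine at position 91 replaced by cysteine). The following Python sketch shows one way such a designation could be split into its parts and applied to a sequence. It is illustrative only: the parser, its names, and the toy sequence are hypothetical, and the sketch makes no claim about how the base names themselves were assigned.

```python
import re
from typing import List, NamedTuple, Tuple


class Substitution(NamedTuple):
    original: str     # one-letter code of the residue being replaced
    position: int     # 1-based residue position
    replacement: str  # one-letter code of the new residue


_SUBSTITUTION = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def parse_designation(name: str) -> Tuple[str, List[Substitution]]:
    """Split e.g. 'At114 L91C K92R' into ('At114', [L91C, K92R])."""
    tokens = name.split()
    if not tokens:
        raise ValueError("empty designation")
    substitutions = []
    for token in tokens[1:]:
        match = _SUBSTITUTION.match(token)
        if match is None:
            raise ValueError(f"not a substitution: {token!r}")
        substitutions.append(
            Substitution(match.group(1), int(match.group(2)), match.group(3))
        )
    return tokens[0], substitutions


def apply_substitutions(seq: str, substitutions: List[Substitution]) -> str:
    """Apply each substitution, checking the expected original residue first."""
    chars = list(seq)
    for sub in substitutions:
        if not 1 <= sub.position <= len(chars):
            raise ValueError(f"position {sub.position} is outside the sequence")
        if chars[sub.position - 1] != sub.original:
            raise ValueError(f"expected {sub.original} at position {sub.position}")
        chars[sub.position - 1] = sub.replacement
    return "".join(chars)


if __name__ == "__main__":
    base, subs = parse_designation("At114 L91C K92R G307D")
    print(base, [f"{s.original}{s.position}{s.replacement}" for s in subs])
```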