Soybean seeds contain from 35% to 55% protein on a dry weight basis. The majority of this protein is storage protein, which is hydrolyzed during germination to provide energy and metabolic intermediates needed by the developing seedling. The soybean seed's storage protein is an important nutritional source when harvested and utilized as a livestock feed. In addition, it is now generally recognized that soybeans are the most economical source of protein for human consumption. Soy protein or protein isolates are already used extensively for food products in different parts of the world. Much effort has been devoted to improving the quantity and quality of the storage protein in soybean seeds.
The seeds of most plant species contain what are known in the art as seed storage proteins. These have been classified on the basis of their size and solubility (Higgins, T. J. (1984) Ann. Rev. Plant Physiol. 35:191-221). While not every class is found in every species, the seeds of most plant species contain proteins from more than one class. Proteins within a particular solubility or size class are generally more structurally related to members of the same class in other species than to members of a different class within the same species. In many species, the seed proteins of a given class are often encoded by multigene families, sometimes of such complexity that the families can be divided into subclasses based on sequence homology.
There are two major soybean seed storage proteins: glycinin (also known as the 11S globulin) and β-conglycinin (also known as the 7S globulin). Together, they comprise 70 to 80% of the seed's total protein, or 25 to 35% of the seed's dry weight. Glycinin is a large protein with a molecular weight of about 360 kDa. It is a hexamer composed of various combinations of five major isoforms (commonly called subunits) identified as G1, G2, G3, G4 and G5. Each subunit is in turn composed of one acidic and one basic polypeptide held together by a disulfide bond. Both the acidic and basic polypeptides of a single subunit are coded for by a single gene. Hence, there are five non-allelic genes that code for the five glycinin subunits. These genes are designated Gy1, Gy2, Gy3, Gy4 and Gy5, corresponding to subunits G1, G2, G3, G4 and G5, respectively (Nielsen, N. C. et al. (1989) Plant Cell 1:313-328).
Genomic clones and cDNAs for glycinin subunit genes have been sequenced and fall into two groups based on nucleotide and amino acid sequence similarity. Group I consists of Gy1, Gy2, and Gy3, whereas Group II consists of Gy4 and Gy5. There is greater than 85% similarity between genes within a group (i.e., at least 85% of the nucleotides of Gy1, Gy2 and Gy3 are identical, and at least 85% of the nucleotides of Gy4 and Gy5 are identical), but only 42% to 46% similarity between the genes of Group I and those of Group II.
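The percentage figures above can be read as the share of aligned nucleotide positions at which two genes carry the same base. The following is a minimal sketch of that calculation, not the method used in the cited work. It assumes the two sequences have already been aligned to equal length, that gaps are written as "-", and that gapped columns are left out of the count; the function name `percent_identity` and the toy sequences are hypothetical and are not real Gy gene sequences.

```python
def percent_identity(seq_a: str, seq_b: str, gap: str = "-") -> float:
    """Percent of ungapped aligned positions where both sequences match.

    Assumption: columns in which either sequence has a gap are skipped,
    so they count neither as matches nor as mismatches.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == gap or b == gap:
            continue  # skip gapped columns
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped positions to compare")
    return 100.0 * matches / compared


# Made-up aligned fragments for illustration only.
toy_a = "ATGGCCAAGCTTGTTCTT"
toy_b = "ATGGCTAAGCTAGTT-TT"
print(round(percent_identity(toy_a, toy_b), 1))
```

Other conventions (for example, counting gaps as mismatches) give lower values, so identity figures are comparable only when computed the same way.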
β-Conglycinin (a 7S globulin) is a heterogeneous glycoprotein with a molecular weight ranging from 150 to 240 kDa. It is composed of varying combinations of three highly negatively charged subunits identified as α, α′ and β. cDNA clones representing the coding regions of the genes encoding the α and α′ subunits have been sequenced and are of similar size; sequence identity is limited to 85%. The sequence of the cDNA representing the coding region of the β subunit, however, is nearly 0.5 kb smaller than the α and α′ cDNAs. Excluding this deletion, sequence identity to the α and α′ subunits is 75-80%. The three classes of β-conglycinin subunits are encoded by a total of 15 subunit genes clustered in several regions within the soybean genome (Harada, J. J. et al. (1989) Plant Cell 1:415-425).
New soy-based products such as protein concentrates, isolates, and textured protein products are increasingly utilized in countries that do not necessarily accept traditional oriental soy-based foods. Use of these new products in food applications, however, depends on local tastes and on the functional characteristics of the protein products relative to recipe requirements. Over the past 10 years, significant effort has been aimed at understanding the functional characteristics of soybean proteins. Examples of functional characteristics include water sorption parameters, wettability, swelling, water holding, solubility, thickening, viscosity, coagulation, gelation characteristics and emulsification properties. A large portion of this body of research has focused on the study of the β-conglycinin and glycinin proteins individually, as well as on how each of these proteins influences the soy protein system as a whole (Kinsella, J. E. et al. (1985) New Protein Foods 5:107-179; Morr, C. V. (1987) JAOCS 67:265-271; Peng, L. C. et al. (1984) Cereal Chem 61:480-489). Because functional properties are directly related to the physicochemical properties of proteins, the structural differences between β-conglycinin and glycinin give these two proteins significantly different functional characteristics. Differences in thermal aggregation, emulsifying properties, and water holding capacity have been reported. Gelling properties vary as well, with glycinin forming gels that have greater tensile strain, stress, and shear strength, better solvent holding capacity, and lower turbidity. However, soy protein products produced today are a blend of both glycinin and β-conglycinin and therefore have functional characteristics that depend on the blend of the individual characteristics of glycinin and β-conglycinin. For example, when glycinin is heated to 100° C., about 50% of the protein is rapidly converted into soluble aggregates. Further heating results in the enlargement of the aggregates and in their precipitation. The precipitate consists of the glycinin's basic polypeptides; the acidic polypeptides remain soluble. The presence of β-conglycinin inhibits the precipitation of the basic polypeptides by forming soluble complexes with them. Whether heat denaturation is desirable or not depends on the intended use. If one could produce soy protein products containing just one or the other storage protein, products requiring specific physical characteristics derived from particular soy proteins would become available or would be more economical to produce.
Over the past 20 years, soybean lines lacking one or more of the various storage protein subunits (null mutations) have been identified in the soybean germplasm or produced using mutational breeding techniques. Breeding efforts to combine mutational events have resulted in soybean lines whose seeds contain about half the normal amount of β-conglycinin (Takahashi, K. et al. (1994) Breeding Science 44:65-66; Kitamura, J. (1995) JARQ 29:1-8). The reduction of β-conglycinin is controlled by three independent recessive mutations. Recombining glycinin subunit null mutations has resulted in lines whose seeds have significantly reduced amounts of glycinin (Kitamura, J. (1995) JARQ 29:1-8). Again, the reduction is controlled by three independent recessive mutations. Developing agronomically viable soybean varieties from the above lines, in which the seed contains only glycinin or only β-conglycinin, will be time-consuming and costly. Each cross will result in the independent segregation of the three mutational events. In addition, each mutational event will need to be in the homozygous state. Development of high-yielding, agronomically superior soybean lines will require the screening and analysis of a large number of progeny over numerous generations.
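The screening burden described above can be illustrated with simple Mendelian arithmetic. The sketch below assumes a scenario not spelled out in the text: a null line is crossed to a line carrying none of the mutations, the F1 is selfed, and the three recessive loci are unlinked and segregate in standard 1:2:1 ratios. Under those assumptions each locus is homozygous recessive in 1/4 of F2 plants, so all three are homozygous in (1/4)^3 = 1/64 of them. The function names are hypothetical.

```python
import math
from fractions import Fraction


def fraction_all_homozygous(n_loci: int) -> Fraction:
    """Expected F2 fraction homozygous recessive at every one of n unlinked loci."""
    return Fraction(1, 4) ** n_loci


def min_progeny(n_loci: int, confidence: float = 0.95) -> int:
    """Smallest F2 population giving at least `confidence` probability of
    containing one or more plants homozygous recessive at all loci."""
    p = float(fraction_all_homozygous(n_loci))
    # P(no such plant among n) = (1 - p) ** n; solve (1 - p) ** n <= 1 - confidence.
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - p))


print(fraction_all_homozygous(3))  # exact expected fraction for three loci
print(min_progeny(3))              # plants needed for a 95% chance of one recovery
```

This counts only recovery of the triple-null genotype in a single generation; selecting at the same time for yield and other agronomic traits would increase the number of progeny to be screened.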
Antisense technology has been used to reduce specific storage proteins in seeds. In Brassica napus, napin (a 2S albumin) and cruciferin (an 11S globulin) are the two major storage proteins, comprising about 25% and 60% of the total seed protein, respectively. Napin proteins are coded for by a large multigene family of up to 16 genes; several cDNA and genomic clones have been sequenced (Josefsson, L.-G. et al. (1987) J. Biol. Chem. 262:12196-12201; Schofield, S. and Crouch, M. L. (1987) J. Biol. Chem. 262:12202-12208). The genes exhibit greater than 90% sequence identity in both their coding and flanking regions. The cruciferin gene family is equally complex, comprising 3 subfamilies with a total of 8 genes (Rodin, J. et al. (1992) Plant Mol. Biol. 20:559-563). Kohno-Murase et al. ((1994) Plant Mol. Biol. 26:1115-1124) demonstrated that a napin antisense gene, constructed from the napA gene and driven by the napA promoter, could be used to produce transgenic plants whose seeds contained little or no napin.
The same group (Kohno-Murase et al. (1995) Theoret. Applied Genetics 91:627-631) attempted to reduce cruciferin (11S globulin) expression in Brassica napus by expressing an antisense form of a cruciferin gene (cruA, encoding an alpha 2/3 isoform) under the control of the napA promoter. In this case the results were more complex. The cruciferins are divided into three subclasses based on sequence identity (alpha 1, alpha 2/3, and alpha 4); the subclasses share 60% to 75% sequence identity with one another (Rodin, J. et al. (1992) Plant Mol. Biol. 20:559-563). Expression of the antisense gene encoding the alpha 2/3 isoform resulted in lower levels of the alpha 1 and alpha 2/3 forms. However, there was no reduction in the expression of the alpha 4 subclass.
Antisense technology has also been used to reduce the level of the seed storage protein glutelin in rice. Expression of the full-length antisense glutelin coding region, operably linked to the seed-specific glutelin promoter, resulted in about a 25% reduction in glutelin protein levels (U.S. Pat. No. 5,516,668).