Arabinogalactan proteins (AGPs) are found in flowering plants from every taxonomic group tested. These proteoglycans are widely distributed in most higher plants, occurring in almost all tissues including leaves, stems, roots, floral parts, seeds, and in many of their secretions. These macromolecules are found predominantly in soluble form in the intercellular wall space [Clarke et al. (1975) J. Cell Sci. 19:157-167; Clarke et al. (1978) Aust. J. Plant Physiol. 5:707-722], and are also localized in cytoplasmic organelles [Anderson et al. (1977) Aust. J. Plant Physiol. 4:143-158], at the protoplast surface [Clarke et al. (1975), supra and (1978), supra; Komalavilas et al. (1991) J. Biol. Chem. 266:15956-15965; Pennell et al. (1991) Plant Cell 3:1317-1326; Kieliszewski et al. (1992) Plant Physiol. 99:538-547] and in the cell wall [Bacic et al. (1988), The Biochemistry of Plants, Preiss, J. ed., Vol. 14, pp. 297-371 Academic Press, San Diego; Knox (1990) J. Cell Science 96:557-561; Roberts (1990) Current Opinion in Cell Biology 2:920-928; Knox (1992) Protoplasma 167:1-9; Pennell (1992) Soc. Expt. Biol. Seminar Series 48: Perspectives in Plant Cell Recognition (ed. J. A. Callow and J. R. Green) Cambridge University Press pp. 105-121; Showalter (1993) Plant Cell 5:9-23; Wycoff et al. (Ref.)].
In cell cultures, AGPs are secreted into the medium [Fincher et al. (1983), Ann. Rev. Plant Physiol. 34:47-70]. Several AGPs from culture media have been investigated, including those from ryegrass cells [Anderson et al. (1977), supra; Gleeson et al. (1989) Biochem. J. 264:857-862]; tobacco cells [Akiyama et al. (1981) Phytochemistry 20:2507-2510]; blackberry cells [Cartier et al. (1987) Carbohydrate Res. 168:275-283]; sycamore cells [Aspinall et al. (1969) Can. J. Biochem. 47:1063-1070]; carrot cells [Jermyn et al. (1985) AGP News 5:4-25; Kreuger et al. (1993) Planta 189:243-248]; Rosa cell suspension culture [Komalavilas et al. (1991) J. Biol. Chem. 266:15956-15965]; gladiolus cells [Gleeson et al. (1979) Biochem. J. 181:607-621]; and maize cells [Kieliszewski et al. (1992) Plant Physiol. 99:538-547].
The multi-site localization of AGPs appears to be analogous to that of some animal proteoglycans. In chemical structure, however, plant AGPs and animal proteoglycans appear to have little in common.
The AGPs are a family of structurally related glycosylated molecules containing high proportions of carbohydrate and usually less than 10 percent by weight of protein [Clarke et al. (1979) Phytochemistry 18:521-540; Fincher et al. (1983), supra], although AGPs having a protein content of about 59% are known [Fincher et al. (1983), supra; Anderson et al. (1979) Phytochem. 18:609-610]. The carbohydrate consists of polysaccharide chains having a 1,3-β-D-galactopyranosyl backbone and side chains of (1,3-β- or 1,6-β-)D-galactopyranosyl (Gal) residues, often terminating in β-D-Galp and α-L-arabinofuranosyl (Araf) residues [Kreuger et al. (1993) Planta 189:243-248]. Other neutral sugars and uronic acids have also been detected, although at low levels. Monosaccharides which can be present are L-rhamnopyranose, D-mannopyranose, D-xylopyranose, D-glucopyranose, D-glucuronic acid and its 4-O-methyl derivative, and D-galacturonic acid and its 4-O-methyl derivative [Fincher et al. (1983), supra]. In most cases, however, Gal and Ara predominate.
The protein content is usually between two and ten percent [Fincher et al. (1983), supra]. In contrast with the polysaccharide component, relatively little is known about the structure and organization of the protein core of AGPs, except that the protein appears to have domains rich in alanine, hydroxyproline, serine, and threonine [Fincher et al. (1983), supra]. This is reflected in the amino acid sequences that were obtained for AGP peptide fragments from carrot [Jermyn et al. (1985), supra]; Italian ryegrass [Gleeson et al. (1989), supra]; and rose [Komalavilas et al. (1991), supra]. A common feature of many of these isolated peptide fragments is the dipeptide Ala-Hyp, which is directly repeated in various AGP peptide fragments. To date, the entire amino acid sequence of an intact isolated AGP is not available publicly. The high carbohydrate content of AGPs appears to cause difficulties in sequencing; attempts to chemically remove the carbohydrate moiety usually result in incomplete deglycosylation and products with variable levels of carbohydrate content. The carbohydrate-protein linkage has been identified as a β-galactosyl-hydroxyproline linkage in AGPs isolated from wheat and ryegrass [Gleeson et al. (1985) AGP News 5:30-36 and McNamara and Stone (1981) Lebensm.-Wiss. u-Technol. 14:182-187].
AGPs are components of gum arabic, a gummy exudate of the Acacia tree that is known to be produced under stress conditions such as heat, drought, and wounding [Clarke et al. (1979) Phytochemistry 18:521-540]. The gum finds wide use as a flavor encapsulator in dry mix products such as puddings, desserts, cake mixes and soup mixes, and is also used to emulsify essential oils in soft drinks and to prevent sugar crystallization in confectionery products [Randall et al. (1989) Food Hydrocolloids 3:65-75]. More recently, the significance of the protein component to the overall structural and functional characteristics of gums has been realized [Vandevelde et al. (1985) Carbohydr. Polymers 5:251-273; Connolly et al. (1987) Food Hydrocolloids 1:477-480 and Connolly et al. (1988) Carbohydr. Polymers 8:23-32]. The importance of the protein-rich fraction to the emulsification properties of the gum has been demonstrated [Randall et al. (1988) Food Hydrocolloids 2:131-140].
AGPs function in several biological processes including plant development, cell-cell adhesion, pollen-stigma recognition, water retention, and disease resistance. AGPs may serve as glues or provide nutrients for growing pollen tubes. It has been suggested [Fincher et al. (1983), supra] that AGPs may interact with lectins or other proteins in the extracellular spaces and may be involved in the cellular response to extracellular oligosaccharide signal molecules [Norman et al. (1990) Planta 181:365-373]. Since AGPs interact with Yariv antigens and flavonol glycosides [Jermyn (1978) J. Plant Physiol. 5:563-571], they have been thought to have lectin-like properties. The molecular structure of AGPs has been proposed [Randall et al. (1989) Food Hydrocolloids 3:65-75] to resemble a type of block copolymer wherein carbohydrate blocks are covalently linked to a central polypeptide chain, thus explaining their ability to sterically stabilize emulsions and dispersions.
Plant AGP genes are not known in the prior art and the nucleotide sequence of a plant AGP gene has not been published to date. Very recently, it was reported [Sheng et al. (1993) Abstract no. 639 in Supplement to Plant Physiol. 102, Number 1, May 1993] that a PCR strategy is being used to clone potato tuber lectin, extensins and AGP sequences from a potato tuber cDNA library. It was reported that PCR products which hybridized to a carrot extensin probe gave several putative clones which are currently under investigation. No clones corresponding to AGP genes were disclosed.
The process of obtaining an AGP clone has been found to be complex and problematic. Two of the problems associated with AGPs and their genes are (1) the very high redundancy associated with the characteristic amino acid sequence of an AGP peptide, i.e., (a) a high hydroxyproline content and (b) regions containing a high content of hydroxyproline, alanine, serine, and threonine (OAST); and (2) the GC-richness of the corresponding oligonucleotides, which leads to problems with the specificity of hybridization. Imprecise alignment during nucleic acid hybridization, for example in the PCR technique, results in the amplification of sequences that do not correspond to the original template, and attempts to obtain an AGP clone have consequently been unsuccessful. Plants are also known to contain a variety of glycine-rich proteins, which are likewise encoded by GC-rich DNA. Applicants' disclosure circumvents these problems and enables the isolation of AGP genes.
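The redundancy and GC-richness described above can be illustrated with a short calculation. The following Python sketch is illustrative only and is not part of the disclosed method. It assumes the standard genetic code, back-translates hydroxyproline (O) with the proline codons because hydroxyproline arises post-translationally, and uses a hypothetical OAST-type peptide, Ala-Hyp-Ala-Hyp-Ser-Hyp, that is not taken from any sequenced AGP. The function names `degeneracy` and `gc_range` are likewise chosen here for illustration.

```python
from math import prod

# Standard genetic code codons for the four OAST residues.
# Assumption: hydroxyproline (O) is back-translated as proline,
# since it is formed from proline after translation.
CODONS = {
    "O": ["CCT", "CCC", "CCA", "CCG"],                # Hyp, encoded as Pro
    "A": ["GCT", "GCC", "GCA", "GCG"],                # Ala
    "S": ["TCT", "TCC", "TCA", "TCG", "AGT", "AGC"],  # Ser
    "T": ["ACT", "ACC", "ACA", "ACG"],                # Thr
}


def gc_count(codon):
    """Number of G or C bases in a single codon."""
    return codon.count("G") + codon.count("C")


def degeneracy(peptide):
    """Number of distinct oligonucleotides that could encode the peptide."""
    return prod(len(CODONS[aa]) for aa in peptide)


def gc_range(peptide):
    """Lowest and highest GC fraction over all possible back-translations."""
    low = sum(min(gc_count(c) for c in CODONS[aa]) for aa in peptide)
    high = sum(max(gc_count(c) for c in CODONS[aa]) for aa in peptide)
    length = 3 * len(peptide)
    return low / length, high / length


if __name__ == "__main__":
    peptide = "AOAOSO"  # hypothetical Ala-Hyp-Ala-Hyp-Ser-Hyp fragment
    low, high = gc_range(peptide)
    print(f"{peptide}: {degeneracy(peptide)} possible 18-mers, "
          f"GC content {low:.0%} to {high:.0%}")
```

Under these assumptions, even a six-residue fragment corresponds to several thousand candidate oligonucleotides, every one of which is more than 60% GC, which is consistent with the specificity problems noted above.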
Two approaches to the isolation of AGPs from plant extracts have been used in previous studies. One approach consists of classical fractionation of plant extracts [Fincher et al. (1974) Aust. J. Biol. Sci. 27:117-132; Aspinall (1969) Adv. Carbohydrate Chem. 24:333-379]. A convenient initial fractionation of extracts is treatment to saturation with (NH4)2SO4, which does not usually precipitate AGPs. Subsequent ion-exchange and affinity chromatography can be used to isolate the AGPs.
Another approach to the isolation of AGPs from plant extracts is precipitation with a class of dyes prepared by coupling diazotized 4-aminophenyl glycosides to phloroglucinol [Jermyn et al. (1975), supra]. These dyes were first prepared by Yariv et al. [(1962) Biochem. J. 85:383-388] as precipitating antigens for antibodies to glycoside determinants, and the β-glycosyl artificial carbohydrate antigen was shown to precipitate an arabinose-and-galactose-containing polymer from soya bean, jack bean and maize [Yariv et al. (1967) Biochem. J. 105:1c-2c]. Since then, this precipitation reaction has been widely used to isolate AGPs from extracts of seeds of every taxonomic group of flowering plants, as well as leaf extracts and callus-culture filtrates [Jermyn & Yeow (1975) Aust. J. Plant Physiol. 2:501-531; Anderson et al. (1977), supra; and review by Clarke et al. (1979), Phytochemistry 18:521-540].
These dyes have also been used as cytochemical reagents for the localization of AGPs in plant tissues [Clarke et al. (1975), J. Cell Sci. 19:157-167; Clarke et al. (1978), Q. Rev. Biol. 53:3-28]. The nature of the binding of AGP to the Yariv reagent is not understood, but it is likely to involve both carbohydrate and protein residues. The binding of the Yariv reagent to AGP is not affected by removal of the arabinose residues [Gleeson et al. (1979), supra; Akiyama et al. (1981), supra], but is abolished by progressive acid hydrolysis of the AGP [Fincher et al. (1983), supra].
In higher plants, AGPs are also classified as belonging to a group of proteins characterized by hydroxyproline-rich domains. These hydroxyproline-rich glycoproteins (HRGPs) are also characterized by carbohydrate side chains that contain arabinose and galactose. The group has traditionally been divided into three main classes: the cell wall-associated extensins, the soluble arabinogalactan-proteins (AGPs), and the solanaceous lectins. The differences between these groups are summarized in Table 1.0. The most important factors in the classification of the HRGPs are the amount, composition, and sequence of the carbohydrate component; the sequence and composition of the polypeptide backbone; the linkage between carbohydrate and protein; and the localization of the glycoprotein.
A new group of proteins, the proline-rich proteins, has been described recently. The proline-rich proteins (PRPs) have also been referred to as the hydroxyproline/proline-rich proteins or the repetitive proline-rich proteins. Amino acid compositions of some PRPs [Averyhart-Fullard et al. (1988) Proc. Natl. Acad. Sci. USA 85:1082-1085; Datta et al. (1989) Plant Cell 1:945-952; Kleis-San Francisco et al. (1990) Plant Physiol. 94:1897-1902] indicated equimolar amounts of proline and hydroxyproline. However, the PRPs do not appear to be glycosylated and, in this way, are distinguished from the HRGPs (hydroxyproline-rich glycoproteins).
As indicated in Table 1.0, AGPs are readily distinguished from extensin and lectin HRGPs. Extensins are highly positively charged HRGPs, are rich in hydroxyproline, lysine, tyrosine, serine, and proline, possess carbohydrate side chains that are rich in arabinose, and are tightly associated with cell walls. The high lysine content of the extensins contributes to their positive charge, and the tyrosine in extensin may form intramolecular [Stafstrom and Staehelin (1986) Plant Physiol. 81:234-241] and intermolecular isodityrosine linkages that have been implicated in cross-linking extensin in vitro [Everdeen et al. (1988) Plant Physiol. 87:616-621] and in vivo [Cooper and Varner (1983) Biochem. Biophys. Res. Comm. 112:161-167; Biggs and Fry (1990) Plant Physiol. 92:197-204].
Hydroxyproline accounts for 30-50% of the amino acids in extensin and is found in short peptides that are repeated a number of times in the molecule. The core peptide that is most commonly encoded by extensin genes is Ser(Pro)4 (Table 1.0), which may be post-translationally modified to Ser(Hyp)4. Recently, amino acid sequences have been obtained from extensin-like molecules that do not contain the Ser(Hyp)4 peptides [Kieliszewski et al. (1990) Plant Physiol. 92:316-326; Li et al. (1990) Plant Physiol. 92:327-333].
The carbohydrate side chains of extensins consist of short arabinosides linked to hydroxyproline, and single galactose residues linked to serine. The function of the carbohydrate side chains of extensins is not clear, but there is some evidence that they stabilize the polyproline II helix, which gives extensin its characteristic rod-like shape [Stafstrom and Staehelin (1986) Plant Physiol. 81:242-246].
The solanaceous lectins are positively charged glycoproteins that are identical to the extensins in the composition and structure of their carbohydrate side chains. Two important features distinguish the solanaceous lectins from extensins: their localization in the vacuole and cytoplasm [Millar et al. (1992) Biochem. J. 283:813-821], and their relatively high cysteine content [10-12 mol %; Showalter (1993) Plant Cell 5:9-23]. The cysteine in the potato lectin is concentrated in a single domain of the molecule that contains the carbohydrate binding site, and is distinct from the domain that is rich in hydroxyproline and glycosylated [Ashford et al. (1982) Biochem. J. 201:641-645]. The different lectins are immunologically cross-reactive [Kilpatrick et al. (1980) Biochem. J. 185:269-272], and contain both carbohydrate and protein epitopes [Ashford et al. (1982), supra].
The features that distinguish the AGPs from the extensins and solanaceous lectins are listed in Table 1.0. The AGPs usually have a negative to neutral overall charge, and are soluble in aqueous buffers. A characteristic feature of the AGPs is their ability to bind β-glucosyl Yariv reagent, whereas extensins and lectins do not bind the Yariv reagent.
Carbohydrate forms a major portion of the mass of AGPs [Clarke et al. (1979), supra; Fincher et al. (1983), supra]. The majority of the AGPs that have been chemically characterized contain less than 10% (w/w) protein [Clarke et al. (1979), supra; Fincher et al. (1983), supra], but the AGPs from Cannabis sativa leaves (25% [w/w] protein), rice bran (27% [w/w] protein), and sycamore suspension cultures (19-38% [w/w] protein), are notable exceptions [Clarke et al. (1979), Phytochem. 18:521-540]. The protein backbones of AGPs often contain domains that are rich in hydroxyproline, alanine, serine, and threonine.