Nitrogen-fixing root nodules of leguminous plants are formed as the result of root infection by rhizobia and subsequent development of a symbiosis between bacteria and plant. The development of the symbiosis is dependent on specific recognition between plant and bacterium, and it requires genetic information from both the plant and the bacteria.
Nodule development varies among legumes. Two different types of nodules are recognized, determinate and indeterminate. The nodules of soybean (Glycine max), for example, are determinate and spherical in shape. In contrast, the nodules of alfalfa (Medicago spp.), clover (Trifolium spp.), and pea (Pisum sativum) are indeterminate and elongated in shape. These nodules are also anatomically and metabolically distinct, reflecting differences in the process of nodule development which may be attributable to genetic differences between legumes as well as between the different species of Rhizobium which infect them.
In a description of nodule development, Vincent (J. M. Vincent (1980) in Nitrogen Fixation, eds. W. E. Newton and W. H. Orme-Johnson (University Park Press, Baltimore, Md.), Vol. 2, pp. 103-131) distinguishes among three different stages of nodule formation: preinfection, infection and nodule formation, and nodule function. In the preinfection stage, the Rhizobium cells recognize their host plants and attach to root hairs, an event which is followed by root hair curling. In the next stage, the bacteria enter the roots via infection threads while some cortical cells dedifferentiate to form meristem. The infection threads grow toward the meristematic cells. Bacteria are released into the cytoplasm of about half of these cells, and subsequently the bacterial cells develop into bacteroids. In the final stage, further differentiation of nodule cells occurs, leading to a nitrogen-fixing nodule.
Nodule-specific proteins, which are only expressed in root nodules, are likely to be associated with the infection process, nodule development, and symbiotic nitrogen fixation. Both proteins of plant origin (nodulins) and of bacteroid origin (bacteroidins) are found in nodules. Nodule-specific proteins have been identified in root preparations of soybean infected with Bradyrhizobium japonicum (R. P. Legocki and D. P. S. Verma (1980), supra) and pea (Pisum sativum) infected with Rhizobium leguminosarum PRE (T. Bisseling et al. (1983) EMBO J. 2:961-966). In each case, a nodule-specific antiserum was used to identify the nodule proteins by immunoprecipitation. Each of these antisera was produced by titration of an antiserum raised against soluble nodule proteins with a root preparation from uninfected plants. The drawbacks to these studies are that the plant or bacterial origin of the nodule-specific proteins could not be established and that the antigenicity of each protein affects the immunological analysis.
In soybean, the in vitro translation products of root nodule polysomes were analyzed with nodule-specific antiserum. Control experiments showed that bacterial RNA was not translated in the in vitro system. At least 18-20 host plant-derived polypeptides were identified having molecular weights in the range of 18-20 kDa. These proteins were absent from uninfected roots, bacteroids and free-living B. japonicum (R. P. Legocki and D. P. S. Verma (1980) Cell 20:153-163). In addition, bacteroids were isolated and incubated with [35S] methionine to label bacteroid proteins. Two polypeptides cross-reacted with nodule-specific antiserum. The bacteroid-excreted polypeptides had molecular weights of about 11 kDa (R. C. van den Bos et al. (1978) J. Gen. Microbiol. 109:131-139). Approximately 20 nodule-specific proteins were identified in pea root protein extracts by probing Western protein blots with nodule-specific antiserum. The proteins detected ranged in molecular weight from 15 to 120 kDa; however, the origin of these proteins was not determined. In these experiments the in vivo nodule proteins were identified (T. Bisseling et al. (1983), supra), while the soybean study analyzed potentially truncated products of in vitro translation.
Verma and co-workers have also isolated soybean nodulin cDNA clones (F. Fuller et al. (1983) Proc. Natl. Acad. Sci. USA 80:2594-2598). Those clones were used to hybrid-select NOD mRNAs from nodule RNA preparations; mRNAs of about 1150, 770, and 3150 nucleotides in length yielded in vitro translation products of 27, 24, and 100 kDa, respectively. Two additional clones, which shared some homology with each other, were identified; they hybrid-selected mRNAs of 1600 and 1100 nucleotides in length with in vitro translation products of 23.5 and 24.5 kDa, respectively (F. Fuller and D. P. S. Verma (1984) Plant Mol. Biol. 3:21-28).
Nodule mRNA from different stages of developing pea nodules was studied by in vitro translation of the RNA followed by separation of translation products by two-dimensional gel electrophoresis. Twenty-one nodule-specific proteins were found, with molecular weights ranging from 15 to 80 kDa (F. Govers et al. (1985) EMBO J. 4:861-867).
Among the nodulins with known functions are leghemoglobin (C. A. Appleby (1984) Ann. Rev. Plant Physiol. 35:443-478), a nodule-specific glutamine synthetase (J. V. Cullimore et al. (1983) Planta 157:245-253), and a nodule-specific form of uricase (M. Bergmann et al. (1983) EMBO J. 2:2333-2339). The functions of most nodulins have not been defined. Nodulins may have specific functions in the formation of nodule tissue after the dedifferentiation and proliferation of cortical cells, in the transport of substrates to the bacteroids, in the assimilation of ammonia excreted by the bacteroids, or in the senescence of nodule tissue.
A cDNA library prepared from mature (21 day) soybean root nodules infected with Bradyrhizobium japonicum has been analyzed for copies of mRNA transcripts of early (7 day) nodulin genes (H. Franssen et al. (1987) Proc. Natl. Acad. Sci. USA 84:4495-4499). These genes are expressed while the nodule structure is being formed. pEnod2, the cDNA clone whose insert encodes nodulin-75 (N-75), was sequenced. The 998 bp insert includes a short poly(A) tail and encodes a proline-rich protein. Nodule mRNA of about 1200 nucleotides in length was hybrid-selected and translated in vitro to give two polypeptides, each with an M_r of about 75 kDa. The coding capacity of the mRNAs is significantly less than 75 kDa, but proline-rich proteins, such as collagen, are known to have anomalous behavior on polyacrylamide gels (J. W. Freytag et al. (1979) Biochemistry 18:4761-4768). N-75 expression was first detected at day 7 of nodule development, when nodule meristem emerges through the root epidermis, with apparent expression increasing up to about day 13. Expression was observed in R. fredii-induced ineffective nodules without infection threads or bacteroids, so N-75 is likely to be involved in nodule morphogenesis rather than in the infection process per se (H. Franssen et al. (1987), supra).
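The coding-capacity argument can be illustrated with a rough calculation. The sketch below is not taken from the cited work. It assumes the common rule-of-thumb average of about 110 Da per amino acid residue, and it treats the whole mRNA as coding sequence, so the result is only an upper bound. The function name and constant are hypothetical.

```python
# Assumption: ~110 Da is a rule-of-thumb average mass per amino acid residue.
AVG_RESIDUE_MASS_DA = 110.0


def max_coding_capacity_kda(mrna_length_nt, avg_residue_mass_da=AVG_RESIDUE_MASS_DA):
    """Upper bound on the mass (kDa) of a protein encoded by an mRNA.

    Every nucleotide is counted as coding, so untranslated regions,
    the stop codon, and the poly(A) tail are ignored; the true coding
    capacity is therefore smaller than the value returned here.
    """
    codons = mrna_length_nt // 3
    return codons * avg_residue_mass_da / 1000.0


if __name__ == "__main__":
    upper_bound = max_coding_capacity_kda(1200)
    print(f"Upper bound for a 1200 nt mRNA: {upper_bound:.1f} kDa")
    print(f"Apparent M_r on gels: 75 kDa (ratio {75 / upper_bound:.2f})")
```

Even with every nucleotide counted as coding, a 1200 nucleotide message falls well short of 75 kDa under this assumption, which is consistent with attributing the apparent M_r to anomalous gel migration.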
There is a growing understanding of the DNA sequence elements which control gene expression. The following discussion applies to plant genes which are transcribed by RNA polymerase II. There are known sequences which direct the initiation of mRNA synthesis, those which control transcription in response to environmental stimuli, those which modulate the level of transcription, and those which regulate gene expression in a tissue-specific fashion.
Promoters are the portions of DNA sequence at the beginnings of genes which contain the signals for RNA polymerase to begin transcription so that protein synthesis can then proceed. Eukaryotic promoters are complex, and are comprised of components which include a TATA box consensus sequence in the vicinity of -30 and often a CAAT box consensus sequence further upstream, at about -75, relative to the transcription start site (+1) (R. Breathnach and P. Chambon (1981) Ann. Rev. Biochem. 50:349-383; C. Kuhlemeier et al. (1987) Ann. Rev. Plant Physiol. 38:221-257). In plants there may be substituted for the CAAT box a consensus sequence which J. Messing et al. (1983) in Genetic Engineering of Plants, T. Kosuge, C. Meredith, and A. Hollaender, eds., have termed the AGGA box, positioned a similar distance from the cap site (+1). Other sequences in the 5' regions of genes are known which regulate the expression of downstream genes. There are sequences which participate in the response to environmental conditions, such as illumination, nutrient availability, hyperthermia, anaerobiosis, or the presence of heavy metals. There are also signals which control gene expression during development, or in a tissue-specific fashion. Promoters are usually positioned 5' to, or upstream of, the start of the coding region of the corresponding gene, and the DNA tract containing the promoter sequences and the ancillary promoter-associated sequences affecting regulation or the absolute levels of transcription may be comprised of less than 100 bp or as much as 1 kbp.
As defined by G. Khoury and P. Gruss (1983) Cell 33:313-314, an enhancer is one of a set of eukaryotic promoter-associated elements that appears to increase transcriptional efficiency in a manner relatively independent of position and orientation with respect to the nearby gene. The prototype enhancer is found in the animal virus SV40. Generally, animal or animal virus enhancers can function over a distance of as much as 1 kbp 5', in either orientation, and can act 5' or 3' to the gene. The identifying sequence motif (5'-GTGGAAA(or TTT)G-3') is generally reiterated. There have been sequences identified in or adjacent to plant genes which have homology to the core consensus sequence of the SV40 enhancer, but the functional significance of these sequences in plants has not been determined.
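A search for the enhancer core motif in either orientation can be sketched as follows. This is a minimal illustration, not a method from the cited references. It assumes the motif is read as GTGG followed by AAA or TTT and then G, scans both strands, and uses hypothetical function names. A match indicates sequence homology only and says nothing about function.

```python
import re

# Core motif as written in the text: GTGG, then AAA or TTT, then G.
CORE = re.compile(r"GTGG(?:AAA|TTT)G")

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_core_motifs(seq):
    """List (start, strand, match) for core motifs on both strands.

    start is the 0-based index, on the given strand, of the leftmost
    base covered by the match. Overlapping matches on the same strand
    are not reported, since re.finditer is non-overlapping.
    """
    seq = seq.upper()
    n = len(seq)
    hits = [(m.start(), "+", m.group()) for m in CORE.finditer(seq)]
    for m in CORE.finditer(reverse_complement(seq)):
        # Map the match back to coordinates on the given strand.
        hits.append((n - m.end(), "-", m.group()))
    return sorted(hits)


if __name__ == "__main__":
    # Hypothetical test sequence, not taken from any gene.
    for hit in find_core_motifs("AAGTGGAAAGCCTTCAAACCACTT"):
        print(hit)
```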
There are also reports of enhancer-like elements 5' to certain constitutive and inducible genes of plants. J. Odell et al. (1985) Nature 313:810-812, describe a stretch of about 100 bp 5' to the start site of the CaMV 35S transcript which is necessary for increasing the level of expression of a reporter gene in chimeric constructions. Two different transcription activating elements which can function in plants are derived from the 780 gene and the ocs gene of Agrobacterium tumefaciens T-DNA (W. Bruce and W. Gurley (1987) Mol. Cell. Biol. 7:59-67; J. Ellis et al. (1987) EMBO J. 6:11-16). Regulated enhancer-like elements include those believed to mediate tissue-specific expression and response to illumination (M. Timko et al. (1985) Nature 318:579-582; H. Kaulen et al. (1986) EMBO J. 5:1-8; J. Simpson et al. (1985) EMBO J. 4:2723-2729; J. Simpson et al. (1986) Nature 323:551-554; R. Fluhr et al. (1986) Science 232:1106-1112).
The molecular mechanisms which regulate the expression of nodulin genes are not yet defined. V. P. Mauro et al. (1985) Nucleic Acids Res. 13:239-249, have analyzed the 5' flanking sequences of three nodulin genes of soybean for conserved DNA sequence motifs. They found three conserved sequence motifs: consensus sequence a, 5'-GTTTCCCT-3'; consensus sequence b, 5'-GGTAGTG-3'; and consensus sequence c, 5'-TCTGGGAAA-3'. Whether these sequences function in the regulation of the nodulin genes is not known, and if they do, the stimuli which elicit expression are not known. The molecular mechanisms controlling the expression of Enod2 genes in soybean are also not known, but F. Govers et al. (1986) Nature 323:564-566, have shown that in developing pea root nodules, Rhizobium leguminosarum nod genes or adjacent genes carried on a 10 kb region of the Sym plasmid are involved in inducing an early nodulin gene which is homologous to the Enod2 gene of soybean.
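Locating the three consensus sequences in a 5' flanking region can be sketched as below. This is an illustrative exact-match scan, not the analysis used by Mauro et al. It assumes the supplied flank ends at position -1, immediately upstream of the transcription start site. The function name and the example sequence are hypothetical.

```python
# Consensus sequences a, b, and c as given in the text.
CONSENSUS = {"a": "GTTTCCCT", "b": "GGTAGTG", "c": "TCTGGGAAA"}


def scan_flank(flank):
    """Map each consensus name to the positions where it occurs.

    Assumption: the last base of flank is position -1 relative to the
    transcription start site, so the first base is at -len(flank).
    Each reported position is that of the motif's 5'-most base.
    Only exact matches on the given strand are reported.
    """
    flank = flank.upper()
    n = len(flank)
    found = {}
    for name, motif in CONSENSUS.items():
        positions = []
        i = flank.find(motif)
        while i != -1:
            positions.append(i - n)  # index 0 maps to -n
            i = flank.find(motif, i + 1)
        found[name] = positions
    return found


if __name__ == "__main__":
    # Hypothetical flank for illustration only, not a real nodulin sequence.
    print(scan_flank("AAGTTTCCCTAAGGTAGTGAA"))
```

An exact-match scan will miss degenerate copies of a motif; a tolerance for mismatches would be a natural extension if near-matches are of interest.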
Jensen et al. (1986) Nature 321:669-674, transformed the wild legume Lotus corniculatus with a leghemoglobin-CAT chimeric construct. Roots were infected with a strain of Agrobacterium rhizogenes, and transformed plants containing the hybrid gene were obtained. Upon infection with Rhizobium loti, nodules were formed that expressed the introduced CAT gene in a fashion that was correct by all criteria applied.