There are more than 300,000 species of plants. They show a wide diversity of forms, ranging from delicate liverworts, adapted for life in a damp habitat, to cacti, capable of surviving in the desert. The plant kingdom ranges from herbaceous plants, such as corn, whose life cycle is measured in months, to the giant redwood tree, which can live for thousands of years. This diversity reflects the adaptations of plants to survive in a wide range of habitats. This is seen most clearly in the flowering plants (phylum Angiospermophyta), which are the most numerous, with over 250,000 species. They are also the most widespread, being found from the tropics to the arctic.
The process of plant breeding, in which man intervenes in natural breeding and selection, is some 20,000 years old. It has produced remarkable advances in adapting existing species to serve new purposes. The world's economy was largely based on the successes of agriculture for most of these 20,000 years.
Plant breeding involves choosing parents, making crosses to allow recombination of genes (alleles) and searching for and selecting improved forms. Success depends on the genes/alleles available, the combinations required and the ability to create and find the correct combinations necessary to give the desired properties to the plant. Molecular genetics technologies are now capable of providing new genes, new alleles and the means of creating and selecting plants with the new, desired characteristics.
When the molecular and genetic bases for different plant characteristics are understood, a wide variety of polynucleotides (both endogenous polynucleotides and created variants), polypeptides, cells, and whole organisms can be exploited to engineer old and new plant traits in a vast range of organisms including plants. These traits can range from observable morphological characteristics, through adaptation to specific environments, to biochemical composition and to molecules that the plants (organisms) exude. Such engineering can range from tailoring existing traits, such as increasing the production of taxol in yew trees, to combining traits from two different plants into a single organism, such as inserting the drought tolerance of a cactus into a corn plant. Molecular and genetic knowledge also allows the creation of new traits, for example, the production of chemicals and pharmaceuticals that are not native to particular species or to the plant kingdom as a whole.
This application reports the inventions Applicants have made in building a foundation of scientific understanding of plant genomes to achieve these aims. These inventions include polynucleotide and polypeptide sequences, data relating to where and when the genes are differentially expressed, and phenotypic observations resulting from either aberrant gene activation or disruption. The instant application also explains how these data are transformed into a scientific understanding of plant biology and of the control of traits from a genetic perspective. Applications of these discoveries to create new prototypes and products in the fields of chemical, pharmaceutical, food, feed, and fiber production are described herein as well.
The achievements described in this application were possible because of the results from a cluster of technologies, a genomics engine, depicted below in FIG. 1, that allows information on each gene to be integrated to provide a more comprehensive understanding of gene structure and function and the deployment of genes and gene components to make new products.
I. The Discoveries of the Instant Application
Applicants have isolated and identified over one hundred thousand genes, gene components and their products and thousands of promoters. Specific genes were isolated and/or characterized from arabidopsis, soybean, maize, wheat and rice. These species were selected because of their economic value and scientific importance and were deliberately chosen to include representatives of the evolutionarily divergent dicotyledonous and monocotyledonous groups of the plant kingdom. The number of genes characterized in this application represents a large proportion of all the genes in these plant species.
The techniques used initially to isolate and characterize most of the genes, namely sequencing of full-length cDNAs, were deliberately chosen to provide information on complete coding sequences and on the complete sequences of their protein products.
Gene components and products the Applicants have identified include exons, introns, promoters, coding sequences, antisense sequences, terminators and other regulatory sequences. The exons are characterized by the proteins they encode, and arabidopsis promoters are characterized by their position in the genomic DNA relative to where mRNA synthesis begins and by the cells in which, and the extent to which, they promote mRNA synthesis. Further exploitation of molecular genetics technologies has helped the Applicants to understand the functions and characteristics of each gene and its role in a plant. Three powerful molecular genetics approaches were used to this end:
    (a) Analyses of the phenotypic changes when the particular gene sequence is interrupted or activated differentially (arabidopsis);
    (b) Analyses of in what plant organs, to what extent, and in response to what environmental signals mRNA is synthesized from the gene (arabidopsis and maize); and
    (c) Analysis of the gene sequence and its relatives (all species).
These were conducted using the genomics engine depicted in FIG. 1 that allows information on each gene to be integrated to provide a more comprehensive understanding of gene structure and function and linkage to potential products.
The species arabidopsis was used extensively in these studies for two reasons: (1) the complete genomic sequence, though poorly annotated in terms of gene recognition, was being produced and published by others and (2) genetic experiments to determine the role of the genes in planta are much quicker to complete in arabidopsis than in the other species studied.
The phenotypic tables, MA tables, reference tables and sequence tables indicate the results of these analyses and thus the specific functions and characteristics that are ascribed to the genes and gene components and products.
II. Integration of Discoveries to Provide Scientific Understanding
From the discoveries made, Applicants have deduced the biochemical activities, pathways, cellular roles, and developmental and physiological processes that can be modulated using these components. These are discussed and summarized in sections organized by the gene functions and characteristics revealed by the analyses and by the genes' roles in determining phenotypes. These sections illustrate and emphasize that each gene, gene component or product influences biochemical activities, cells or organisms in complex ways, from which there can be many phenotypic consequences.
An illustration of how the discoveries on gene structure, function, expression and phenotypic observation can be integrated to understand complex phenotypes is provided in FIG. 2. This sort of understanding enables conclusions to be drawn as to how the genes, gene components and products are useful for changing the properties of plants and other organisms. This example also illustrates how single gene changes in, for example, a metabolic pathway can cause gross phenotypic changes.
Furthermore, the development and properties of one part of a plant can be interconnected with other parts. The dependence of shoot and leaf development on root cells is a classic example. Here, shoot growth and development require nutrients supplied from roots, so the protein complement of root cells can affect plant development, including flower and seed production. Similarly, root development is dependent on the products of photosynthesis from leaves. Therefore, proteins in leaves can influence root developmental physiology and biochemistry.
Thus, the following sections describe both the functions and characteristics of the genes, gene components and products and also the multiplicity of biochemical activities, cellular functions, and the developmental and physiological processes influenced by them. The sections also describe examples of commercial products that can be realized from the inventions.
A. Analyses to Reveal Function and In Vivo Roles of Single Genes in One Plant Species
The genomics engine has focused on individual genes to reveal the multiple functions or characteristics that are associated with each gene, gene component and product of the instant invention in the living plant. For example, the biochemical activity of a protein is deduced based on its similarity to a protein of known function. In this case, the protein may be ascribed, for example, an oxidase activity. Where and when this same protein is active can be uncovered from differential expression experiments, which show that the mRNA encoding the protein is differentially expressed in response to drought and in seeds but not roots. The gene disruption experiments reveal that absence of the same protein causes embryo lethality.
Thus, this protein is characterized as a seed protein and drought-responsive oxidase that is critical for embryo viability.
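The integration step in the example above can be sketched in Python as follows. This is only an illustrative sketch: the record layout, the field names and the identifier "gene_X" are hypothetical and are not taken from the application, and the summary format is an assumption chosen to mirror the wording of the example.

```python
from dataclasses import dataclass, field
from typing import Optional, Set


@dataclass
class GeneEvidence:
    """Evidence gathered for one gene (all field names are hypothetical)."""
    gene_id: str
    # Activity deduced from similarity to a protein of known function.
    homology_activity: Optional[str] = None
    # Organs in which the mRNA is differentially expressed.
    expressed_in: Set[str] = field(default_factory=set)
    # Environmental signals to which expression responds.
    responsive_to: Set[str] = field(default_factory=set)
    # Phenotype observed when the gene is disrupted.
    disruption_phenotype: Optional[str] = None


def characterize(ev: GeneEvidence) -> str:
    """Combine the three kinds of evidence into one short description."""
    traits = [f"{organ}-expressed" for organ in sorted(ev.expressed_in)]
    traits += [f"{signal}-responsive" for signal in sorted(ev.responsive_to)]
    activity = ev.homology_activity or "protein of unknown activity"
    summary = " ".join([", ".join(traits), activity]).strip()
    if ev.disruption_phenotype:
        summary += f"; disruption phenotype: {ev.disruption_phenotype}"
    return summary


# Hypothetical record mirroring the oxidase example in the text.
example = GeneEvidence(
    gene_id="gene_X",
    homology_activity="oxidase",
    expressed_in={"seed"},
    responsive_to={"drought"},
    disruption_phenotype="embryo lethal",
)
print(characterize(example))
```

Keeping each line of evidence in its own field means that a gene with only one or two kinds of data can still be summarized, with the remaining fields left empty.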
B. Analyses to Reveal Function and Roles of Single Genes in Different Species
The genomics engine has also been used to extrapolate knowledge from one species to many plant species. For example, proteins from different species, capable of performing identical or similar functions, preserve many features of amino acid sequence and structure during evolution. Complete protein sequences have been compared and contrasted within and between species to determine the functionally vital domains and signatures characteristic of each of the proteins that is the subject of this application. Thus, functions and characteristics of arabidopsis proteins have been extrapolated to proteins containing similar domains and signatures of corn, soybean, rice and wheat and by implication to all other (plant) species.
FIG. 3 provides an example. Two proteins with related structures, one from corn, a monocot, and one from arabidopsis, a dicot, have been identified as orthologs. The known characteristics of the arabidopsis protein (seed protein, drought-responsive oxidase) can then be attributed to the corn protein.
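The extrapolation described above can be sketched as a small Python example. It is a deliberate simplification under stated assumptions: each protein is reduced to a set of domain or signature labels, the labels ("D1", "D2", and so on) and the 0.75 threshold are hypothetical, and the shared-fraction test merely stands in for whatever sequence comparison was actually used to establish orthology.

```python
def shared_signature_fraction(ref_domains: set, query_domains: set) -> float:
    """Fraction of the reference protein's signatures found in the query."""
    if not ref_domains:
        return 0.0
    return len(ref_domains & query_domains) / len(ref_domains)


def transfer_annotations(ref_annotations: set,
                         ref_domains: set,
                         query_domains: set,
                         min_fraction: float = 0.75) -> set:
    """Attribute the reference annotations to the query protein only when
    enough signatures are shared (threshold is an illustrative assumption)."""
    if shared_signature_fraction(ref_domains, query_domains) >= min_fraction:
        return set(ref_annotations)
    return set()


# Hypothetical signature sets for an arabidopsis protein and a corn protein.
arabidopsis_domains = {"D1", "D2", "D3", "D4", "D5"}
corn_domains = {"D1", "D2", "D3", "D4", "D6"}
arabidopsis_annotations = {"seed protein", "drought-responsive", "oxidase"}

corn_annotations = transfer_annotations(
    arabidopsis_annotations, arabidopsis_domains, corn_domains
)
print(sorted(corn_annotations))
```

Here four of the five reference signatures are shared, so the annotations are carried over; a query sharing only one or two signatures would receive none.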
C. Analyses Over Multiple Experiments to Reveal Gene Networks and Links Across Species
The genomics engine can identify networks or pathways of genes concerned with the same process and hence linked to the same phenotype(s). Genes specifying functions of the same pathway or of the same developmental or environmental response are frequently co-regulated, i.e., they are regulated by mechanisms that result in coincident increases or decreases for all gene members in the group. The Applicants have divided the genes of arabidopsis and maize into such co-regulated groups on the basis of their expression patterns, and the function of each group has been deduced. This process has provided considerable insight into the function and role of thousands of the plant genes in diverse species included in this application.
D. Applications of Applicants' Discoveries
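One way to picture the division of genes into co-regulated groups is the following Python sketch. It is not the Applicants' procedure: the gene names and expression values are invented for illustration, Pearson correlation is assumed as the similarity measure, and the greedy grouping against the first member of each group is a simple stand-in for a proper clustering method.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length expression profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        return 0.0  # a flat profile carries no co-regulation signal
    return cov / (sd_x * sd_y)


def group_coregulated(profiles, min_r=0.9):
    """Greedily place each gene in the first group whose founding member
    has a correlated profile; otherwise start a new group."""
    groups = []
    for gene, profile in profiles.items():
        for group in groups:
            if pearson(profile, profiles[group[0]]) >= min_r:
                group.append(gene)
                break
        else:
            groups.append([gene])
    return groups


# Invented expression levels across four experimental conditions.
profiles = {
    "geneA": [1, 2, 4, 8],
    "geneB": [2, 4, 8, 16],
    "geneC": [8, 4, 2, 1],
    "geneD": [9, 5, 2, 1],
}
print(group_coregulated(profiles))
```

In this toy data set geneA and geneB rise together while geneC and geneD fall together, so two groups result. The function of a group would then be deduced from the members whose roles are already known.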
It will be appreciated while reading the sections that the different experimental molecular genetic approaches focused on different aspects of the pathway from gene and gene product through to the properties of tissues, organs and whole organisms growing in specific environments. For each endogenous gene, these pathways are delineated within the existing biology of the species. However, Applicants' inventions allow gene components or products to be mixed and matched to create new genes and placed in other cellular contexts and species, to exhibit new combinations of functions and characteristics not found in nature, or to enhance and modify existing ones. For instance, gene components can be used to achieve expression of a specific protein in a new cell type to introduce new biochemical activities, cellular attributes or developmental and physiological processes. Such cell-specific targeting can be achieved by combining polynucleotides encoding proteins with any one of a large array of promoters to facilitate synthesis of proteins in a selective set of plant cells. This emphasizes that each gene, component and protein can be used to cause multiple and different phenotypic effects depending on the biological context. The utilities are therefore not limited to the existing in vivo roles of the genes, gene components, and gene products.
While the genes, gene components and products disclosed herein can act alone, combinations are useful to modify or modulate different traits. Useful combinations include different polynucleotides and/or gene components or products that have (1) an effect in the same or similar developmental or biochemical pathways; (2) similar biological activities; (3) similar transcription profiles; or (4) similar physiological consequences.
Of particular interest are the transcription factors and key factors in regulatory transduction pathways, which are able to control entire pathways, segments of pathways or large groups of functionally related genes. Therefore, manipulation of such proteins, alone or in combination, is especially useful for altering phenotypes or biochemical activities in plants. Because interactions exist between hormone, nutrition, and developmental pathways, combinations of genes and/or gene products from these pathways also are useful to produce more complex changes. In addition to using polynucleotides having similar transcription profiles and/or biological activities, useful combinations include polynucleotides that may exhibit different transcription profiles but which participate in common or overlapping pathways. Also, polynucleotides encoding selected enzymes can be combined in novel ways in a plant to create new metabolic pathways and hence new metabolic products.
The utilities of the various genes, gene components and products of the Application are described below in the sections entitled as follows:
    I. Organ Affecting Genes, Gene Components, Products (Including Differentiation Function)
        I.A. Root Genes, Gene Components And Products
            I.A.1. Root Genes, Gene Components And Products
            I.A.2. Root Hair Genes, Gene Components And Products
        I.B. Leaf Genes, Gene Components And Products
            I.B.1. Leaf Genes, Gene Components And Products
            I.B.2. Trichome Genes And Gene Components
            I.B.3. Chloroplast Genes And Gene Components
        I.C. Reproduction Genes, Gene Components And Products
            I.C.1. Reproduction Genes, Gene Components And Products
            I.C.2. Ovule Genes, Gene Components And Products
            I.C.3. Seed And Fruit Development Genes, Gene Components And Products
        I.D. Development Genes, Gene Components And Products
            I.D.1. Imbibition and Germination Responsive Genes, Gene Components And Products
            I.D.2. Early Seedling Phase Genes, Gene Components And Products
            I.D.3. Size and Stature Genes, Gene Components And Products
            I.D.4. Shoot-Apical Meristem Genes, Gene Components And Products
            I.D.5. Vegetative-Phase Specific Responsive Genes, Gene Components And Products
    II. Hormones Responsive Genes, Gene Components And Products
        II.A. Abscisic Acid Responsive Genes, Gene Components And Products
        II.B. Auxin Responsive Genes, Gene Components And Products
        II.C. Brassinosteroid Responsive Genes, Gene Components And Products
        II.D. Cytokinin Responsive Genes, Gene Components And Products
        II.E. Gibberellic Acid Responsive Genes, Gene Components And Products
    III. Metabolism Affecting Genes, Gene Components And Products
        III.A. Nitrogen Responsive Genes, Gene Components And Products
        III.B. Circadian Rhythm Responsive Genes, Gene Components And Products
        III.C. Blue Light (Phototropism) Responsive Genes, Gene Components And Products
        III.D. CO2 Responsive Genes, Gene Components And Products
        III.E. Mitochondria Electron Transport Genes, Gene Components And Products
        III.F. Protein Degradation Genes, Gene Components And Products
        III.G. Carotenogenesis Responsive Genes, Gene Components And Products
    IV. Viability Genes, Gene Components And Products
        IV.A. Viability Genes, Gene Components And Products
        IV.B. Histone Deacetylase (Axel) Responsive Genes, Gene Components And Products
    V. Stress Responsive Genes, Gene Components And Products
        V.A. Cold Responsive Genes, Gene Components And Products
        V.B. Heat Responsive Genes, Gene Components And Products
        V.C. Drought Responsive Genes, Gene Components And Products
        V.D. Wounding Responsive Genes, Gene Components And Products
        V.E. Methyl Jasmonate Responsive Genes, Gene Components And Products
        V.F. Reactive Oxygen Responsive Genes, Gene Components And H2O2 Products
        V.G. Salicylic Acid Responsive Genes, Gene Components And Products
        V.H. Nitric Oxide Responsive Genes, Gene Components And Products
        V.I. Osmotic Stress Responsive Genes, Gene Components And Products
        V.J. Aluminum Responsive Genes, Gene Components And Products
        V.K. Cadmium Responsive Genes, Gene Components And Products
        V.L. Disease Responsive Genes, Gene Components And Products
        V.M. Defense Responsive Genes, Gene Components And Products
        V.N. Iron Responsive Genes, Gene Components And Products
        V.O. Shade Responsive Genes, Gene Components And Products
        V.P. Sulfur Responsive Genes, Gene Components And Products
        V.Q. Zinc Responsive Genes, Gene Components And Products
    VI. Enhanced Foods
    VII. Pharmaceutical Products
    VIII. Precursors Of Industrial Scale Compounds
    IX. Promoters As Sentinels