The interface between plants and the environment plays a dual role as a protective barrier and as a medium for the exchange of gases, water and nutrients. The primary aerial plant surfaces (including leaves, stems, flowers, fruit) are covered by a cuticle, a protective layer which regulates water loss and protects the plant against the surrounding environment (e.g. pathogen damage, insect damage, mechanical damage, UV radiation, frost) (Sieber et al. 2000, Plant Cell 12, 721-737). The cuticle is a heterogeneous layer composed mainly of lipids, namely cutin and intracuticular wax, with epicuticular waxes deposited on the surface, and it has an important role in regulating epidermal permeability and non-stomatal water loss. Without the protective cuticle, transpiration of most land plants would be so rapid that death would result. Cuticle metabolism and the structure of the epidermal surfaces are, therefore, crucial factors in determining plant water management and in protecting plants from environmental stress, both abiotic stresses (such as drought, freezing, salinity, wind, metals, etc.) and biotic stresses (such as plant pathogens or insects). In addition, the cuticular layer has a role in normal plant development processes, including the prevention of post-genital organ fusion and pollen-pistil interactions, and it has been suggested that cuticle permeability in such processes also influences cell-to-cell communication by enhancing or attenuating the passage of signal molecules (Pruitt et al. 2000, PNAS USA 97, 1311-1316; Sieber et al. 2000, supra). Such signals could be, for example, required for organ adhesion (moving across the cuticle), or could mediate signaling between trichomes and stomata (moving within the developing epidermis) (Lolle et al., 1997, Dev. Biol. 189, 311-321; Krolikowski et al., 2003, Plant J. 35, 501-511).
As tolerance to biotic and abiotic stresses has a direct impact on plant productivity (yield and product quality), mechanisms for conferring or enhancing stress tolerance have been widely studied and various approaches for conferring environmental stress tolerance have been described in the art. One of the most serious abiotic stresses plants have to cope with world-wide is drought stress or dehydration stress. Four-tenths of the world's agricultural land lies in arid or semi-arid regions. In addition, plants grown in regions with relatively high precipitation may also suffer spells of drought during the growing season. Many agricultural regions, especially in developing countries, have consistently low rainfall and rely on irrigation to maintain yields. Water is scarce in many regions and its value will undoubtedly increase with global warming, resulting in an even greater need for drought tolerant crop plants, which maintain yield levels (or even have higher yields) and yield quality under low water availability. It has been estimated that the production of 1 kg of cotton requires about 15,000 litres of water in irrigated agriculture, while 1 kg of rice requires 4000 litres. Conferring or enhancing the tolerance of crop plants to short and long spells of drought and reducing the water requirement of crops grown in irrigated agriculture is clearly an important objective.
Although breeding (e.g. marker assisted) for drought tolerance is possible and is being pursued for a range of crop species (mainly cereals, such as maize, upland rice, wheat, sorghum, pearl millet, but also in other species such as cowpea, pigeon pea and Phaseolus bean), it is extremely difficult and tedious because drought tolerance or resistance is a complex trait, determined by the interaction of many loci and gene-environment interactions. Single, dominant genes, which confer or improve drought tolerance and which can be easily transferred into high yielding crop varieties and breeding lines are therefore sought after. Most water is lost through the leaves, by transpiration, and many transgenic approaches have focused on modifying the water loss through changing the leaves. For example WO00/73475 describes the expression of a C4 NADP+-malic enzyme from maize in tobacco epidermal cells and guard cells, which, according to the disclosure, increases water use efficiency of the plant by modulating stomatal aperture. Other approaches involve, for example, the expression of osmo-protectants, such as sugars (e.g. trehalose biosynthetic enzymes) in plants in order to increase water-stress tolerance, see e.g. WO99/46370. Yet other approaches have focused on changing the root architecture of plants.
To date another promising approach to enhance drought tolerance is the overexpression of CBF/DREB genes (DREB refers to dehydration response element binding; DRE binding), encoding various AP2/EREBP (ethylene response element binding protein) transcription factors (WO98/09521). Overexpression of the CBF/DREB1 proteins in Arabidopsis resulted in an increase in freezing tolerance (also referred to as freeze-induced dehydration tolerance) (Jaglo-Ottosen et al., Science 280, 104-106, 1998; Liu et al., Plant Cell 10, 1391-1406, 1998; Kasuga et al., Nat. Biotechnol. 17, 287-291, 1999; Gilmour et al. Plant Physiol. 124, 1854-1865, 2000) and enhanced the tolerance of the recombinant plants to dehydration caused either by water deficiency or exposure to high salinity (Liu et al., 1998, supra; Kasuga et al., 1999, supra). Another CBF transcription factor, CBF4, has been described to be a regulator of drought adaptation in Arabidopsis (Haake et al. 2002, Plant Physiology 130, 639-648).
Despite the availability of some genes which have been shown to enhance drought tolerance in a number of plant species, such as Brassicaceae and Solanaceae, there is a need for the identification of other genes with the ability to confer or improve drought tolerance when expressed in crop plants. In one embodiment, the present invention provides a new family of genes and proteins which fulfil this need.
Apart from the cuticle, which forms a protective layer between the leaves and the environment, plants form a range of other protective or cell-separating layers, such as “dehiscence zones” and suberin layers. Dehiscence zones are cell layers formed during cell wall separation processes, such as the abscission of leaves, flowers, fruits (e.g. pods or siliques) or in anther dehiscence. Members of the Brassicaceae produce fruits in the form of pods (siliques) in which the two carpel valves (ovary walls) are joined to the replum, a visible suture that divides the two carpels. The dehiscence zone is a layer of only one to three cells in width that extends along the entire length of the valve/replum boundary (Meakin and Roberts, 1990, J. Exp. Botany 41: 995-1002). As the cells in the dehiscence zone separate from one another, the valves detach from the replum, allowing seeds to be dispersed (often prematurely), which is referred to as podshatter or seedshatter. Premature shattering causes significant yield losses in Brassica species, such as Brassica napus (oilseed rape or “canola” if erucic acid and glucosinolate levels are below a certain threshold value). As breeding for shatter resistance is virtually impossible, due to lack of genetic variation in this trait, transgenic approaches are being explored in order to confer shatter resistance to pod-bearing plants, such as Brassica napus or soybean. To date such approaches involve for example a gene referred to as “indehiscent 1” (IND1), identified in Arabidopsis (see WO017951), MADS-Box genes AGL1, AGL5 and AGL8 (FUL) (WO99/00503), or the SGT10166 gene (WO0159122). One of the difficulties in transgenic podshatter approaches is that, while it is desired to prevent easy separation of the two pod valves, it must still remain possible to separate the valves in order to harvest the seeds.
Another dehiscence process in flowering plants is anther dehiscence, whereby the anther opens to release pollen grains into the environment. Two processes are believed to contribute to anther dehiscence, namely splitting of the anther wall which occurs at the stomium, a specialised group of cell types running the length of the anther, and the inversion of the anther walls which exposes the pollen. Splitting of the anther wall involves cell-to-cell separation at the stomium. Anther development and dehiscence involves many genes, see for an overview Goldberg et al., 1993 (The Plant Cell Vol. 5, 1217-1229). The reduction or prevention of pollen release from plants, or a change in the time point of pollen release, has significant benefits, such as the production of male sterile plants (useful, for example, for hybrid seed production, see WO9626283; Mariani et al. 1990, Nature 347, 737-741; Mariani et al. 1992, Nature 357, 384-387) or prevention (or reduction) of pollen release where this is undesirable, as for example because of risks of allergenicity or risks of releasing pollen of transgenic plants into the environment. Recombinant approaches used to date to confer male sterility involve for example the tissue specific expression of genes encoding cytotoxic proteins, such as the barnase gene (Mariani et al. 1990 and 1992, supra), leading to a selective destruction of specific cell types during anther development (e.g. the tapetum layer).
However, there is still a need to identify novel genes which are suitable to confer shatter resistance or male sterility to plants, especially to crop plants. In one embodiment, the present invention provides a new family of genes and proteins which fulfil this need.
As mentioned above, another protective layer formed in plants is the suberin layer, which is functionally related to the cutin layer and also prevents water loss from specific tissues, blocks pathogen invasion and strengthens the cell wall. Suberin is formed as a protective layer on underground plant cell surfaces such as the root endodermis and also as a strengthening component in cell walls, for example in the root as a Casparian strip in the cell wall of the root endodermis and in bundle sheath cells of grasses. It also covers the cork cells formed in tree bark and is deposited as scar tissue after wounding, for example as a protective layer after leaf abscission or on the surface of wounded potato tubers (Kolattukudy 1981, Ann. Rev. Plant Physiol.; Nawrath 2002, The biopolymers cutin and suberin, “The Arabidopsis Book”, Eds. C. R. Sommerville and E. M. Meyerowitz, American Society of Plant Biologists, Rockville, Md.). Similar to cutin, suberin consists of a complex mixture of fatty acids and further contains phenolic compounds, such as ferulic acid. Genes involved in suberization and which are useful in modifying suberin formation in plants are generally desirable, for example for improving wound healing properties of tubers or strengthening root formation.
The prior art shows that there is a continuous need for novel genes and methods which are useful for the modification of plant protective layers (epidermis and cuticle, suberin layers) and cell layers involved in cell-to-cell separation processes. The present invention provides a novel class of genes which influence the formation and metabolism of the interface between the plant surface and the environment (wounding sites, root cap cells and some organs at the epidermal layer) and of the interface between cells and cell layer above ground (e.g. dehiscence zones and abscission zones) or below ground (e.g. the endodermis). In addition, the present invention discloses how to use this class of genes to generate plants with novel phenotypes, especially drought tolerance or resistance, male sterility, seed shatter resistance, fruit (e.g. tomatoes) with more solid flesh and a higher concentration of soluble solids, plants (especially tubers) with improved wound healing properties or woody trees with enhanced suberization of cork cells.
General Definitions
The term “nucleic acid sequence” (or nucleic acid molecule) refers to a DNA or RNA molecule in single or double stranded form, particularly a DNA encoding a protein or protein fragment according to the invention. An “isolated nucleic acid sequence” refers to a nucleic acid sequence which is no longer in the natural environment from which it was isolated, e.g. the nucleic acid sequence in a bacterial host cell or in the plant nuclear or plastid genome.
The terms “protein” and “polypeptide” are used interchangeably and refer to molecules consisting of a chain of amino acids, without reference to a specific mode of action, size, three-dimensional structure or origin. A “fragment” or “portion” of a SHINE protein may thus still be referred to as a “protein”. An “isolated protein” is used to refer to a protein which is no longer in its natural environment, for example in vitro or in a recombinant bacterial or plant host cell.
The term “gene” means a DNA sequence comprising a region (transcribed region), which is transcribed into an RNA molecule (e.g. an mRNA) in a cell, operably linked to suitable regulatory regions (e.g. a promoter). A gene may thus comprise several operably linked sequences, such as a promoter, a 5′ leader sequence comprising e.g. sequences involved in translation initiation, a (protein) coding region (cDNA or genomic DNA) and a 3′non-translated sequence comprising e.g. transcription termination sites.
A “chimeric gene” (or recombinant gene) refers to any gene, which is not normally found in nature in a species, in particular a gene in which one or more parts of the nucleic acid sequence are present that are not associated with each other in nature. For example the promoter is not associated in nature with part or all of the transcribed region or with another regulatory region. The term “chimeric gene” is understood to include expression constructs in which a promoter or transcription regulatory sequence is operably linked to one or more coding sequences or to an antisense (reverse complement of the sense strand) or inverted repeat sequence (sense and antisense, whereby the RNA transcript forms double stranded RNA upon transcription).
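The relationship between a sense sequence, its antisense (reverse complement) and an inverted repeat can be illustrated with a minimal Python sketch. The function names, the example sequence and the spacer are hypothetical and chosen only for illustration; they are not sequences of the invention.

```python
# Illustrative sketch only: the sequences and names below are hypothetical.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the antisense strand (reverse complement) of a DNA sense strand."""
    return seq.translate(COMPLEMENT)[::-1]


def inverted_repeat(sense: str, spacer: str) -> str:
    """Join sense + spacer + antisense.

    An RNA transcribed from such a region can fold back on itself,
    pairing the sense and antisense arms into double stranded RNA.
    """
    return sense + spacer + reverse_complement(sense)


sense = "ATGGCTTCA"       # hypothetical short fragment of a target gene
spacer = "GGTACC"         # hypothetical spacer separating the two arms
antisense = reverse_complement(sense)
hairpin_region = inverted_repeat(sense, spacer)

print(antisense)
print(hairpin_region)
```

Applying `reverse_complement` twice returns the original sense sequence, which is a quick sanity check on the antisense construction.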
“Expression of a gene” refers to the process wherein a DNA region, which is operably linked to appropriate regulatory regions, particularly a promoter, is transcribed into an RNA, which is biologically active, i.e. which is capable of being translated into a biologically active protein or peptide (or active peptide fragment) or which is active itself (e.g. in posttranscriptional gene silencing or RNAi). An active protein in certain embodiments refers to a protein having a dominant-negative function due to a repressor domain being present. The coding sequence is preferably in sense-orientation and encodes a desired, biologically active protein or peptide, or an active peptide fragment. In gene silencing approaches, the DNA sequence is preferably present in the form of an antisense DNA or an inverted repeat DNA, comprising a short sequence of the target gene in antisense or in sense and antisense orientation. “Ectopic expression” refers to expression in a tissue in which the gene is normally not expressed.
A “transcription regulatory sequence” is herein defined as a nucleic acid sequence that is capable of regulating the rate of transcription of a (coding) sequence operably linked to the transcription regulatory sequence. A transcription regulatory sequence as herein defined will thus comprise all of the sequence elements necessary for initiation of transcription (promoter elements), for maintaining and for regulating transcription, including e.g. attenuators or enhancers. Although mostly the upstream (5′) transcription regulatory sequences of a coding sequence are referred to, regulatory sequences found downstream (3′) of a coding sequence are also encompassed by this definition.
As used herein, the term “promoter” refers to a nucleic acid fragment that functions to control the transcription of one or more genes, located upstream with respect to the direction of transcription of the transcription initiation site of the gene, and is structurally identified by the presence of a binding site for DNA-dependent RNA polymerase, transcription initiation sites and any other DNA sequences, including, but not limited to transcription factor binding sites, repressor and activator protein binding sites, and any other sequences of nucleotides known to one of skill in the art to act directly or indirectly to regulate the amount of transcription from the promoter. A “constitutive” promoter is a promoter that is active in most tissues under most physiological and developmental conditions. An “inducible” promoter is a promoter that is physiologically (e.g. by external application of certain compounds) or developmentally regulated. A “tissue specific” promoter is only active in specific types of tissues or cells.
As used herein, the term “operably linked” refers to a linkage of polynucleotide elements in a functional relationship. A nucleic acid is “operably linked” when it is placed into a functional relationship with another nucleic acid sequence. For instance, a promoter, or rather a transcription regulatory sequence, is operably linked to a coding sequence if it affects the transcription of the coding sequence. Operably linked means that the DNA sequences being linked are typically contiguous and, where necessary to join two protein encoding regions, contiguous and in reading frame so as to produce a “chimeric protein”. A “chimeric protein” or “hybrid protein” is a protein composed of various protein “domains” (or motifs), which is not found as such in nature but whose domains are joined to form a functional protein, which displays the functionality of the joined domains (for example DNA binding or repression leading to a dominant negative function). A chimeric protein may also be a fusion protein of two or more proteins occurring in nature. The term “domain” as used herein means any part(s) or domain(s) of the protein with a specific structure or function that can be transferred to another protein for providing a new hybrid protein with at least the functional characteristic of the domain. Specific domains can also be used to identify protein members belonging to the SHINE clade of transcription factors, such as SHINE orthologs from other plant species. Examples of domains found in SHINE proteins are the AP2 domain, the “mm” domain and the “cm” domain.
The term “target peptide” refers to amino acid sequences which target a protein to intracellular organelles such as plastids, preferably chloroplasts, mitochondria, or to the extracellular space (secretion signal peptide). A nucleic acid sequence encoding a target peptide may be fused (in frame) to the nucleic acid sequence encoding the amino terminal end (N-terminal end) of the protein.
A “nucleic acid construct” or “vector” is herein understood to mean a man-made nucleic acid molecule resulting from the use of recombinant DNA technology and which is used to deliver exogenous DNA into a host cell. The vector backbone may for example be a binary or superbinary vector (see e.g. U.S. Pat. No. 5,591,616, US2002138879 and WO9506722), a co-integrate vector or a T-DNA vector, as known in the art and as described elsewhere herein, into which a chimeric gene is integrated or, if a suitable transcription regulatory sequence is already present, only a desired nucleic acid sequence (e.g. a coding sequence, an antisense or an inverted repeat sequence) is integrated downstream of the transcription regulatory sequence. Vectors usually comprise further genetic elements to facilitate their use in molecular cloning, such as e.g. selectable markers, multiple cloning sites and the like (see below).
A “host cell” or a “recombinant host cell” or “transformed cell” are terms referring to a new individual cell (or organism) arising as a result of at least one nucleic acid molecule, especially comprising a chimeric gene encoding a desired protein or a nucleic acid sequence which upon transcription yields an antisense RNA or an inverted repeat RNA (or hairpin RNA) for silencing of a target gene/gene family, having been introduced into said cell. The host cell is preferably a plant cell or a bacterial cell. The host cell may contain the nucleic acid construct as an extra-chromosomally (episomal) replicating molecule, or more preferably, comprises the chimeric gene integrated in the nuclear or plastid genome of the host cell.
The term “selectable marker” is a term familiar to one of ordinary skill in the art and is used herein to describe any genetic entity which, when expressed, can be used to select for a cell or cells containing the selectable marker. Selectable marker gene products confer for example antibiotic resistance, or more preferably, herbicide resistance or another selectable trait such as a phenotypic trait (e.g. a change in pigmentation) or a nutritional requirement. The term “reporter” is mainly used to refer to visible markers, such as green fluorescent protein (GFP), eGFP, luciferase, GUS and the like.
The term “ortholog” of a gene or protein refers herein to the homologous gene or protein found in another species, which has the same function as the gene or protein, but has (usually) diverged in sequence since the time point at which the species harbouring the genes diverged (i.e. the genes evolved from a common ancestor by speciation). Orthologs of the Arabidopsis shn1, shn2 and shn3 genes may thus be identified in other plant species based on both sequence comparisons (e.g. based on percentage sequence identity over the entire sequence or over specific domains) and functional analysis.
The terms “homologous” and “heterologous” refer to the relationship between a nucleic acid or amino acid sequence and its host cell or organism, especially in the context of transgenic organisms. A homologous sequence is thus naturally found in the host species (e.g. a tomato plant transformed with a tomato gene), while a heterologous sequence is not naturally found in the host cell (e.g. a tomato plant transformed with a sequence from potato plants). Depending on the context, the term “homolog” or “homologous” may alternatively refer to sequences which are descendent from a common ancestral sequence (e.g. they may be orthologs).
“Stringent hybridisation conditions” can be used to identify nucleotide sequences, which are substantially identical to a given nucleotide sequence. Stringent conditions are sequence dependent and will be different in different circumstances. Generally, stringent conditions are selected to be about 5° C. lower than the thermal melting point (Tm) for the specific sequences at a defined ionic strength and pH. The Tm is the temperature (under defined ionic strength and pH) at which 50% of the target sequence hybridises to a perfectly matched probe. Typically stringent conditions will be chosen in which the salt concentration is about 0.02 molar at pH 7 and the temperature is at least 60° C. Lowering the salt concentration and/or increasing the temperature increases stringency. Stringent conditions for RNA-DNA hybridisations (Northern blots using a probe of e.g. 100 nt) are for example those which include at least one wash in 0.2×SSC at 63° C. for 20 min, or equivalent conditions. Stringent conditions for DNA-DNA hybridisation (Southern blots using a probe of e.g. 100 nt) are for example those which include at least one wash (usually 2) in 0.2×SSC at a temperature of at least 50° C., usually about 55° C., for 20 min, or equivalent conditions. See also Sambrook et al. (1989) and Sambrook and Russell (2001).
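The dependence of stringency on salt concentration and temperature can be illustrated with a rough numerical sketch. The empirical formula used here (often attributed to Meinkoth and Wahl, 1984) and the conversion of SSC strength to sodium ion concentration are assumptions introduced for illustration and are not part of the definition above; the probe GC content of 50% is likewise a hypothetical value.

```python
import math


def ssc_sodium_molar(fold: float) -> float:
    """Approximate Na+ molarity of an SSC buffer of the given strength.

    Assumes 1x SSC = 0.15 M NaCl + 0.015 M trisodium citrate,
    i.e. 0.15 + 3 * 0.015 = 0.195 M sodium ions.
    """
    return fold * (0.15 + 3 * 0.015)


def estimate_tm_celsius(na_molar: float, gc_percent: float,
                        length_nt: int, formamide_percent: float = 0.0) -> float:
    """Empirical Tm estimate for a DNA-DNA hybrid (assumed formula).

    Tm = 81.5 + 16.6*log10([Na+]) + 0.41*(%GC) - 0.61*(%formamide) - 500/L
    """
    return (81.5
            + 16.6 * math.log10(na_molar)
            + 0.41 * gc_percent
            - 0.61 * formamide_percent
            - 500.0 / length_nt)


# Hypothetical 100 nt probe with 50% GC, washed in 0.2x SSC versus 2x SSC.
tm_low_salt = estimate_tm_celsius(ssc_sodium_molar(0.2), 50.0, 100)
tm_high_salt = estimate_tm_celsius(ssc_sodium_molar(2.0), 50.0, 100)

# "About 5 degrees C below Tm" as a stringent wash temperature.
stringent_wash_low_salt = tm_low_salt - 5.0

print(f"Estimated Tm in 0.2x SSC: {tm_low_salt:.1f} C")
print(f"Estimated Tm in 2x SSC:   {tm_high_salt:.1f} C")
print(f"Tm - 5 C in 0.2x SSC:     {stringent_wash_low_salt:.1f} C")
```

Under these assumptions the estimated Tm falls as the salt concentration is lowered, so a wash at a fixed temperature lies closer to the Tm and is therefore more stringent, in line with the statement that lowering the salt concentration and/or increasing the temperature increases stringency.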
“Sequence identity” and “sequence similarity” can be determined by alignment of two peptide or two nucleotide sequences using global or local alignment algorithms. Sequences may then be referred to as “substantially identical” or “essentially similar” when they (when optimally aligned by for example the programs GAP or BESTFIT using default parameters) share at least a certain minimal percentage of sequence identity (as defined below). GAP uses the Needleman and Wunsch global alignment algorithm to align two sequences over their entire length, maximising the number of matches and minimising the number of gaps. Generally, the GAP default parameters are used, with a gap creation penalty=50 (nucleotides)/8 (proteins) and gap extension penalty=3 (nucleotides)/2 (proteins). For nucleotides the default scoring matrix used is nwsgapdna and for proteins the default scoring matrix is Blosum62 (Henikoff & Henikoff, 1992, PNAS 89, 915-919). Sequence alignments and scores for percentage sequence identity may be determined using computer programs, such as the GCG Wisconsin Package, Version 10.3, available from Accelrys Inc., 9685 Scranton Road, San Diego, Calif. 92121-3752 USA. Alternatively percent similarity or identity may be determined by searching against databases using algorithms such as FASTA, BLAST, etc.
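The principle of a global alignment followed by a percentage identity calculation can be sketched as follows. This is a simplified illustration and not the GAP program: it uses a single linear gap penalty and a plain match/mismatch score instead of the separate gap creation and extension penalties and the nwsgapdna or Blosum62 matrices named above, and all parameter values and example sequences are hypothetical.

```python
def needleman_wunsch(a: str, b: str, match: int = 1,
                     mismatch: int = -1, gap: int = -2):
    """Globally align a and b; return (aligned_a, aligned_b, score).

    Simplified Needleman-Wunsch with a linear gap penalty (assumption).
    """
    n, m = len(a), len(b)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap          # leading gaps in b
    for j in range(1, m + 1):
        score[0][j] = j * gap          # leading gaps in a

    def sub(i: int, j: int) -> int:
        return match if a[i - 1] == b[j - 1] else mismatch

    # Fill the dynamic programming matrix.
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(score[i - 1][j - 1] + sub(i, j),
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)

    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub(i, j):
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b)), score[n][m]


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Identical, non-gap columns as a percentage of all alignment columns."""
    same = sum(1 for x, y in zip(aligned_a, aligned_b) if x == y and x != "-")
    return 100.0 * same / len(aligned_a)


al_a, al_b, total = needleman_wunsch("GATTACA", "GATACA")
identity = percent_identity(al_a, al_b)
print(al_a)
print(al_b)
print(f"score={total}, identity={identity:.1f}%")
```

Note that the percentage depends on the chosen denominator (here the full alignment length including gap columns) and on the scoring parameters, which is why the definition above fixes the program and its default parameters.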
In this document and in its claims, the verb “to comprise” and its conjugations is used in its non-limiting sense to mean that items following the word are included, but items not specifically mentioned are not excluded. In addition, reference to an element by the indefinite article “a” or “an” does not exclude the possibility that more than one of the element is present, unless the context clearly requires that there be one and only one of the elements. The indefinite article “a” or “an” thus usually means “at least one”. It is further understood that, when referring to “sequences” herein, generally the actual physical molecules with a certain sequence of subunits (e.g. amino acids) are referred to.