This invention relates to nucleic acid molecules useful for producing plants having virus resistance characteristics, and to transgenic plants expressing these nucleic acid molecules.
Plants are hosts to thousands of infectious diseases caused by a vast array of phytopathogenic fungi, bacteria, viruses, and nematodes. These pathogens are responsible for significant crop losses worldwide, resulting from both infection of growing plants and destruction of harvested crops.
Plants recognize and resist many invading phytopathogens by inducing a rapid defense response, termed the hypersensitive response (HR). HR results in localized cell and tissue death at the site of infection, which constrains further spread of the infection. This local response often triggers non-specific resistance throughout the plant, a phenomenon known as systemic acquired resistance (SAR). Once triggered, SAR provides resistance for days to a wide range of pathogens. The generation of the HR and SAR in a plant depends upon the interaction between a dominant or semi-dominant resistance (R) gene product in the plant and a corresponding dominant avirulence (Avr) gene product expressed by the invading phytopathogen. It has been proposed that phytopathogen Avr products function as ligands, and that plant R products function as receptors. Thus, in the widely accepted model of phytopathogen/plant interaction, binding of the Avr product of an invading pathogen to a corresponding R product in the plant initiates the chain of events within the plant that produces HR and SAR and ultimately leads to disease resistance.
Since the cloning of the first R gene, Pto from tomato, which confers resistance to Pseudomonas syringae pv. tomato (Martin et al., 1993), a number of other R genes have been reported (Hammond-Kosack and Jones, 1997). Much effort is currently being directed towards using these genes to engineer pathogen resistance in plants. The production of transgenic plants carrying a heterologous gene sequence is now routinely practiced by plant molecular biologists. Methods for incorporating an isolated gene sequence into an expression cassette, producing plant transformation vectors, and transforming many types of plants are well known. Examples of the production of transgenic plants having modified characteristics as a result of the introduction of a heterologous transgene include: U.S. Pat. No. 5,719,046 to Guerineau (production of herbicide resistant plants by introduction of bacterial dihydropteroate synthase gene); U.S. Pat. No. 5,231,020 to Jorgensen (modification of flavonoids in plants); U.S. Pat. No. 5,583,021 to Dougherty (production of virus resistant plants); and U.S. Pat. No. 5,767,372 to De Greve and U.S. Pat. No. 5,500,365 to Fischoff (production of insect resistant plants by introducing Bacillus thuringiensis genes).
In conjunction with such techniques, the isolation of plant R genes has similarly permitted the production of plants having enhanced resistance to certain pathogens. A number of these genes have been used to introduce the encoded resistance characteristic into plant lines that were previously susceptible to the corresponding pathogen. For example, U.S. Pat. No. 5,571,706 to Baker describes the introduction of the N gene into tobacco lines that are susceptible to Tobacco Mosaic Virus (TMV) in order to produce TMV-resistant tobacco plants. WO 95/28423 describes the creation of transgenic plants carrying the Rps2 gene from Arabidopsis thaliana, as a means of creating resistance to bacterial pathogens including Pseudomonas syringae, and WO 98/02545 describes the introduction of the Prf gene into plants to obtain broad-spectrum pathogen resistance. Cao et al. (1998) describes the introduction into Arabidopsis of the NPR1 cDNA expressed under the control of the 35S promoter to produce enhanced resistance to multiple bacterial pathogens.
The first R gene conferring virus resistance to be isolated from plants was the N gene of Nicotiana glutinosa tobacco (Whitham et al., 1994). The N gene (or homologs of this gene) is present in some but not all types of tobacco, and confers resistance to Tobacco Mosaic Virus (TMV). TMV is an important pathogen not only of tobacco, but also of other crop plants including tomato (Lycopersicon sp.) and pepper (Capsicum sp.). A review of the wide range of species that serve as hosts to TMV is presented in Holmes (1946). TMV is the type virus of the genus Tobamovirus, which includes a number of closely related viral pathogens of commercially important plants. For example, the Tobamovirus group includes tomato mosaic virus, pepper green mottle virus and odontoglossum ringspot virus, which is a pathogen of orchids (Agrios, 1997).
The N. glutinosa N gene is described in detail in U.S. Pat. No. 5,571,706 (“Plant Virus Resistance Gene and Methods”) to Baker and Whitham, which is incorporated herein by reference. The sequence of this gene is available on GenBank under accession number U558886. U.S. Pat. No. 5,571,706 discloses the sequence of the N gene, as well as two cDNAs corresponding to the gene. The N gene (including the 5′ and 3′ regulatory regions) is over 12 kb in length and comprises five exons and four introns, encoding a full length N protein of 1144 amino acids, with a deduced molecular mass of 131.4 kDa. cDNA-N is a cDNA encoded by the N gene; it is approximately 3.7 kb in length and encodes the full length N protein. A second cDNA, cDNA-N-tr, is approximately 3.8 kb in length. It results from an alternative splicing pattern and encodes a truncated protein, N-tr, that is 652 amino acids in length and has a deduced molecular mass of 75.3 kDa. U.S. Pat. No. 5,571,706 and Whitham et al. (1994) describe the production of transgenic tobacco plants carrying a full-length N transgene; these plants show the HR response following TMV challenge.
The inventors have discovered that while the introduction of the full length N gene into a plant results in TMV resistance, introduction of the full length N cDNA (cDNA-N) does not. Neither, it has been discovered, does introduction of cDNA-N-tr or the combination of cDNA-N-tr and cDNA-N. In particular, while plants containing the cDNA sequences exhibit HR in response to a TMV infection, the virus spreads systemically throughout the plants, suggesting that the normal SAR is not triggered.
Use of the shorter cDNA sequences rather than the full gene sequence would be advantageous because the shorter length makes manipulating the sequence easier, and reduces the likelihood that errors will be introduced into the sequence either during laboratory manipulation, or in the plant transformation process. To that end, the inventors have produced a form of the cDNA that does produce TMV resistance when introduced into plants. In this context, TMV resistance refers to the ability of a plant to resist systemic spread of the virus.
The inventors have identified a critical intron region of the N gene that is required for TMV resistance. cDNA-N constructs including this intron region (termed cDNA-N/intron constructs) are able to confer TMV resistance on otherwise susceptible plants. The intron region that is required for a cDNA-N to confer TMV resistance is contained within intron 3 (I3) of the N gene, and includes the 70 base pair alternative exon (AE) that is included within cDNA-N-tr and encodes part of the N-tr protein.
The structural region of the N gene (the sequence of which is shown in Seq. ID No. 1) comprises a series of exons (E) and introns (I) that may be schematically illustrated as follows:
E1-I1-E2-I2-E3-I3-E4-I4-E5
cDNA-N comprises the structural N gene sequence with the introns omitted, and may therefore be represented as:
E1-E2-E3-E4-E5.
The inventors have discovered that inclusion of I3 in the cDNA-N sequence in its naturally occurring position (i.e., between E3 and E4) restores the ability to encode TMV resistance. Thus, one possible cDNA-N/intron construct that may be employed is represented as:
E1-E2-E3-I3-E4-E5 (SEQ ID NO:16)
As discussed in detail below, while inclusion of the entire I3 sequence in a cDNA-N/intron construct is effective to confer TMV resistance, less than the entire I3 sequence may be employed, provided that the 70 base pair AE sequence within I3 is retained and splice acceptor sites for the intron are included. Other sequences may also be included in such constructs, including other N gene introns. For example, one or more of introns I1, I2 and I4, or portions of such sequences, may be added to the construct. Possible combinations include:
E1-I1-E2-E3-I3-E4-E5 (SEQ ID NO:17)
E1-E2-I2-E3-I3-E4-E5 (SEQ ID NO:18)
E1-E2-E3-I3-E4-I4-E5 (SEQ ID NO:19)
E1-I1-E2-I2-E3-I3-E4-E5 (SEQ ID NO:20)
E1-I1-E2-E3-I3-E4-I4-E5 (SEQ ID NO:21)
E1-E2-I2-E3-I3-E4-I4-E5 (SEQ ID NO:22)
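The schematic layouts above follow a simple rule: I3 is always present between E3 and E4, and each of I1, I2 and I4 is optionally present in its natural position. As a minimal illustrative sketch (the function and variable names are hypothetical and not part of the disclosure), the layouts can be enumerated as follows. Note that the enumeration also yields the layout containing all four introns, which corresponds to the structural N gene itself rather than to one of the listed cDNA-N/intron combinations.

```python
from itertools import combinations

# Exon order of the structural N gene, as given in the schematic above.
EXONS = ["E1", "E2", "E3", "E4", "E5"]
REQUIRED_INTRON = "I3"               # always retained between E3 and E4
OPTIONAL_INTRONS = ["I1", "I2", "I4"]


def layout(introns):
    """Return the schematic layout for cDNA-N plus the given set of introns.

    Intron In, when present, is placed directly after exon En.
    """
    parts = []
    for i, exon in enumerate(EXONS, start=1):
        parts.append(exon)
        name = f"I{i}"
        if name in introns:
            parts.append(name)
    return "-".join(parts)


def cdna_intron_layouts():
    """Enumerate every layout that contains I3 plus any subset of I1, I2, I4."""
    layouts = []
    for k in range(len(OPTIONAL_INTRONS) + 1):
        for extra in combinations(OPTIONAL_INTRONS, k):
            layouts.append(layout({REQUIRED_INTRON, *extra}))
    return layouts


if __name__ == "__main__":
    for text in cdna_intron_layouts():
        print(text)
```

The sketch manipulates labels only; it does not assemble nucleotide sequences.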
In addition to the intron and exon sequences, such constructs require the presence of 5′ and 3′ regulatory regions. The N gene 5′ and 3′ regulatory regions may be employed for this purpose, and may be the most effective since they will confer regulatory control on the cDNA-N/intron constructs that is substantially similar to the regulatory control of N gene expression. Other 5′ regulatory regions such as the CaMV35S promoter sequence are well known in the art and may also be effective to confer TMV resistance. A number of 3′ regulatory regions may be employed, but not all such regions may be effective. For example, it is shown that a construct comprising cDNA-N/intron 3 operably linked at its 5′ end to the N gene promoter (pN) (included within a ca. 4.2 kb 5′ regulatory region) and at its 3′ end to a ca. 1.3 kb region of the N gene 3′ regulatory sequence (3′-GRS) confers resistance to TMV when introduced into otherwise susceptible tobacco plants. In contrast, the same construct in which the 3′-GRS sequence is replaced with the NOS 3′ regulatory region does not confer resistance.
Thus, in one embodiment, nucleic acid molecules produced by the inventors comprise cDNA-N with, positioned between the sequences corresponding to exons 3 and 4 of the N gene, the third intron (I3) of the N gene. In another embodiment, the nucleic acid molecule further comprises the 3′ regulatory sequence from the N gene (3′-GRS), which regulatory sequence is disclosed herein. While the entire ca. 1.3 kb of 3′-GRS may be employed, less than this entire sequence may also be used in such constructs in order to obtain TMV resistance. The nucleic acid molecule may further comprise the N promoter sequence (pN) contained within the ca. 4.2 kb 5′ regulatory region of the N gene, which is also disclosed herein. Again, while the entire ca. 4.2 kb sequence may be employed, less than this entire sequence may also be used in such constructs in order to obtain TMV resistance.
Introduction of the cDNA-N/intron constructs into plants may be used to confer resistance to plant viruses including TMV and other Tobamoviruses, such as tomato mosaic virus, pepper green mottle virus and odontoglossum ringspot virus. Suitable plant species for transformation with these constructs include solanaceous plants such as tobacco, tomato, potato and pepper, as well as other plant species, such as orchids, that are host to Tobamoviruses or other plant viruses. Transgenic plants that comprise the disclosed cDNA-N/intron constructs are encompassed by this invention.
Seq. ID No. 1 shows the nucleic acid sequence of the N. glutinosa N gene. The sequence comprises the following regions:
Seq. ID No. 2 shows the nucleic acid sequence of the N. glutinosa cDNA-N.
Seq. ID No. 3 shows the amino acid sequence of the N. glutinosa N protein.
Seq. ID No. 4 shows the nucleic acid sequence of the N. glutinosa cDNA-N-tr.
Seq. ID No. 5 shows the amino acid sequence of the N. glutinosa N-tr protein.
Seq. ID No. 6 shows the nucleic acid sequence of the N. glutinosa intron 3.
The alternative exon (AE) spans nucleotides 117-186.
Seq. ID No. 7 shows the nucleic acid sequence of the ca. 1.3 kb N. glutinosa 3′-GRS.
Seq. ID No. 8 shows the nucleic acid sequence of the ca. 4.2 kb N. glutinosa pN.
Seq. ID No. 9 shows the nucleic acid sequence of pN/cDNA-N/intron 3/3′-GRS.
Seq. ID No. 10-15 show primers that may be used to amplify N nucleic acids.
I. Definitions
Unless otherwise noted, technical terms are used according to conventional usage. Definitions of common terms in molecular biology may be found in Benjamin Lewin, Genes V, published by Oxford University Press, 1994 (ISBN 0-19-854287-9); Kendrew et al. (eds.), The Encyclopedia of Molecular Biology, published by Blackwell Science Ltd., 1994 (ISBN 0-632-02182-9); and Robert A. Meyers (ed.), Molecular Biology and Biotechnology: a Comprehensive Desk Reference, published by VCH Publishers, Inc., 1995 (ISBN 1-56081-569-8).
In order to facilitate review of the various embodiments of the invention, the following definitions of terms are provided:
N gene: A gene that encodes an N and an N-tr protein, and which when introduced into a plant enhances the resistance of that plant to TMV infection. The prototypical N gene is the gene isolated from N. glutinosa and disclosed in U.S. Pat. No. 5,571,706. The sequence of this gene, including 5′ and 3′ regulatory regions, is shown in Seq. ID No. 1. The ability of an N gene to confer TMV resistance may readily be determined by scoring the HR and SAR responses to TMV infection in transgenic plants, and by monitoring systemic spread of the virus, as disclosed below.
The disclosed N gene sequence is designated as the prototypical N gene since it is the first N gene to have been isolated. As discussed in U.S. Pat. No. 5,571,706, functional homologs of this gene from other plant species, such as from other Solanaceous species, may be obtained. Such homologs encode proteins having specified levels of sequence identity with the prototype N protein (e.g., at least 60% sequence identity), and retain N gene function, i.e., retain the ability to confer TMV resistance when introduced into plants. Similarly, the disclosed N and N-tr proteins are the prototypes of such proteins, and homologs of these proteins are encoded by N gene homologs. Accordingly, where reference is made herein to molecules relating to the N gene, for example, cDNA-N, N or N-tr proteins and introns of the N gene (such as I3), it will be understood that such reference includes not only the prototypical sequences of these molecules disclosed herein, but also corresponding sequences from N gene homologs. Also included within the scope of such terms are molecules that differ from the disclosed prototypical molecules by minor variations, such as nucleic acid molecules that vary from the disclosed sequences by virtue of the degeneracy of the genetic code, and nucleic acid sequences that have been modified to encode N or N-tr proteins having conservative amino acid substitutions. Such variant sequences may be produced by manipulating the nucleotide sequence of the tobacco cDNA-N or N gene using standard procedures such as site-directed mutagenesis or the polymerase chain reaction.
N tobacco: A tobacco line that carries at least one copy of an N gene. A plant that is homozygous for the N gene is designated NN, while a plant lacking a functional N gene is designated nn.
N protein/N-tr protein: Proteins encoded by an N gene. The N protein encoded by the prototypical N gene is shown in Seq. ID No. 3. The N-tr protein is a truncated form of the N protein and is encoded by an alternatively spliced form of the N gene; the prototypical sequence of N-tr is shown in Seq. ID No. 5. Expression of both forms of the protein in a plant cell is required for TMV resistance.
cDNA-N: A cDNA molecule that encodes an N protein. The nucleic acid sequence of the prototypical cDNA-N is shown in Seq. ID No. 2.
cDNA-N-tr: A cDNA molecule that encodes an N-tr protein. The nucleic acid sequence of the prototypical cDNA-N-tr is shown in Seq. ID No. 4.
cDNA-N/intron: A construct comprising a cDNA-N molecule and all or part of one or more N gene introns.
cDNA-N/intron 3: A construct comprising a cDNA-N molecule and all or part of an N gene intron 3 (I3) sequence (described in more detail below). The I3 sequence is typically situated in the cDNA at a position corresponding to the position of the intron in the N gene (i.e., between codons encoding Lys 616 and His 617 of the N protein).
N intron: An intron of an N gene. The prototypical N gene has four introns, I1, I2, I3 and I4. The sequences of these introns from the prototypical N gene are shown in Seq. ID No. 1. As discussed above, the invention may be practiced using these sequences or homologs of these sequences from N gene homologs, or variants on these sequences. The I3 intron is particularly relevant to the invention since it is the intron that is incorporated into cDNA-N/intron constructs to confer TMV resistance. While the entire I3 sequence as shown in Seq. ID No. 6 may be employed for this purpose, the biological activity of cDNA-N/intron constructs (i.e., enhancing TMV resistance) may also be obtained using less than the entire sequence. Reference to intron 3 (or I3) thus encompasses not only the entire intron 3 sequence of the prototypical N gene and its homologs and variants on this sequence, but also sequences that comprise less than the entire intron 3 sequence. At a minimum, the portion of the I3 sequence that is incorporated into cDNA-N/intron constructs is the alternative exon (AE) comprising nucleotides 117-186 of Seq. ID No. 6 and splice acceptor and donor sites. The two pairs of splice acceptor and donor sites for the AE within intron 3 comprise nucleotides 7200-7203 and 7316-7319, and 7386-7389 and 9018-9021, of Seq. ID No. 1. These sequences are quite similar to the consensus splice acceptor and donor sequences. In some other systems in which alternative splicing of exons has been reported, a cis acting sequence is required in addition to the splice acceptor and donor sites. For example, two cis elements (GAAGAAGA and CAAGG) within the fibronectin AE modulate the exclusion or inclusion of the AE (Caputi et al., 1994). Sequences similar to these are located within the intron 3 AE of N and will be included within any intron 3 construct that is employed.
TMV resistance may be obtained by including a greater portion of the I3 sequence, such as splice acceptor and donor sites together with nucleotides 100-200, 80-250, 50-300, 1-500 or 1-1000 of Seq. ID No. 6, or indeed the entire I3 sequence. As described in Example 2 below, the pN/cDNA-N/intron 3/3′-GRS construct depicted in Seq. ID No. 9 confers TMV resistance in transgenic plants. Thus, in the context of this construct, the I3 sequence may be said to be biologically active (i.e., the construct produced TMV resistance when introduced into plants). One of skill in the art will be able to ascertain whether a particular sub-region of an I3 confers biological activity by substituting such a sequence for the I3 sequence in the cDNA-N/intron 3/3′-GRS construct, introducing the resulting sequence into plants and assessing resultant TMV resistance by analyzing HR and SAR responses, or by determining systemic spread of the virus. Accordingly, the term “biologically active intron 3” refers to an intron 3 of an N gene, or a portion or variant of such an intron that, when incorporated into a pN/cDNA-N/intron 3/3′-GRS construct and introduced into a plant, results in TMV resistance.
3′-GRS: The 3′ regulatory sequence of an N gene. The 3′-GRS of the prototypical N gene from tobacco is depicted in Seq. ID No. 7. For incorporation into cDNA-N/intron constructs, the entire 3′-GRS sequence shown in Seq. ID No. 7 (ca. 1.3 kb), or less than the entire sequence, may be utilized. As described in Example 2 below, a construct comprising pN/cDNA-N/intron 3 operably linked to the ca. 1.3 kb 3′-GRS sequence (the sequence of which is depicted in Seq. ID No. 9) confers TMV resistance in transgenic tobacco plants. Thus, in the context of this construct, the 1.3 kb 3′-GRS sequence may be said to be biologically active (i.e., the construct produced TMV resistance), whereas the NOS 3′ regulatory sequence in the context of the same construct does not have biological activity. One of skill in the art will be able to ascertain whether a particular sub-region of the 3′-GRS confers biological activity by incorporating such sequences into a cDNA-N/intron 3 construct, introducing the resulting sequence into plants and assessing resultant TMV resistance by analyzing HR and SAR responses, or determining whether systemic virus spread occurs. For example, a 3′ regulatory sequence comprising nucleotides 1-100, 1-150, 1-200, 1-500 or 1-1000 of the sequence shown in Seq. ID No. 7 may be utilized in a cDNA-N/intron 3 construct, and the degree to which such a construct enhances TMV resistance ascertained by the methods described herein. In addition, 3′ regulatory sequences from N gene homologs may also be employed. Thus, the term “biologically active 3′-GRS” refers to a 3′ regulatory region of an N gene, or a part or a variant of such a region, that, when operably linked to the 3′ end of a pN/cDNA-N/intron 3 construct and introduced into a plant, results in TMV resistance.
pN: The promoter region of an N gene. The pN of the prototypical N gene is depicted in Seq. ID No. 8. For incorporation into cDNA-N/intron constructs, the entire pN sequence shown in Seq. ID No. 8 (ca. 4.2 kb), or less than the entire sequence, may be utilized. As described in Example 2 below, a construct comprising the ca. 4.2 kb pN operably linked to the cDNA-N/intron/3′-GRS sequence (the sequence of which construct is depicted in Seq. ID No. 9) confers TMV resistance in transgenic tobacco plants. Thus, in the context of this construct, the ca. 4.2 kb pN sequence may be said to be biologically active (i.e., the construct produced TMV resistance). One of skill in the art will be able to ascertain whether a particular sub-region of pN confers biological activity by incorporating such sequences into a cDNA-N/intron 3/3′-GRS construct, introducing the resulting sequence into plants and assessing resultant TMV resistance by analyzing HR and SAR responses, or determining whether systemic virus spread occurs. For example, a 5′ regulatory sequence comprising nucleotides 4000-4281, 3500-4281, 2500-4281 or 2000-4281 of the sequence shown in Seq. ID No. 8 may be utilized in a cDNA-N/intron 3 construct, and the degree to which such a construct enhances TMV resistance ascertained by the methods described herein. In addition, 5′ regulatory sequences from N gene homologs may also be employed. Thus, the term “biologically active pN” refers to a 5′ regulatory region of an N gene, or a part or a variant of such a region, that, when operably linked to the 5′ end of a cDNA-N/intron 3/3′-GRS construct and introduced into a plant, results in TMV resistance.
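The nucleotide ranges quoted in the definitions of N intron, 3′-GRS and pN read as 1-based, inclusive coordinates; for example, nucleotides 117-186 of Seq. ID No. 6 give 186 - 117 + 1 = 70 positions, matching the 70 base pair AE. The following is a minimal sketch of extracting such sub-regions and checking that each candidate I3 sub-region contains the AE. The helper names are hypothetical, and the sequence used is a placeholder string, not the sequence of Seq. ID No. 6.

```python
def subregion(seq, start, end):
    """Return nucleotides start..end of seq (1-based, inclusive coordinates)."""
    if not (1 <= start <= end <= len(seq)):
        raise ValueError(
            f"range {start}-{end} is outside a sequence of length {len(seq)}"
        )
    return seq[start - 1:end]


def contains(outer, inner):
    """True if the 1-based inclusive range `outer` fully covers `inner`."""
    return outer[0] <= inner[0] and inner[1] <= outer[1]


# Hypothetical placeholder standing in for intron 3; NOT the real sequence.
placeholder_i3 = "ACGT" * 300  # 1200 nt of dummy sequence

AE_RANGE = (117, 186)  # alternative exon within Seq. ID No. 6
CANDIDATE_I3_RANGES = [(100, 200), (80, 250), (50, 300), (1, 500), (1, 1000)]

ae = subregion(placeholder_i3, *AE_RANGE)

if __name__ == "__main__":
    print("AE length:", len(ae))
    for rng in CANDIDATE_I3_RANGES:
        piece = subregion(placeholder_i3, *rng)
        print(rng, "length", len(piece), "contains AE:", contains(rng, AE_RANGE))
```

Whether a given sub-region is biologically active can only be established by the plant assays described in the text; the sketch checks coordinates only.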
N exon: An exon of an N gene. The prototypical N gene has five exons, E1, E2, E3, E4, and E5.
Sequence identity: The similarity between two nucleic acid sequences, or two amino acid sequences, is expressed in terms of the similarity between the sequences, otherwise referred to as sequence identity. Sequence identity is frequently measured in terms of percentage identity (or similarity or homology); the higher the percentage, the more similar the two sequences are. Homologs of the prototype N and N-tr proteins will possess a relatively high degree of sequence identity when aligned using standard methods.
Methods of alignment of sequences for comparison are well known in the art. Various programs and alignment algorithms are described in: Smith and Waterman (1981); Needleman and Wunsch (1970); Pearson and Lipman (1988); Higgins and Sharp (1988); Higgins and Sharp (1989); Corpet et al. (1988); Huang et al. (1992); and Pearson et al. (1994). Altschul et al. (1994) presents a detailed consideration of sequence alignment methods and homology calculations.
The NCBI Basic Local Alignment Search Tool (BLAST) (Altschul et al., 1990) is available from several sources, including the National Center for Biotechnology Information (NCBI, Bethesda, Md.) and on the Internet, for use in connection with the sequence analysis programs blastp, blastn, blastx, tblastn and tblastx.
Homologs of the disclosed prototype N and N-tr protein are typically characterized by possession of at least 60% sequence identity counted over the full length alignment with the amino acid sequence of the prototype using the NCBI Blast 2.0, gapped blastp set to default parameters. Proteins with even greater similarity to the reference sequences will show increasing percentage identities when assessed by this method, such as at least 70%, at least 75%, at least 80%, at least 90% or at least 95% of sequence identity. When less than the entire sequence is being compared for sequence identity, homologs will typically possess at least 75% sequence identity over short windows of 10-20 amino acids, and may possess sequence identities of at least 85% or at least 90% or 95% depending on their similarity to the reference sequence. Methods for determining sequence identity over such short windows are described at the NCBI Internet site. One of skill in the art will appreciate that these sequence identity ranges are provided for guidance only; it is entirely possible that strongly significant homologs could be obtained that fall outside of the ranges provided.
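The percentage identities discussed above are obtained from an alignment. As a minimal illustrative sketch, the following counts identical columns in two sequences that have already been aligned, both over the full alignment and over short sliding windows. This is not the BLAST algorithm and performs no alignment itself; the function names, the gap character and the toy peptide strings are assumptions made for illustration only.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of alignment columns holding the same residue in both sequences.

    Both inputs must be pre-aligned strings of equal length; columns where
    either sequence has a gap are counted as mismatches.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    matches = sum(
        1 for x, y in zip(aligned_a, aligned_b) if x == y and x != gap
    )
    return 100.0 * matches / len(aligned_a)


def window_identities(aligned_a, aligned_b, window=20):
    """Percent identity for every window of `window` columns, sliding by one."""
    return [
        percent_identity(aligned_a[i:i + window], aligned_b[i:i + window])
        for i in range(len(aligned_a) - window + 1)
    ]


if __name__ == "__main__":
    # Toy peptides differing at a single position (hypothetical data).
    a = "MKTAYIAKQR"
    b = "MKTAYLAKQR"
    print(percent_identity(a, b))
    print(window_identities(a, b, window=5))
```

A homolog threshold such as "at least 60%" would then be a simple comparison against the returned value, bearing in mind that the figures in the text are defined with respect to gapped blastp alignments.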
Nucleic acid hybridization: Another indication that two nucleic acid sequences share a high degree of similarity, for example, 50% or greater, is that the two molecules hybridize to each other under defined hybridization conditions. The defined hybridization conditions may be more or less stringent, with more stringent conditions allowing less non-specific binding, i.e., binding between pairs of nucleic acid probes that are not perfectly matched.
Because the degree to which two nucleic acids will bind is dependent upon their sequences, stringency is sequence dependent. Generally, stringency of hybridization is expressed with reference to the temperature under which the wash step is carried out. Generally, such wash temperatures are selected to be about 5° C. to 20° C. lower than the thermal melting point (Tm) for the specific sequence at a defined ionic strength and pH. The Tm is the temperature (under defined ionic strength and pH) at which 50% of the target sequence hybridizes to a perfectly matched probe. An equation for calculating Tm and conditions for nucleic acid hybridization are well known and can be found in Sambrook et al., 1989, Molecular Cloning: A Laboratory Manual, 2nd ed., vol. 1-3, Cold Spring Harbor, N.Y.; specifically see volume 2, chapter 9.
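The relationship between Tm and wash temperature described above can be sketched numerically. The text does not reproduce a Tm equation, so the formula used below is an assumption: one commonly quoted empirical approximation for DNA duplexes, whose constants vary between references. The function names and example values are likewise hypothetical. Only the 5° C. to 20° C. offset below Tm is taken from the text.

```python
import math


def approx_tm_celsius(gc_fraction, length_nt, sodium_molar):
    """Approximate Tm of a DNA duplex in degrees Celsius.

    ASSUMPTION: uses the widely quoted empirical form
        Tm = 81.5 + 16.6*log10([Na+]) + 0.41*(%GC) - 600/N
    which is not stated in the source text; constants differ between references.
    """
    if not 0.0 <= gc_fraction <= 1.0:
        raise ValueError("gc_fraction must be between 0 and 1")
    percent_gc = 100.0 * gc_fraction
    return (
        81.5
        + 16.6 * math.log10(sodium_molar)
        + 0.41 * percent_gc
        - 600.0 / length_nt
    )


def wash_temperature_window(tm_celsius, low_offset=5.0, high_offset=20.0):
    """Return (lowest, highest) wash temperature, i.e. Tm - 20 to Tm - 5."""
    return (tm_celsius - high_offset, tm_celsius - low_offset)


if __name__ == "__main__":
    # Hypothetical probe: 50% GC, 1000 nt, 0.2 M sodium.
    tm = approx_tm_celsius(gc_fraction=0.5, length_nt=1000, sodium_molar=0.2)
    low, high = wash_temperature_window(tm)
    print(f"approximate Tm {tm:.1f} C, wash between {low:.1f} C and {high:.1f} C")
```

Under this approximation, lowering the salt concentration lowers Tm, which is one way of seeing why washes at low SSC concentrations are the more stringent ones.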
Conditions for hybridization between nucleotides of the present invention (e.g., between two nucleotides showing substantial similarity) include wash conditions of 70° C. and about 0.2×SSC for 1 hour, or alternatively, 65° C., 60° C., or 55° C. and about 0.2 to 2×SSC (with, for instance, about 0.1% SDS) for 1 hour. Typically, blocking reagents are used to block non-specific hybridization. Such blocking reagents include, for instance, salmon sperm DNA. Hybridization, particularly under highly stringent conditions (e.g., wash temperatures of 60° C. or more and SSC concentrations of 0.2×) is suggestive of evolutionary similarity between the nucleotides. Such similarity (whether produced by convergent or divergent evolution) is strongly indicative of a similar role for the nucleotides and their resultant proteins.
Nucleic acid sequences that do not show a high degree of identity may nevertheless encode similar amino acid sequences due to the degeneracy of the genetic code. It is understood that changes in nucleic acid sequence can be made using this degeneracy to produce multiple nucleic acid sequences that all encode substantially the same protein.
Oligonucleotide: A linear polynucleotide sequence of up to about 100 nucleotide bases in length.
Vector: A nucleic acid molecule as introduced into a host cell, thereby producing a transformed host cell. A vector may include nucleic acid sequences that permit it to replicate in a host cell, such as an origin of replication. A vector may also include one or more selectable marker genes and other genetic elements known in the art.
Transformed: A transformed cell is a cell into which has been introduced a nucleic acid molecule by molecular biology techniques. As used herein, the term transformation encompasses all techniques by which a nucleic acid molecule might be introduced into such a cell, including transfection with viral vectors, transformation with plasmid vectors, and introduction of naked DNA by electroporation, lipofection, and particle gun acceleration.
Isolated: An “isolated” biological component (such as a nucleic acid or protein or organelle) has been substantially separated or purified away from other biological components in the cell of the organism in which the component naturally occurs, i.e., other chromosomal and extra-chromosomal DNA and RNA, proteins and organelles. Nucleic acids and proteins that have been “isolated” include nucleic acids and proteins purified by standard purification methods. The term also embraces nucleic acids and proteins prepared by recombinant expression in a host cell as well as chemically synthesized nucleic acids.
Purified: The term purified does not require absolute purity; rather, it is intended as a relative term. Thus, for example, a purified N protein preparation is one in which the N protein is more enriched than the protein is in its natural environment within a plant cell. Generally, a preparation of N protein is purified such that the N protein represents at least 5% of the total protein content of the preparation. For particular applications, higher purity may be desired, such that preparations in which the N protein represents at least 20% or at least 50% of the total protein content may be employed.
Ortholog: Two nucleotide or amino acid sequences are orthologs of each other if they share a common ancestral sequence and diverged when a species carrying that ancestral sequence split into two species. Orthologous sequences are also homologous sequences.
Operably linked: A first nucleic acid sequence is operably linked with a second nucleic acid sequence when the first nucleic acid sequence is placed in a functional relationship with the second nucleic acid sequence. For instance, a promoter is operably linked to a coding sequence if the promoter affects the transcription or expression of the coding sequence. Generally, operably linked DNA sequences are contiguous and, where necessary to join two protein-coding regions, in the same reading frame.
Recombinant: A recombinant nucleic acid is one that has a sequence that is not naturally occurring or has a sequence that is made by an artificial combination of two otherwise separated segments of sequence. This artificial combination is often accomplished by chemical synthesis or, more commonly, by the artificial manipulation of isolated segments of nucleic acids, e.g., by genetic engineering techniques.
cDNA (complementary DNA): A piece of DNA lacking internal, non-coding segments (introns) and regulatory sequences that determine transcription. cDNA is synthesized in the laboratory by reverse transcription from messenger RNA extracted from cells.
ORF (open reading frame): A series of nucleotide triplets (codons) coding for amino acids without any termination codons. These sequences are usually translatable into a peptide.
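The ORF definition above can be expressed as a simple check: a sequence read in triplets that contains no termination codon. The following minimal sketch assumes the standard genetic code stop codons (TAA, TAG, TGA) and DNA input; the function names and toy sequences are hypothetical.

```python
# Termination codons of the standard genetic code, written as DNA.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def codons(seq):
    """Split seq into complete triplets, ignoring any trailing 1-2 bases."""
    seq = seq.upper()
    usable = len(seq) - len(seq) % 3
    return [seq[i:i + 3] for i in range(0, usable, 3)]


def is_open_reading_frame(seq):
    """True if seq is a whole number of codons and none is a stop codon."""
    return (
        len(seq) > 0
        and len(seq) % 3 == 0
        and not any(c in STOP_CODONS for c in codons(seq))
    )


def longest_open_run(seq, frame=0):
    """Return the longest stop-free stretch of codons in the given frame."""
    best, current = [], []
    for codon in codons(seq[frame:]):
        if codon in STOP_CODONS:
            current = []          # a stop codon closes the current run
        else:
            current.append(codon)
            if len(current) > len(best):
                best = list(current)
    return "".join(best)


if __name__ == "__main__":
    print(is_open_reading_frame("ATGGCCAAA"))
    print(is_open_reading_frame("ATGTAAAAA"))
    print(longest_open_run("ATGTAAGCCGGGTGA"))
```

The check follows the definition literally and does not require a start codon.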
Transgenic plant: As used herein, this term refers to a plant that contains recombinant genetic material not normally found in plants of this type and which has been introduced into the plant in question (or into progenitors of the plant) by human manipulation. Thus, a plant that is grown from a plant cell into which recombinant DNA is introduced by transformation is a transgenic plant, as are all offspring of that plant that contain the introduced transgene (whether produced sexually or asexually).
II. Production of cDNA-Intron Constructs
The prototypical cDNA-N sequence may be amplified by the polymerase chain reaction (PCR) from a suitable cDNA library (e.g., one produced from TMV-infected N tobacco plants) or directly from TMV-infected N tobacco plant cells by reverse transcription PCR (RT-PCR). The pN, I3 and 3′ regulatory sequences of this N gene may similarly be amplified directly from N tobacco genomic DNA, or from a genomic library of N tobacco. Methods and conditions for both direct PCR and RT-PCR are known in the art and are described in Innis et al. (1990).
The selection of PCR primers will be made according to the portions of cDNA-N (or the N gene) that are to be amplified. Primers may be chosen to amplify small segments of the cDNA, the open reading frame, all or part of the intron 3 sequence, all or part of the 1.3 kb 3′ regulatory sequence, all or part of the 4.2 kb 5′ regulatory sequence, all or part of the cDNA molecule or all or part of the N gene sequence. Variations in amplification conditions may be required to accommodate primers of differing lengths; such considerations are well known in the art and are discussed in Innis et al. (1990), Sambrook et al. (1989), and Ausubel et al. (1992). By way of example only, the cDNA-N molecule as shown in Seq. ID No. 2 may be amplified using the following combination of primers:
Primer 1 5′ GGCACGAGATTTTTTCACATACAG 3′ (Seq. ID No. 10)
Primer 2 5′ AAGTAATATAGAGATGTTATTAC 3′ (Seq. ID No. 11)
The open reading frame portion of cDNA-N may be amplified using the following primer pair:
Primer 3 5′ ATGGCATCTTCTTCTTCTTCTTCTAGATGG 3′ (Seq. ID No. 12)
Primer 4 5′ CCCATTGATGAGCTCATAAAAGGAAGTTCT 3′ (Seq. ID No. 13)
And the I3 sequence of the N gene may be amplified with the following primer pair:
Primer 5 5′ GTACAATAGCTTGAATTCTATTTTGTTG 3′ (Seq. ID No. 14)
Primer 6 5′ CTGTTTAGAACACAGACAGAATGAGAA 3′ (Seq. ID No. 15)
These primers are illustrative only; it will be appreciated by one skilled in the art that many different primers may be derived from the cDNA-N and N gene sequences in order to amplify particular regions of these molecules. Resequencing of PCR products obtained by these amplification procedures is recommended; this will facilitate confirmation of the amplified sequence and will also provide information on natural variation in the sequences in different ecotypes and plant populations.
PCR primers may also be designed having terminal restriction endonuclease sites to facilitate cloning of amplified products. Incorporation of the I3 sequence into the amplified cDNA-N may be achieved by making use of restriction sites within cDNA-N as described in Example 2 below. Similarly, regulatory sequences such as pN and 3′-GRS may be incorporated into the constructs using standard molecular biology techniques.
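The illustrative primers listed above can be given a quick sanity check in a few lines of Python. The following is a minimal sketch and not part of the disclosed method: the function names are hypothetical, and the melting temperature uses the Wallace rule of thumb (2 °C per A/T, 4 °C per G/C), which gives only a rough estimate, particularly for oligonucleotides longer than about 20 nucleotides.

```python
# Illustrative helpers only; the function names are hypothetical.

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def gc_fraction(seq: str) -> float:
    """Fraction of bases that are G or C."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def wallace_tm(seq: str) -> int:
    """Rough melting temperature in degrees C: 2*(A+T) + 4*(G+C)."""
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


# Primer sequences as listed in the text (Seq. ID Nos. 10 to 15).
PRIMERS = {
    "Primer 1": "GGCACGAGATTTTTTCACATACAG",
    "Primer 2": "AAGTAATATAGAGATGTTATTAC",
    "Primer 3": "ATGGCATCTTCTTCTTCTTCTTCTAGATGG",
    "Primer 4": "CCCATTGATGAGCTCATAAAAGGAAGTTCT",
    "Primer 5": "GTACAATAGCTTGAATTCTATTTTGTTG",
    "Primer 6": "CTGTTTAGAACACAGACAGAATGAGAA",
}

for name, seq in PRIMERS.items():
    print(f"{name}: length={len(seq)} "
          f"GC={gc_fraction(seq):.2f} Tm~{wallace_tm(seq)} C")
```

A large difference in estimated melting temperature between the two primers of a pair is one signal that the annealing conditions may need adjustment, as noted above for primers of differing lengths.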
III. Obtaining N Homologs and Sequence Variants
The description of methods for producing cDNA-N/intron constructs above uses the example of the prototypical N gene and cDNA sequences. As discussed above, the terms N gene, cDNA-N, cDNA-N-tr, pN, I3 and 3′-GRS, and N and N-tr encompass not only the prototypical forms of these molecules but also homologs and variants that differ in exact sequence from the disclosed prototype sequences.
Homologs of the N gene are present in a number of plant species including tomato and other varieties of tobacco. Such homologs may also be used to produce cDNA-N/intron constructs. As described above, homologs of the disclosed N gene confer TMV resistance when introduced into otherwise susceptible plants and encode N and N-tr that are typically characterized by possession of at least 60% sequence identity counted over the full length alignment with the amino acid sequence of the prototype N and N-tr sequences using the NCBI Blast 2.0, gapped blastp set to default parameters. Proteins with even greater similarity to the reference sequences will show increasing percentage identities when assessed by this method, such as at least 70%, at least 75%, at least 80%, at least 90% or at least 95% sequence identity.
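The percent identity thresholds above are reported by BLAST itself once it has aligned the two sequences. Purely to illustrate the counting step, the following minimal sketch computes percent identity from a pair of sequences that have already been aligned. It does not perform an alignment and is not a substitute for gapped blastp; the function name and the toy sequences are hypothetical, and dividing by the full alignment length (gap columns included) is an assumption about the convention.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of alignment columns holding the same residue in both rows.

    Both inputs must be rows of one alignment (equal length, '-' for gaps).
    Gap columns count toward the length but never count as matches.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    matches = sum(
        1 for a, b in zip(aligned_a, aligned_b) if a == b and a != "-"
    )
    return 100.0 * matches / len(aligned_a)


# Hypothetical toy alignment, for illustration only.
reference_row = "MASSSSSSRW"
candidate_row = "MASS-SSTRW"

identity = percent_identity(reference_row, candidate_row)
print(f"{identity:.1f}% identity; meets 60% threshold: {identity >= 60.0}")
```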
Both conventional hybridization and PCR amplification procedures may be utilized to clone sequences encoding N and N-tr homologs. Common to both of these techniques is the hybridization of probes or primers derived from the prototype cDNA-N or N gene sequence to a target nucleotide preparation, which may be, in the case of conventional hybridization approaches, a cDNA or genomic library or, in the case of PCR amplification, a cDNA or genomic library, or an mRNA preparation. Amplification of, or hybridization to, a cDNA library in order to obtain N homologs should preferably be performed on a cDNA library made from a plant infected with TMV so that the N homolog is actively expressed in the cells from which the library is made.
Direct PCR amplification may be performed on cDNA or genomic libraries prepared from the plant species in question, or RT-PCR may be performed using mRNA extracted from the plant cells using standard methods. PCR primers will comprise at least 15 consecutive nucleotides of the tobacco cDNA-N or N gene. One of skill in the art will appreciate that sequence differences between the tobacco cDNA-N or N gene and the target nucleic acid to be amplified may result in lower amplification efficiencies. To compensate for this, longer PCR primers or lower annealing temperatures may be used during the amplification cycle. Where lower annealing temperatures are used, sequential rounds of amplification using nested primer pairs may be necessary to enhance specificity.
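Whether a candidate primer comprises at least 15 consecutive nucleotides of a reference sequence can be checked mechanically. The sketch below is a minimal illustration with hypothetical function names, and the reference string is a placeholder built around Primer 3 rather than the disclosed N gene sequence. It compares one strand only, so a reverse primer would first need to be reverse-complemented.

```python
def longest_shared_run(primer: str, reference: str) -> int:
    """Length of the longest contiguous stretch of primer found in reference."""
    primer, reference = primer.upper(), reference.upper()
    best = 0
    for start in range(len(primer)):
        # Only look for runs longer than the best one found so far.
        end = start + best + 1
        while end <= len(primer) and primer[start:end] in reference:
            best = end - start
            end += 1
    return best


def meets_minimum(primer: str, reference: str, minimum: int = 15) -> bool:
    """True if primer shares at least `minimum` consecutive bases with reference."""
    return longest_shared_run(primer, reference) >= minimum


# Primer 3 from the text; the flanking bases of the reference are placeholders.
primer_3 = "ATGGCATCTTCTTCTTCTTCTTCTAGATGG"
placeholder_reference = "CC" + primer_3 + "AGC"

print(longest_shared_run(primer_3, placeholder_reference))
print(meets_minimum(primer_3, placeholder_reference))
```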
For conventional hybridization techniques the hybridization probe is preferably conjugated with a detectable label such as a radioactive label, and the probe is preferably of at least 20 nucleotides in length. As is well known in the art, increasing the length of hybridization probes tends to give enhanced specificity. The labeled probe derived from the tobacco cDNA-N or N gene sequence may be hybridized to a plant cDNA or genomic library and the hybridization signal detected using means known in the art. The hybridizing colony or plaque (depending on the type of library used) is then purified and the cloned sequence contained in that colony or plaque isolated and characterized.
Homologs of the tobacco cDNA-N or N gene may alternatively be obtained by immunoscreening of an expression library. With the provision of the disclosed N gene and encoded proteins, the N or N-tr proteins may be expressed and purified in a heterologous expression system (e.g., E. coli) and used to raise antibodies (monoclonal or polyclonal) specific for the N or N-tr protein. Antibodies may also be raised against synthetic peptides derived from the tobacco N or N-tr amino acid sequences. Methods of raising antibodies are well known in the art and are described in Harlow and Lane (1988). Such antibodies can then be used to screen an expression cDNA library produced from the plant from which it is desired to clone the N homolog using routine methods. The selected cDNAs can be confirmed by sequencing and enzyme activity.
Variant N and N-tr proteins include proteins that differ in amino acid sequence from the prototypical N and N-tr sequences. Such proteins may be produced by manipulating the nucleotide sequence of the prototype N cDNAs or N gene using standard procedures such as site-directed mutagenesis or the polymerase chain reaction. The simplest modifications involve the substitution of one or more amino acids for amino acids having similar biochemical properties. These so-called conservative substitutions are likely to have minimal impact on the activity of the resultant protein. Table 1 shows amino acids that may be substituted for an original amino acid in a protein and which are regarded as conservative substitutions.
More substantial changes in biological function or other features may be obtained by selecting substitutions that are less conservative than those in Table 1, i.e., selecting residues that differ more significantly in their effect on maintaining (a) the structure of the polypeptide backbone in the area of the substitution, for example, as a sheet or helical conformation, (b) the charge or hydrophobicity of the molecule at the target site, or (c) the bulk of the side chain. The substitutions which in general are expected to produce the greatest changes in protein properties will be those in which (a) a hydrophilic residue, e.g., seryl or threonyl, is substituted for (or by) a hydrophobic residue, e.g., leucyl, isoleucyl, phenylalanyl, valyl or alanyl; (b) a cysteine or proline is substituted for (or by) any other residue; (c) a residue having an electropositive side chain, e.g., lysyl, arginyl, or histidyl, is substituted for (or by) an electronegative residue, e.g., glutamyl or aspartyl; or (d) a residue having a bulky side chain, e.g., phenylalanine, is substituted for (or by) one not having a side chain, e.g., glycine. The effects of these amino acid substitutions or deletions or additions may be assessed for N derivatives by analyzing the ability of the N gene encoding the derivative proteins to confer TMV resistance on transgenic plants.
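The four categories (a) to (d) above can be expressed as a simple lookup. The following is a minimal sketch with hypothetical names. The residue sets contain only the examples named in the text (in one-letter code) and are therefore not exhaustive, and the output is a flag for closer inspection, not a prediction of protein function.

```python
# Residue groups limited to the examples given in the text (one-letter codes).
HYDROPHILIC = {"S", "T"}                  # seryl, threonyl
HYDROPHOBIC = {"L", "I", "F", "V", "A"}   # leucyl, isoleucyl, phenylalanyl, valyl, alanyl
ELECTROPOSITIVE = {"K", "R", "H"}         # lysyl, arginyl, histidyl
ELECTRONEGATIVE = {"E", "D"}              # glutamyl, aspartyl
BULKY = {"F"}                             # phenylalanine
NO_SIDE_CHAIN = {"G"}                     # glycine
CYS_OR_PRO = {"C", "P"}


def _between(a: str, b: str, group_1: set, group_2: set) -> bool:
    """True if one residue is in group_1 and the other is in group_2."""
    return (a in group_1 and b in group_2) or (a in group_2 and b in group_1)


def flagged_rules(original: str, replacement: str) -> list:
    """Return the rule letters (a-d) under which a substitution falls."""
    a, b = original.upper(), replacement.upper()
    if a == b:
        return []
    rules = []
    if _between(a, b, HYDROPHILIC, HYDROPHOBIC):
        rules.append("a")
    if a in CYS_OR_PRO or b in CYS_OR_PRO:
        rules.append("b")
    if _between(a, b, ELECTROPOSITIVE, ELECTRONEGATIVE):
        rules.append("c")
    if _between(a, b, BULKY, NO_SIDE_CHAIN):
        rules.append("d")
    return rules


for pair in [("S", "L"), ("A", "P"), ("K", "E"), ("F", "G"), ("L", "I")]:
    print(pair, flagged_rules(*pair))
```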
Variant N cDNAs or N genes may be produced by standard DNA mutagenesis techniques, for example, M13 primer mutagenesis. Details of these techniques are provided in Sambrook et al. (1989), Ch. 15. By the use of such techniques, variants may be created which differ in minor ways from the prototypical N nucleic acid sequences, yet which still retain N gene function. In their simplest form, such variants may differ from the disclosed sequences by alteration of the coding region to fit the codon usage bias of the particular organism into which the molecule is to be introduced.
Alternatively, the coding region may be altered by taking advantage of the degeneracy of the genetic code to alter the coding sequence in such a way that, while the nucleotide sequence is substantially altered, it nevertheless encodes proteins having amino acid sequences identical or substantially similar to the prototype N and N-tr sequences. For example, the second amino acid residue of the prototype N protein is alanine. This is encoded in the prototype N gene open reading frame (ORF) by the nucleotide codon triplet GCA. Because of the degeneracy of the genetic code, three other nucleotide codon triplets (GCT, GCC and GCG) also code for alanine. Thus, the nucleotide sequence of the N ORF could be changed at this position to any of these three codons without affecting the amino acid composition of the encoded protein or the characteristics of the protein. Based upon the degeneracy of the genetic code, variant DNA molecules may be derived from the N cDNA and gene sequences using standard DNA mutagenesis techniques as described above, or by synthesis of DNA sequences.
Once a cDNA (or gene) encoding a protein involved in the determination of a particular plant characteristic has been isolated, standard techniques may be used to express the cDNA in transgenic plants in order to modify that particular plant characteristic. The basic approach is to clone the cDNA into a transformation vector, such that it is operably linked to control sequences (e.g., a promoter) that direct expression of the cDNA in plant cells. The transformation vector is then introduced into plant cells by one of a number of techniques (e.g., electroporation) and progeny plants containing the introduced cDNA are selected. Preferably all or part of the transformation vector will stably integrate into the genome of the plant cell. That part of the transformation vector which integrates into the plant cell and which contains the introduced cDNA and associated sequences for controlling expression (the introduced "transgene") may be referred to as the recombinant expression cassette.
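The alanine example can be reproduced from the standard genetic code. The following minimal sketch builds the standard codon table and lists the codons synonymous with GCA; the function names are hypothetical, and only the standard nuclear code is assumed.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G
# (first base slowest, third base fastest). '*' marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def synonymous_codons(codon: str) -> list:
    """Other codons encoding the same amino acid as `codon`."""
    codon = codon.upper()
    target = CODON_TABLE[codon]
    return [c for c, aa in CODON_TABLE.items() if aa == target and c != codon]


def translate(seq: str) -> str:
    """Translate a DNA coding sequence, ignoring any trailing partial codon."""
    seq = seq.upper()
    usable = len(seq) - len(seq) % 3
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, usable, 3))


print(synonymous_codons("GCA"))
# Changing the second codon from GCA to GCT leaves the peptide unchanged.
print(translate("ATGGCATCT"), translate("ATGGCTTCT"))
```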
Selection of progeny plants containing the introduced transgene may be made based upon the detection of an altered phenotype. Such a phenotype may result directly from the cDNA cloned into the transformation vector or may be manifested as enhanced resistance to a chemical agent (such as an antibiotic) as a result of the inclusion of a dominant selectable marker gene incorporated into the transformation vector.
Successful examples of the modification of plant characteristics by transformation with cloned nucleic acid sequences are replete in the technical and scientific literature. Selected examples, which serve to illustrate the knowledge in this field of technology, include:
U.S. Pat. No. 5,571,706 ("Plant Virus Resistance Gene and Methods");
U.S. Pat. No. 5,677,175 ("Plant Pathogen Induced Proteins");
U.S. Pat. No. 5,510,471 ("Chimeric Gene for the Transformation of Plants");
U.S. Pat. No. 5,750,386 ("Pathogen-Resistant Transgenic Plants");
U.S. Pat. No. 5,597,945 ("Plants Genetically Enhanced for Disease Resistance");
U.S. Pat. No. 5,589,615 ("Process for the Production of Transgenic Plants with Increased Nutritional Value Via the Expression of Modified 2S Storage Albumins");
U.S. Pat. No. 5,750,871 ("Transformation and Foreign Gene Expression in Brassica Species"); and
U.S. Pat. No. 5,268,526 ("Overexpression of Phytochrome in Transgenic Plants").
These examples include descriptions of transformation vector selection, transformation techniques and the preparation of constructs designed to over-express the introduced cDNA. In light of the foregoing and the provision herein of cDNA-N/intron constructs, it is thus apparent that one of skill in the art will be able to introduce these constructs into plants in order to produce plants having TMV resistance. Expression of cDNA-N/intron constructs in plants that are otherwise sensitive to TMV will be useful to confer resistance to this and possibly other viruses.
A. Plant Types
Viruses infect many plant species, and TMV in particular is a serious pathogen of Solanaceous species such as tobacco (Nicotiana sp.), tomato (Lycopersicon sp.) and pepper (Capsicum sp.) and is able to infect potato (Solanum sp.). cDNA-N/intron 3 constructs as described herein are expected to be effective against not only TMV, but also other viruses, including other Tobamoviruses. Closely related Tobamoviruses include tomato mosaic virus and pepper green mottle virus, and it is known that expression of the N gene in tomato confers resistance to tomato mosaic virus. Thus, cDNA-N/intron constructs may be usefully expressed to confer resistance to viral diseases in a wide range of higher plants, both monocotyledonous and dicotyledonous, including, but not limited to, maize, wheat, rice, barley, soybean, cotton, beans in general, rape/canola, alfalfa, flax, sunflower, safflower, brassica, peanut, clover; vegetables such as lettuce, tomato, cucurbits, cassava, potato, carrot, radish, pea, lentils, cabbage, cauliflower, broccoli, Brussels sprouts, peppers; tree fruits such as citrus, apples, pears, peaches, apricots, walnuts; and flowers such as orchids, carnations and roses.
B. Vector Construction
A number of recombinant vectors suitable for stable transfection of plant cells or for the establishment of transgenic plants have been described, including those described in Pouwels et al. (1987), Weissbach and Weissbach (1989), and Gelvin et al. (1990). Typically, plant transformation vectors include one or more cloned plant genes (or cDNAs) under the transcriptional control of 5′ and 3′ regulatory sequences and a dominant selectable marker. The selection of suitable 5′ and 3′ regulatory sequences for the cDNA-N/intron constructs is discussed above. Dominant selectable marker genes that allow for the ready selection of transformants include antibiotic resistance genes (e.g., those conferring resistance to hygromycin, kanamycin, bleomycin, G418, streptomycin or spectinomycin) and herbicide resistance genes (e.g., phosphinothricin acetyltransferase).
C. Transformation and Regeneration Techniques
Transformation and regeneration of both monocotyledonous and dicotyledonous plant cells is now routine, and the appropriate transformation technique will be determined by the practitioner. The choice of method will vary with the type of plant to be transformed; those skilled in the art will recognize the suitability of particular methods for given plant types. Suitable methods may include, but are not limited to: electroporation of plant protoplasts; liposome-mediated transformation; polyethylene glycol (PEG) mediated transformation; transformation using viruses; micro-injection of plant cells; micro-projectile bombardment of plant cells; vacuum infiltration; and Agrobacterium tumefaciens (AT) mediated transformation. Typical procedures for transforming and regenerating plants are described in the patent documents listed at the beginning of this section.
D. Selection of Transformed Plants
Following transformation and regeneration of plants with the transformation vector, transformed plants are usually selected using a dominant selectable marker incorporated into the transformation vector. Typically, such a marker will confer antibiotic resistance on the seedlings of transformed plants, and selection of transformants can be accomplished by exposing the seedlings to appropriate concentrations of the antibiotic.
After transformed plants are selected and grown to maturity, they can be assayed using the methods described herein to determine whether the susceptibility of the plant to TMV infection has been altered as a result of the introduced cDNA-N/intron construct.