The invention relates to the isolation of molecular markers of genetic mutation in plants by representational difference analysis (RDA). In particular, the invention relates to the preparation of genetic probes for early identification of plants that are not true-to-type. More particularly, the invention relates to the use of such genetic probes as a diagnostic and quality control tool for monitoring the development of genetic polymorphisms arising during tissue culture regeneration of plants.
The large-scale production of commercially elite plants by in vitro micropropagation is technically available for a number of species. A common problem encountered when growing plants by tissue culture is the development of tissue culture-induced genomic polymorphisms in which genetic changes in the nuclear, mitochondrial and/or chloroplast genomes result in a lack of homogeneity among the regenerants and the production of inferior plants that are not true-to-type (i.e., “off-type” plants) with little commercial value. The development of genetic polymorphisms in tissue culture is termed “somaclonal variation”. As used herein, the term “off-type” refers to a plant that exhibits a phenotypic difference from a normal plant.
Investment in plants that are later discovered to be off-types can have severe financial implications for both growers and plant producers, who may only discover that the plants are off-types after the plants have grown for some period of time in the field. Traditionally, an experienced examiner is required to identify off-type plants by a morphological description and visual monitoring of plant characteristics. However, many characteristics are only expressed in a more mature stage of plant development and not in in vitro plants or very young plants. For example, in banana plants, off-types such as dwarfs, mosaics (which have irregular bright yellow spots or stripes on the leaf) and masadas (which have abnormal foliage showing depressions and thicker leaves) are difficult to identify at both the tissue culture and nursery stages [Israeli et al. (1991) Scientia Hortic. 48, 71-88]. Therefore, a method for early monitoring and identification of off-type plants, while still at the tissue culture or nursery stages, would be a very advantageous tool to enable the selection of desired true-to-type plants for further propagation.
Attention was first drawn to somaclonal variation by Larkin and Scowcroft [Theoret. Appl. Genet. 60, 197-214 (1981)] and a great deal of literature on the subject has accumulated [see reviews by M. Lee and R. L. Phillips, Ann. Rev. Plant Physiol. Mol. Biol. 39, 413-437 (1988); R. L. Phillips et al., in Progress in Plant Cellular and Molecular Biology, Kluwer Academic Publishers, Netherlands, pp. 136 (1990); C. A. Cullis, The Molecular Biology of Plant Cells and Cultures in Comprehensive Biotechnology, M. W. Fowler and G. S. Warren, eds., Pergamon Press, NY (1991); A. Karp, Oxford Surveys Plant Mol. Cell. Biol. 7, 1-58 (1991)]. A large number of variants in morphological and biochemical characteristics, as well as in changes at the DNA level, have been characterized. At the genomic level, changes in ploidy level, chromosomal rearrangements, activation of transposable elements, gene amplification, single gene mutations and variation in quantitative traits have been reported. Variations in mitochondrial and chloroplast genomes in regenerants, particularly in cereals, have also been reported [M. D. Morere-Le Pavan et al., Theoret. Appl. Genet. 85, 1-19 (1992)], thus illustrating that all compartments of the genetic information of the plant cell are susceptible to the phenomenon of somaclonal variation. In addition to the alterations in genomic DNA sequence and organization, stable changes in nucleic acid methylation patterns have been implicated as a major factor in epigenetic changes [S. M. Kaeppler and R. L. Phillips, Proc. Natl. Acad. Sci. USA 90, 8773-8776 (1993); M. J. M. Smulders et al., Theoret. Appl. Genet. 91, 1257-1264 (1995); B. Arnholdt-Schmitt et al., Theoret. Appl. Genet. 91, 809-815 (1995); P. Bogani et al., Genome 38, 901-912 (1995)]. The question of how and when during the tissue culture process variations occur has also been addressed. A study in wheat indicated that the variation in these cultures originated during the callus phase and that the extent of variation could be affected by manipulation of the culture medium [B. F. Carver and B. B. Johnson, Theoret. Appl. Genet. 78, 405-418 (1989)].
A series of molecular technologies, many of them based on PCR (polymerase chain reaction), have been used by investigators to compare and highlight the differences between DNAs isolated from related sources. For example, restriction fragment length polymorphisms (RFLP) have been used, particularly by plant breeders, as genetic markers in developing genetic linkage maps in which chromosomal regions associated with desirable phenotypic traits may be identified and tracked during subsequent selective breeding to produce improved plant lines. RFLPs are genetic differences detectable by DNA fragment lengths, typically revealed by agarose gel electrophoresis after restriction endonuclease digestion of DNA. There are large numbers of restriction endonucleases available, characterized by their nucleotide cleavage sites and their source, e.g., the bacterium E. coli. Variations in RFLPs result from nucleotide base pair differences which alter the cleavage sites of the restriction endonucleases, or from insertions, yielding different sized fragments. Other point mutations in the genome usually go undetected. Thus, RFLP differences often are difficult to identify. Although RFLP has advantages in detecting genetic variation, it is labor intensive.
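The effect of a single base change in a cleavage site on fragment sizes can be illustrated with a minimal in silico sketch. The two sequences are invented for illustration and `digest` is a hypothetical helper, not part of any library; EcoRI (recognition site GAATTC, cutting after the first G) serves only as a familiar example enzyme.

```python
import re

def digest(seq: str, site: str, cut_offset: int) -> list[int]:
    """Return fragment lengths after cutting a linear sequence at every
    occurrence of `site`, `cut_offset` bases into the site."""
    # A lookahead is used so that overlapping sites would also be found.
    cuts = [m.start() + cut_offset
            for m in re.finditer(f"(?={re.escape(site)})", seq)]
    bounds = [0] + cuts + [len(seq)]
    return [b - a for a, b in zip(bounds, bounds[1:])]

ECORI = "GAATTC"  # EcoRI cuts G^AATTC, i.e. one base into the site

# Invented "normal" sequence carrying two EcoRI sites.
normal = "ATGC" * 5 + ECORI + "CCGGTTAA" * 4 + ECORI + "TTAGC" * 3
# Invented variant: a point mutation (A -> G) destroys the first site only.
variant = normal.replace("GAATTC", "GAGTTC", 1)

normal_frags = digest(normal, ECORI, 1)
variant_frags = digest(variant, ECORI, 1)
print(normal_frags, variant_frags)
```

The two fragment-length lists differ only because the mutation falls inside a recognition site; a base change elsewhere in the sequence would leave the pattern unchanged, which is why most point mutations escape RFLP detection.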
Sequence tagged sites (STS) of DNA polymorphisms have been developed by the use of RFLP. However, the development of STS requires an already identified difference which can be tested for in the unknown cell lines. Thus, STS is not a useful approach to isolate differences between uncharacterized cultivars.
Random amplified polymorphic DNAs (RAPD) and amplified fragment length polymorphisms (AFLP) are two further techniques that simply compare the DNA from any number of different samples and can be used to detect the level of difference between them. The RAPD method employs DNA amplification by PCR using short primers of arbitrary sequence. Differences as small as single nucleotides between genomes can affect the RAPD primer's binding/target site, and a PCR product may be generated from one genome but not from another. RAPD detection of genetic polymorphisms represents an advance over RFLP in that it is less time consuming, more informative, and readily adaptable to automation. However, RAPD is limited in that only dominant polymorphisms can be detected (i.e., this method does not offer the ability to examine simultaneously all the alleles at a locus in a population). Nevertheless, because of its sensitivity for the detection of polymorphisms, the RAPD method has been widely used for analyzing genetic variation within species or closely related genera, both in the animal and plant kingdoms. In particular, RAPD has been used by several groups for off-type plant detection and, recently, a potential RAPD marker has been identified for a dwarf banana off-type [O. Damasco et al., Plant Cell Rep. 16, 118-122 (1996)]. However, this use of the RAPD technique was restricted to attempting to generate a marker for dwarfism, with no reference made to attempting to find generalized markers of polymorphism. Both RFLP and RAPD have been used to distinguish between regenerants from embryogenic carrot cell lines [L. Georgetti et al., Mol. Gen. Genet. 246, 657-662 (1995)]. The RAPD technique, however, generally has several disadvantages, including a lack of reproducibility of results and the necessity of using a large number of different primers to detect variation in only a small portion of the genome.
AFLP is similar in concept to RFLP in that restriction enzymes are used to specifically digest the genomic DNA to be analyzed. The primary difference between RAPD and AFLP is that the amplified restriction fragments produced in AFLP are modified by the addition of specific, known adaptor sequences which serve as the target sites for PCR amplification with adaptor-directed primers. In both RAPD and AFLP, however, only those differences specific to a particular primer, or primer set, are detected in any one reaction. Therefore, if the material is only different at a few sites within the genome (that is, the samples are closely related) a large number of primers must be used in order to detect variation. For example, in experiments we conducted with flax, the use of 300 different RAPD primers only covered about one percent of the genome.
Another known PCR-based technology is simple sequence repeat polymorphisms (SSR). The SSR method of assaying polymorphisms involves utilizing the high degree of length variation resulting from certain repeating nucleotide sequences (simple sequence repeats) found in most genomes. SSR polymorphisms can be detected by PCR using minute amounts of genomic DNA and, unlike RAPDs, they can detect a high degree of genetic polymorphism. Although SSR has been used successfully for comparative analysis and mapping of mammalian and plant genomes, there are practical drawbacks to the method. The markers generated by the method are obtained by first constructing a genomic library, screening the library with probes representing the core elements of a particular repeat sequence, purifying and sequencing the positive clones, and synthesizing the primers specific for the flanking sequences for each cloned SSR locus. Genomic DNA is then amplified to screen for polymorphisms, and mapping of the genome is then carried out. The entire process is time consuming, expensive and technically demanding.
An alternative method for detecting nucleic acid sequences present in one but absent from another population of otherwise similar nucleic acid sequences is the technique of subtractive hybridization. For the purposes of this technique, the two genomic populations are called tester DNA and driver DNA, respectively. The basic rationale of subtractive hybridization is to compare two DNAs by using the driver DNA in excess during hybridization with the tester DNA to remove (subtract) all of the sequences held in common between the two DNA samples. Therefore, what are left are those sequences which vary between the two DNA samples. The technique thus enriches for a set of (target) sequences that are unique to the tester DNA.
In one reported method of subtractive hybridization, a physical difference (e.g., a label, such as biotin) between the driver and tester DNAs is introduced prior to allowing the two DNAs to hybridize. The desired (unique) tester sequences are segregated from the unwanted (common) tester sequences by a strategy in which, during the hybridization step, driver DNA is provided in excess over the tester DNA so that most of the sequences common to tester and driver populations form tester-driver duplexes. Thus, sequences common to both populations segregate with the driver DNA when the physical difference is exploited after the hybridization step to separate the tester from the driver DNA (e.g., biotin-containing duplexes are removed by binding to streptavidin-coated beads). In the simplest form of subtractive hybridization, driver DNA is prepared for hybridization by methods that produce random ends (e.g., sonication or mechanical shearing), while tester is prepared by restriction endonuclease digestion that facilitates its later ligation into cloning vectors.
Representational Difference Analysis (RDA) belongs to the general class of subtractive methodologies in which subtractive hybridization and PCR are combined, and is a reliable way to detect differences between two complex genomes [Lisitsyn et al., Science 259, 946 (1993); Lisitsyn et al., Nature Genetics 6, 57-63 (1994); U.S. Pat. No. 5,436,142]. RDA does not require the use of any label, such as biotin, nor any post-reassociation physical separation techniques. However, for complex DNAs, such as mammalian DNAs having about 10^9 base pairs (bp), it is necessary to reduce the complexity of the hybridizing mixture for the subtractions to be efficient. The reduction in complexity can be achieved by any method which reproducibly generates a subpopulation of the genome. Lisitsyn et al. prepared representations (amplicons) of the two genomes to be compared by digesting the genomic DNAs with a restriction endonuclease that recognizes a six-base sequence, and using PCR to amplify the total digestion product after ligating a universal adaptor. Restriction fragments whose sizes and sequences were suitable for PCR amplification were enriched in the amplicons, and other fragments remained unamplified. This procedure resulted in a population of relatively short fragments that represented substantially fewer sequences than were present in the initial DNA populations. The term “representational” in RDA, therefore, refers to the production of a reproducible subpopulation of DNA fragments having a complexity that may be only between 1% and 12% of the starting population. Thus, a disadvantage of the representation step of RDA is that only a very small portion of the entire genome is tested and only some part of all the differences existing between the two DNAs will be detected.
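The notion of a representation can be illustrated with a small simulation. This is a sketch under stated assumptions, not the published protocol: the genome is a random string, the amplifiable size window (150 to 1000 bp) is an illustrative guess, the helper names are hypothetical, and the six-base site AGATCT (recognized by BglII) is used only as an example.

```python
import random
import re

def fragments(seq: str, site: str) -> list[str]:
    """Split a linear sequence at the start of every occurrence of `site`."""
    cuts = [m.start() for m in re.finditer(f"(?={re.escape(site)})", seq)]
    bounds = [0] + cuts + [len(seq)]
    return [seq[a:b] for a, b in zip(bounds, bounds[1:]) if b > a]

def representation(seq: str, site: str, min_len: int, max_len: int) -> list[str]:
    """Keep only fragments in the size window assumed to amplify well by PCR."""
    return [f for f in fragments(seq, site) if min_len <= len(f) <= max_len]

def complexity_fraction(seq: str, site: str, min_len: int, max_len: int) -> float:
    """Share of the genome (by length) carried into the representation."""
    kept = representation(seq, site, min_len, max_len)
    return sum(len(f) for f in kept) / len(seq)

rng = random.Random(0)  # fixed seed so the run is reproducible
genome = "".join(rng.choice("ACGT") for _ in range(200_000))  # toy genome

SITE = "AGATCT"            # six-base recognition site (example only)
WINDOW = (150, 1000)       # assumed amplifiable size range, in bp

amplicon = representation(genome, SITE, *WINDOW)
frac = complexity_fraction(genome, SITE, *WINDOW)
print(f"{len(amplicon)} fragments retained, {frac:.1%} of the genome")
```

Because a six-base site occurs on average once every few thousand bases in random sequence, most fragments fall outside a short amplifiable window, so only a minority of the genome enters the representation.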
The aforementioned technologies for comparing differences between DNAs isolated from related sources have been used to isolate or track phenotypically-expressed variant-specific markers in plant genomes, such as markers for dwarfism in bananas, galactinol synthase in soybean seed and zucchini leaf, height in tomatoes, β-ketoacyl-ACP synthetase II in soybean, cotton, tomato and tobacco, high seed oil production in sunflowers, brown stem rot resistance in soybeans, and the like. However, there are no known reports of employing any of these technologies to isolate and identify a set of genetic markers that can be used to detect and/or identify genomic polymorphisms as they arise during the course of plant tissue culture.
The present invention takes advantage of the strategy of representational difference analysis (RDA) in a method for obtaining molecular markers for use as a diagnostic and quality control tool to identify genomic polymorphisms that arise during the in vitro process of tissue culture of vegetatively propagated plants. The invention is based on the premise that there is a labile fraction of the plant genome which is altered whenever a somaclonal variant is observed. That is, that there are sites within the genome of normal plants which are especially labile and may be altered due to environmental stresses, particularly stresses induced by tissue culture, often without any specific phenotypic mutation being observable in the resulting plants. Thus, genomic polymorphisms may occur that are “silent” in the regenerated plant, but are indicative of changes which have occurred due to environmental stresses.
In one embodiment, the invention provides a method for identifying genomic polymorphisms arising during the process of tissue culture of plant cells, comprising the steps of isolating DNA from (i) a normal plant, and (ii) an off-type plant of the same species; performing RDA to obtain a DNA subtraction product representing a genetic difference between the off-type plant DNA and the normal plant DNA; isolating DNA from a sample of plant cells in tissue culture; and using the DNA subtraction product to identify a putative DNA sequence difference between the DNA from the sample of plant cells in tissue culture and the DNA from the normal plant; wherein, if no DNA sequence difference is identified, the method further comprises repeating the identification step on further samples of plant cells isolated at subsequent time intervals during the tissue culture process and repeating the identification using the DNA subtraction product until a DNA sequence difference is identified or until the tissue culture process is completed.
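The repeated identification step of this embodiment can be sketched as a simple monitoring loop. The names `first_polymorphic_sample` and `differs_from_normal` are hypothetical; the latter stands in for whatever laboratory assay uses the DNA subtraction product as a probe, and the sample records below are invented.

```python
from typing import Callable, Iterable, Optional, TypeVar

S = TypeVar("S")

def first_polymorphic_sample(
    samples: Iterable[S],
    differs_from_normal: Callable[[S], bool],
) -> Optional[int]:
    """Test samples taken at successive time intervals and return the index
    of the first one in which the probe detects a difference from normal DNA.
    Returns None if the culture finishes without any difference being found."""
    for index, sample in enumerate(samples):
        if differs_from_normal(sample):
            return index
    return None

# Invented example: each sample records whether the probe signal
# deviated from the normal-plant pattern at that time point.
culture_samples = [
    {"week": 2, "probe_deviates": False},
    {"week": 4, "probe_deviates": False},
    {"week": 6, "probe_deviates": True},
    {"week": 8, "probe_deviates": True},
]

hit = first_polymorphic_sample(culture_samples, lambda s: s["probe_deviates"])
print(hit)
```

Returning the index of the first deviating sample, rather than a yes/no answer, preserves the time point at which the polymorphism arose, which is the information needed when deciding whether culture conditions should be adjusted.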
The method preferably further comprises the step of amplifying the DNA subtraction product to obtain a probe comprising a nucleic acid sequence containing the genetic difference, for use in the identification step. By the method of the invention, a plurality of DNA subtraction products may be isolated from a single off-type plant and/or from a plurality of off-type plants, and individual probes obtained that form a library of genetic difference probes for use in identifying somaclonal variants.
The identification of a genetic polymorphism arising in a tissue culture may be indicative of culture conditions that are environmentally stressful to the cells. Thus, if such a genetic polymorphism is identified during tissue culture, the method preferably further comprises the step of adjusting tissue culture conditions (e.g., nutrient enrichment, pH adjustment, hormone type and/or concentration adjustment, cytokinin type and/or concentration adjustment, and the like) to optimize the conditions to alleviate environmental stresses and to prevent the occurrence of further somaclonal variation in the culture. Thus, an advantage of the method of the invention is that it affords the opportunity to develop criteria for optimum tissue culture conditions for propagating a given plant cell species or cultivar in vitro, and to optimize plant multiplication rates without producing a significant number of off-types.
In another embodiment of the invention, a method is provided for isolating markers of genomic integrity of plants. As used herein, “markers of genomic integrity” are intended to mean normal nucleic acid sequences that have not undergone somaclonal variation. Thus, in this embodiment of the method, RDA is performed using an off-type plant DNA as driver DNA and a normal plant DNA as tester DNA, to obtain a DNA subtraction product representing a DNA sequence present in the normal plant and not present in the off-type plant. Preferably, the driver DNA is from a pool of off-type plants representing a plurality of different phenotypic mutations, in order to isolate a plurality of DNA sequences present in the normal plant that may show particular lability due to environmental stresses encountered in tissue culture. These DNA subtraction products may then be used as probes to monitor the genetic stability of plant cells during the process of tissue culture. Moreover, this embodiment of the method of the invention provides markers for genomic integrity that are useful in selecting particular individual plants for use as founding members of breeding lines, especially in the generation of transgenic lines.
In other embodiments, the invention provides methods for isolating one or more markers representing genomic alterations that are common to a plurality of different off-type plants, and markers that represent specific genomic alterations associated with a particular phenotypic mutation. An advantage of these methods of the invention is that markers are provided that may be used to identify a gene or combination of genes in the normal genome that code for specific phenotypes, and sequence tagged sites may be developed. Moreover, by the methods of the invention, difference products may be isolated by RDA between plants exhibiting desirable traits and plants exhibiting undesirable traits. Genetic markers thus isolated will represent specific phenotypic mutations, which may be used to identify and isolate genes for desirable traits, such as disease or pest resistance, for use in generating transgenic plant lines having such desirable traits.
The methods of the invention for identifying and monitoring genomic polymorphisms arising during tissue culture allow early identification of regenerant plants that may remain phenotypically “normal” but still express variant nucleic acid sequences on a molecular level. The methods of the invention further allow early identification of regenerant plants that express variant nucleic acid sequences associated with a specific mutation, prior to the actual phenotypic expression of the mutation. Moreover, by the methods of the invention, the degree to which a phenotypic variation may be expressed in the regenerants from the plant tissue culture can be estimated (e.g., as a percentage). The invention thus provides two different types of molecular markers, one of which is indicative of variations occurring progressively from very early to late time stages of tissue culture, and the second of which is specific to a particular phenotypic off-type. Plants which have “silent” polymorphisms, but are phenotypically normal, are usually commercially acceptable, and tissue cultures containing such plant cells are not necessarily discarded; but the presence of these polymorphisms may indicate unstable culture conditions which may require adjustment, as described above. The identification during tissue culture of a genetic polymorphism associated with a marker specific to an undesirable off-type plant is likely to result in discard of the culture. Thus, the methods of the invention provide not only a diagnostic tool, but also a quality control tool for micropropagation of plants.
In addition to the advantages of the invention described above, the characterization of genes and the genomic locations involved in the generation of somaclonal variations is expected to aid in understanding the genomic mechanisms involved in control of a rapid genomic response to environmental stresses in plants.