Since ancient times, mankind has been trying to improve the quality and yield of crops. In the 20th century, plant breeders developed mutation breeding as a tool for crop improvement. Breeders painstakingly examined progeny plants or seeds from mutagenized source material in hopes of finding lines with superior yield, grain quality, pest and stress resistance, and the like. Although these efforts had some degree of success (for example the dwarfing mutations that contributed to the Green Revolution of the 1960's), there was often little or no understanding of the underlying genetic changes that contributed to the improved traits. Success more or less reflected the random nature of mutagenesis and the ability of the investigators to recognize a plant with improved characteristics hidden among relatively large populations.
The recent convergence of multiple disciplines, such as molecular genetics, biochemistry and information science, has created a virtual explosion in the understanding of genes and their functions. The genomes of many organisms, including the plants Arabidopsis and rice, have been sequenced in their entirety; multiple varieties of transgenic crops with improved traits are now on the market or in development. However, given the cost of development and registration, as well as political opposition in some quarters, transgenic technology may not always provide the best solution for the goals of crop improvement. Although traditional crop breeding has benefited from the newer technologies, particularly in the areas of marker-assisted breeding and the identification of multiple loci in quantitative traits, classical mutation breeding has seen relatively few changes. By combining the expanding knowledge of gene function, the tools of molecular biology, and the techniques of classical mutagenesis, it is possible to create novel, non-transgenic approaches to crop improvement. Despite this potential, relatively few advances in that regard have been made thus far.
Although the entire genomes of some plants and other organisms have been sequenced, another great challenge that remains is identifying the function of genes that have not yet been characterized beyond the sequence level. One approach for identifying a gene's function is to “knock-out” the gene and observe the effect(s) this has on the plant. However, there are some limitations to the techniques currently available for such approaches. For example, RNAi can be used in an attempt to silence a specific plant gene, but this “silencing” is often partial. Thus, it can be difficult to assess the effects of a gene “silenced” in this manner if partial expression remains.
A gene can also be “knocked out” by insertion of T-DNA or a transposable element into the gene or its regulatory region. Libraries of insertion mutants or lines can be generated (with each affecting a certain gene or genes), but a tremendous number of such insertion lines are typically required to span an entire genome. Creating insertions in small genes, or obtaining lines with insertions in two or more closely-linked genes, is also especially difficult or impossible. Furthermore, despite the fact that many of these techniques can be readily applied in the model plant Arabidopsis, and to a lesser extent in a limited number of other species such as maize, their utility in most crop species is often severely restricted or nonexistent.
U.S. Pat. No. 5,994,075 relates generally to methods for identifying a mutation in a gene of interest without a phenotypic guide. Some methods of inducing mutations in organisms and screening those organisms for mutations in genes of interest are known (Ballinger and Benzer, 1989, Proc Natl Acad Sci USA 86:9402-9406; Zwaal et al., 1993, Proc Natl Acad Sci USA 90:7431-7435), but those methods all have various limitations. For example, the presence of a T-DNA or transposon insertion (mentioned above) can be detected by polymerase chain reaction (PCR), which is a well-known technique that can be used for amplifying a targeted genetic region. However, transgenic and endogenous elements such as these are not widely available, and the techniques that use them have low detection sensitivity or require multiple screens to recover mutations.
Libraries of mutants can be generated in many ways, with the goal being mutants that span the genome. Various mutagens can be used to cause deletion mutants or other deleterious mutations. Point mutations can be difficult to screen and identify, although some techniques are reportedly available for such purposes. See, e.g., McCallum, C. M., L. Comai, E. A. Greene and S. Henikoff, “Targeting Induced Local Lesions in Genomes (TILLING) for plant functional genomics,” Plant Physiology (June 2000), 123(2):429-442. See also WO 01/75167. PCR followed by sequencing of the amplicon is another technique, but this is obviously laborious and not amenable to high throughput.
Another approach for identifying mutations of interest is to use peptide nucleic acid (PNA) probes designed to target a certain sequence (typically of 18 residues or fewer) where a point mutation might occur. PNAs are nucleic acid analogues that can be designed to selectively bind conventional nucleic acids of complementary sequence to form hybrids that are more stable against dehybridisation by heat than are similar hybrids between conventional nucleic acids.
As explained in U.S. Pat. No. 5,891,625, a PNA probe can be designed as a diagnostic that binds strongly to a particular gene of a healthy individual but, because of the resulting mismatch, does not bind stably to that gene in individuals having a mutation in it. Thus, in a healthy individual, the PNA probe binds strongly to the gene and is effective to block PCR directed to that gene. On the other hand, the PNA probe will not maintain hybridization with an oncogenic mutation, allowing PCR amplification (with a resulting observable band) to proceed, thereby resulting in a PCR product that signals a detectable oncogenic mutation. Alternatively, or in addition, such a PNA probe may be labeled, whereby the presence or absence of a label signals the absence or presence of a mutation.
DE 19733619 relates to the diagnosis of malignant tumors and to methods of assaying a small tissue sample from a known individual. The methods described therein generally involve the use of a PNA probe to detect oncogenic gene mutations. More specifically, the methods comprise: performing PCR using a complementary wild type analog PNA oligonucleotide which suppresses the amplification of surplus wild type alleles along with an oligodeoxynucleotide primer pair; and identifying the mutations or variations using PCR-RFLP (restriction fragment length polymorphism) and a known sequence for a restriction enzyme carrying oligonucleotide. When used for cancer detection, the PNA probe blocks PCR amplification of a sample from a cancer-free individual but permits PCR amplification of DNA from oncogenic cells having the known mutation(s). This method is said to be an improvement over PNA-mediated PCR clamping. Various limitations of PCR clamping are discussed in this reference, which adds a second step (PCR-RFLP) as an improvement.
The above-described PNA procedures, however, do not involve or suggest pooling DNA or using DNA samples from multiple sources. While these PNA procedures may be suitable for detecting point mutations in a given individual, different considerations are involved when screening large numbers of samples from multiple sources for unknown mutations (or deletions). The latter would be the case, for example, when screening a large collection of plants (1000+) that were subjected to random mutagenesis. In this regard, screening an individual for cancer is quite different from screening large numbers of mutated plants. Because the rate of mutation resulting from treatment with chemical mutagens and the like is very low, high-throughput methods are needed in this context to screen large numbers of plants for unknown mutations.
PCR-based techniques have been developed to screen pooled samples for deletion mutants; however, this art has consistently taught that the extension time in the PCR procedure (i.e., the length of time that the polymerase is allowed to extend the DNA strand) must be shortened so as to preferentially amplify the shorter product from the deletion mutant but not the longer wild-type PCR product. In high-throughput versions of such screenings, samples from hundreds of mutated plants, for example, are pooled, and the pool is subsequently screened for the presence of a mutant. Given how heavily wild-type templates outnumber any mutant template in such a pool, it is understandable that those in the art perceived it necessary to suppress the signal from the predominant wild-types by limiting the extension step of the PCR. See, e.g., U.S. Pat. No. 6,484,105; WO 98/50539; U.S. Pat. No. 6,358,690; WO 99/51774; U.S. Pat. No. 5,994,075; Xin Li et al., The Plant Journal (2001) 27(3):235-242; and Li & Zhang, Funct. Integr. Genomics (2002) 2:254-258. Many of these references involve attempts to identify the function of unknown genes having deletions therein.
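The reasoning behind the shortened extension step can be illustrated with a rough calculation. The following is only a sketch: the polymerase rate of about 1000 basepairs per minute is an assumed rule-of-thumb value, and the amplicon lengths, extension time, and function names are hypothetical and do not come from the cited references.

```python
def max_full_length_bp(extension_s: float, rate_bp_per_min: float = 1000.0) -> float:
    """Longest product the polymerase can finish in one extension step.

    rate_bp_per_min is an ASSUMED rule-of-thumb rate, not a value from the
    cited references.
    """
    return rate_bp_per_min * extension_s / 60.0


def is_fully_extended(amplicon_bp: int, extension_s: float,
                      rate_bp_per_min: float = 1000.0) -> bool:
    """True if an amplicon of this length can be completed in the extension time."""
    return amplicon_bp <= max_full_length_bp(extension_s, rate_bp_per_min)


# Hypothetical example: a 6000 bp wild-type amplicon and a deletion mutant
# that has lost 3000 bp between the same two primers.
wild_type_bp = 6000
mutant_bp = 3000
extension_s = 200  # deliberately shortened extension step

wild_type_amplifies = is_fully_extended(wild_type_bp, extension_s)
mutant_amplifies = is_fully_extended(mutant_bp, extension_s)
```

Under these assumed numbers, only the shorter mutant product is completed in each cycle, which is the preferential amplification the cited art relies on. The same calculation shows why the approach depends on a large deletion: a mutant product only slightly shorter than the wild-type product cannot be separated by extension time alone.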
Another limitation to these PCR techniques is that they are not sensitive enough to detect small deletions. That is, the PCR amplicon of a deletion mutant missing only 100 or so basepairs would not have a noticeably different band (as compared to the wild-type amplicon) on a typical gel (having resolving power to about 600 basepairs).
Edgley et al. (Nucleic Acids Research, 2002, Vol. 30, No. 12, e52) note that only a small fraction of the sequenced nematode genes have been mutagenized. Edgley et al. attempt to “knock out” additional genes of nematodes (Caenorhabditis elegans) to create larger libraries to study the function of these genes, and to possibly find corresponding genes in humans. Edgley et al. used trimethylpsoralen (TMP)/ultraviolet light (UV) mutagenesis, which is reported to typically produce deletions in the range of 50-600 basepairs. TMP/UV mutagenesis is more suited for mutagenizing nematodes than plants because, for example, UV cannot penetrate plant seeds. This reference describes a technique using PCR and a third primer between the two external PCR primers to amplify DNA pooled from various nematodes mutated with TMP/UV. This technique was based on “nested PCR,” which uses one set of primers in an initial round of PCR, followed by a second round of PCR using primers just inside of the first primers. The second step, which makes use of the nested primers, is performed so as to virtually eliminate the chances that a non-target amplicon produced in the first round would also be amplified in the second round. In the Edgley et al. approach, a third primer is used in the first round of PCR, wherein the third primer binds (in wild types) between the first set of primers. For wild-type templates, the third primer inhibits amplification between the two external primers, but allows amplification between the third primer and one of the external primers to occur. Thus, for wild-type DNA, there is only amplification of a relatively short amplicon, which lacks a binding site for one of the nested primers for the second round. However, if a deletion mutation removes the binding site of the third primer, PCR amplification between the two external primers occurs, resulting in a long amplicon.
In the second round of PCR using the nested primers, only amplification of the long amplicon (containing the deletion mutation) occurs, thus signaling the presence of a deletion. The Edgley et al. approach is used for detecting small deletions in relatively short PCR amplicons in an organism amenable to mutagenesis methods that produce such deletions.
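The logic of the three-primer, two-round scheme described above can be sketched as a small model. This is an illustrative simplification rather than the protocol of Edgley et al.: all primer coordinates, names, and pool sizes are hypothetical, each template yields only its dominant first-round product, and deletions that remove an external or nested primer site are not modeled.

```python
from typing import List, Optional, Tuple

# Hypothetical primer positions (bp) on the wild-type target region.
EXT_FWD, EXT_REV = 0, 3000       # external primers (first round)
NEST_FWD, NEST_REV = 100, 2900   # nested primers, just inside the external ones
THIRD = 1500                     # third primer, extending toward EXT_REV

Deletion = Optional[Tuple[int, int]]  # (start, end) in bp, or None for wild type


def first_round_product(deletion: Deletion) -> Tuple[int, int]:
    """Span (in wild-type coordinates) of the dominant first-round amplicon."""
    third_site_present = deletion is None or not (deletion[0] <= THIRD < deletion[1])
    if third_site_present:
        # The third primer binds, so only the short product between the
        # third primer and one external primer is made.
        return (THIRD, EXT_REV)
    # The third primer's site is deleted: amplification spans both external primers.
    return (EXT_FWD, EXT_REV)


def second_round_amplifies(product: Tuple[int, int]) -> bool:
    """Nested PCR succeeds only if the product carries both nested primer sites."""
    start, end = product
    return start <= NEST_FWD and NEST_REV <= end


def pool_signals_deletion(pool: List[Deletion]) -> bool:
    """True if any template in the pooled sample yields a second-round product."""
    return any(second_round_amplifies(first_round_product(d)) for d in pool)


# Hypothetical pools: 100 wild-type templates, and 99 wild-type templates
# plus one mutant whose deletion removes the third primer's binding site.
wild_type_pool: List[Deletion] = [None] * 100
mixed_pool: List[Deletion] = [None] * 99 + [(1400, 1700)]
```

In this model the wild-type pool gives no second-round product, while the mixed pool does, because the single mutant produces a long first-round amplicon containing both nested primer sites. A deletion elsewhere in the region, such as (200, 400), leaves the third primer's site intact and is therefore not detected, which reflects the dependence of the approach on where the third primer is placed.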