The present invention is directed to germplasm containing an identifier nucleotide sequence and to a method for identifying germplasm, more particularly the source of germplasm. The present invention is especially useful for determining the ownership of germplasm, including plants and animals, or the ownership of germplasm used as a parent in the development of germplasm. Briefly, the invention is directed to the production of transgenic organisms which contain an identifier nucleotide sequence within the organellar genome, including but not limited to plastids and mitochondria. The identifier nucleotide sequence is selected such that it is capable of indicating the source of the germplasm. The organellar genome of germplasm to be tested is isolated and analyzed for the presence of the identifier nucleotide sequence. The presence of the identifier nucleotide sequence establishes the source of the germplasm.
The publications and other materials used herein to illuminate the background of the invention and, in particular cases, to provide additional details respecting the practice, are incorporated by reference, and for convenience are referenced in the following text by author and date and are listed alphabetically by author in the appended bibliography.
The classification of organisms has traditionally been done along more or less arbitrary and somewhat artificial lines. For example, the living world has been divided into two kingdoms, namely plants and animals. This classification works well for generally familiar organisms, but becomes more difficult for organisms such as unicellular ones. Where classification of organisms becomes more than a scientific exercise is in the identification of plants and animals for hybridization and breeding programs, and in the accurate and reliable identification of microorganisms which may infect plants and animals. For example, a plant breeder may wish to have a quick and reliable means of identifying different species and strains for use in a breeding program. In addition, the owner of a particular plant variety may wish to have a quick and reliable means of identifying that variety for establishing derivation or ownership of germplasm. Thus, the correct identification of species or varieties of organisms is of particular importance.
Biological samples have been collected and analyzed for a variety of reasons relating to agricultural forensics, e.g., to identify individuals for population genetics or variety derivation. The known methods of analyzing biological samples from plants include using the proteins, specific enzymes or the nucleic acids to identify individually specific patterns within a species. For example, protein from a plant can be isolated and subjected to a two-dimensional electrophoretic analysis to produce a protein fingerprint (Anderson et al., 1985; Choi et al., 1984). Alternatively, isozymes, i.e., specific enzymes which exist in several forms in plants, can be isolated and subjected to electrophoresis to produce an isozyme fingerprint (Nielsen, 1985; Shields et al., 1983). Finally, the nucleic acids can be isolated and analyzed to produce a DNA fingerprint (U.S. Pat. No. 5,674,687; U.S. Pat. No. 4,963,663; Helentjaris et al., 1985; Landry et al., 1985; Jeffreys et al., 1985). DNA fingerprinting has utilized restriction fragment length polymorphisms and the use of minisatellite and microsatellite probes and/or amplification primers to obtain highly specific patterns which are useful to identify individuals within a known species or in pedigree studies to determine lineage. DNA fingerprinting is generally useful for identifying individuals because it uses patterns that are individually specific and complex. Similarly, a two-dimensional protein fingerprint could be used to identify individuals on the basis of complex patterns. An isozyme fingerprint is less complex but also has less resolving power.
Although each of these methods, or a combination of them, can be used to identify individuals, they have several disadvantages: (a) they are time-consuming; (b) they are complex, sometimes with many steps; (c) they require skilled and knowledgeable technicians; (d) they are not standardized, with results varying from lab to lab and test to test; and (e) they yield inaccurate measurements, often with subjective interpretation. Thus, there is a definite need for a method to identify the source or parentage of germplasm which is easy to perform, is relatively fast, does not require significant expertise and requires no subjective interpretation. The present invention meets this need as illustrated herein.
The present invention is directed to germplasm containing an identifier nucleotide sequence and to a method for identifying germplasm, more particularly the source or parentage of germplasm. The present invention is especially useful for determining the ownership of germplasm, including plants and animals, or the ownership of germplasm used as a parent in the development of germplasm. Briefly, the invention is directed to the production of transgenic organisms which contain an identifier nucleotide sequence within the organellar genome. The identifier nucleotide sequence is selected such that it is capable of indicating the source of the germplasm. The organellar genome of germplasm to be tested is isolated and analyzed for the presence of the identifier nucleotide sequence. The presence of the identifier nucleotide sequence establishes the source or parentage of the tested germplasm.
The present invention avoids the problems of prior art fingerprinting techniques for the identification of the source of germplasm. The present invention is simpler and easier to use than prior art fingerprinting techniques and requires no subjective interpretation of the results of the method.
In order to more fully understand the present invention, the following definitions are provided.
“Amplification of Polynucleotides” utilizes methods such as the polymerase chain reaction (PCR), ligation amplification (or ligase chain reaction, LCR) and amplification methods based on the use of Q-beta replicase. Also useful are strand displacement amplification (SDA), thermophilic SDA, nucleic acid sequence based amplification (3SR or NASBA) and repair chain reaction (RCR). These methods are well known and widely practiced in the art. See, e.g., U.S. Pat. Nos. 4,683,195 and 4,683,202 and Innis et al., 1990 (for PCR); Wu et al., 1989 and EP 320,308A (for LCR); U.S. Pat. Nos. 5,270,184 and 5,455,166 and Walker et al., 1992 (for SDA); Spargo et al., 1996 (for thermophilic SDA); and U.S. Pat. No. 5,409,818, Fahy et al., 1991 and Compton, 1991 (for 3SR and NASBA). Reagents and hardware for conducting PCR are commercially available. Primers useful to amplify the identifier nucleotide sequence are preferably complementary to, and hybridize specifically to, the identifier nucleotide sequence or regions that flank a target region therein.
“Germplasm” refers to a species or variety of a plant or an animal.
“Identifier nucleotide sequence” refers to a unique nucleic acid which is absent either from the organellar genome(s) or from both the nuclear and organellar genomes. A nucleotide sequence can be determined to be unique by, for example, using it as a probe in a hybridization assay of the organellar DNA or the total DNA of the germplasm of interest. The identifier nucleotide sequence is capable of identifying the source of germplasm.
“Primer” refers to a sequence comprising about 8 or more nucleotides, preferably about 15 or more nucleotides, which forms a stable hybrid with its complementary sequence and which can be used to amplify the identifier nucleotide sequence. Primers for an identifier nucleotide sequence may be derived from the sequence of the ends of the identifier nucleotide sequence or the complement thereof, or may be derived from the sequence of organellar nucleic acid (or its complement) adjacent to the point of insertion of the identifier nucleotide sequence. If the target sequence contains a sequence identical to that of the primer, the primers may be short, e.g., in the range of about 8-30 base pairs, since the hybrid will be relatively stable under even highly stringent conditions. Of course longer primers, up to 100 base pairs or more, may also be employed which hybridize to the target sequence with the requisite specificity.
“Probe” refers to a polynucleotide which forms a stable hybrid with the target sequence under highly stringent to moderately stringent hybridization and wash conditions. It is preferred that the probes be perfectly complementary to the target sequence, such that high stringency conditions can be used. Hybridization stringency may be lessened if some mismatching is expected, for example, if a probe which is not completely complementary is utilized. Conditions are chosen which rule out nonspecific/adventitious binding, that is, which minimize noise. (It should be noted that throughout this disclosure, if it is simply stated that “stringent” conditions are used, that is meant to be read as “high stringency” conditions are used.)
Probes for an identifier nucleotide sequence may be derived from the sequence of the identifier nucleotide sequence or the complement thereof. The probes may be of any suitable length, which span all or a portion of the identifier nucleotide sequence, and which allow specific hybridization to the region. If the target sequence contains a sequence identical to that of the probe, the probes may be short, e.g., in the range of about 8-30 base pairs, since the hybrid will be relatively stable under even highly stringent conditions. Of course longer probes, up to 100 base pairs or more, may also be employed which hybridize to the target sequence with the requisite specificity.
The probes will include an isolated polynucleotide attached to a label or reporter molecule and may be used to isolate other polynucleotide sequences having sequence similarity by standard methods. For techniques for preparing and labeling probes see, e.g., Sambrook et al., 1989 or Ausubel et al., 1992. Other similar polynucleotides may be selected by using homologous polynucleotides.
Probes comprising synthetic oligonucleotides or other polynucleotides of the present invention may be derived from naturally occurring or recombinant single- or double-stranded polynucleotides, or be chemically synthesized. Probes may also be labeled by nick translation, Klenow fill-in reaction, or other methods known in the art.
“Progeny” refers to any progeny of a particular variety of germplasm and is intended to include progeny of the same variety as well as progeny resulting from the use of the particular variety in a breeding program.
“Recombinant nucleic acid” is a nucleic acid which is not naturally occurring, or which is made by the artificial combination of two otherwise separated segments of sequence. This artificial combination is often accomplished by either chemical synthesis means, or by the artificial manipulation of isolated segments of nucleic acids, e.g., by genetic engineering techniques. Such is usually done to replace a codon with a redundant codon encoding the same or a conservative amino acid, while typically introducing or removing a sequence recognition site. Alternatively, it is performed to join together nucleic acid segments of desired functions to generate a desired combination of functions.
“Substantially complementary to” refers to probe or primer sequences which hybridize to the sequences provided under stringent hybridization conditions and/or to sequences having sufficient homology with an identifier nucleotide sequence such that the specific probes or primers hybridize to the sequences to which they are complementary.
“Target region” refers to a region of the nucleic acid which is amplified and/or detected. The term “target sequence” refers to a sequence with which a probe or primer will form a stable hybrid under desired conditions.
“Transgenic organelle” refers to an organelle which contains an identifier nucleotide sequence, preferably incorporated into its genome.
The practice of the present invention employs, unless otherwise indicated, conventional techniques of chemistry, molecular biology, microbiology, recombinant DNA, genetics, and immunology. See, e.g., Maniatis et al., 1982; Sambrook et al., 1989; Ausubel et al., 1992; Glover, 1985; Anand, 1992; Guthrie and Fink, 1991; Gelvin et al., 1990; Grierson et al., 1984.
In accordance with the present invention an identifier nucleotide sequence is selected which can serve to identify the source of germplasm. Using an identifier nucleotide sequence which is not homologous with DNA of the germplasm, e.g., Zea mays L., permits the detection of the presence of the identifier nucleotide sequence simply and accurately. In one embodiment, the identifier nucleotide sequence is not homologous with the germplasm's organellar DNA. For example, the identifier nucleotide sequence is chosen such that it does not occur naturally in the chloroplast genome. The nucleotide sequence of the chloroplast genome for several plant species is known. See for example, Hiratsuka et al. (1989), Ohyama et al. (1986), Shinozaki et al. (1986), Maier et al. (1995), Wakasugi et al. (1997) and Hallick et al. (1993). The nucleotide sequence of the mitochondrial genome of several species is known. See for example, Janke et al. (1997), Lopez et al. (1996), Unseld et al. (1997), Roe et al. (1985), Gadaleta et al. (1989), Zhou et al. (1995) and Wolff et al. (1994). Additional nucleotide sequences for organelles can be found in the “Organelle Genome Database” (Korab-Laskowska et al., 1998). In a more preferred embodiment, the identifier nucleotide sequence is not homologous with any of the germplasm's DNA, both nuclear and organellar. Thus, in the more preferred embodiment, the identifier nucleotide sequence can be derived from a different organism, e.g., a chicken nucleotide sequence could be selected as the identifier nucleotide sequence for plants. This identifier nucleotide sequence could include the complete or partial sequence (genomic or cDNA) for a gene or could include only the promoter element. Alternatively, the identifier nucleotide sequence can be a synthetic sequence which is designed such that it is not present in the DNA of the germplasm.
For example, a synthetic sequence can be prepared which contains codons spelling out the name of the company which owns the germplasm in question. Any nucleotide sequence selected for use in the present invention can be determined to be unique by conventional techniques. For example, the nucleotide sequence, its complement or a part thereof can be used as a probe in a hybridization assay of the organellar DNA or the total DNA of the germplasm of interest. Alternatively, primers for the nucleotide sequence can be used to determine if any nucleic acid is amplified in amplification reactions. Preferably, the identifier nucleotide sequence is selected so that it is not capable of being expressed in the germplasm.
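The idea of a synthetic sequence whose codons spell out an owner's name, and of screening it for uniqueness, can be sketched in Python as follows. This is only an illustration under stated assumptions: the codon chosen for each letter, the function names, and the example name "ACME" are hypothetical and do not come from the specification, and the exact-match search on both strands is a simplified in silico stand-in for the hybridization or amplification assays described above. Only the 20 letters that serve as one-letter codes for the standard amino acids can be spelled this way.

```python
# Hypothetical letter-to-codon table: one standard-genetic-code codon per
# one-letter amino acid code (the choice among synonymous codons is arbitrary).
CODON_FOR = {
    "A": "GCT", "C": "TGT", "D": "GAT", "E": "GAA", "F": "TTT",
    "G": "GGT", "H": "CAT", "I": "ATT", "K": "AAA", "L": "CTG",
    "M": "ATG", "N": "AAT", "P": "CCT", "Q": "CAA", "R": "CGT",
    "S": "TCT", "T": "ACT", "V": "GTT", "W": "TGG", "Y": "TAT",
}
LETTER_FOR = {codon: letter for letter, codon in CODON_FOR.items()}


def encode_name(name):
    """Return a DNA string whose codons spell the letters of `name`."""
    letters = [ch for ch in name.upper() if ch.isalpha()]
    unsupported = sorted({ch for ch in letters if ch not in CODON_FOR})
    if unsupported:
        raise ValueError(f"no amino acid letter for: {', '.join(unsupported)}")
    return "".join(CODON_FOR[ch] for ch in letters)


def decode_name(sequence):
    """Read a sequence produced by encode_name back into letters."""
    if len(sequence) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    return "".join(LETTER_FOR[sequence[i:i + 3]]
                   for i in range(0, len(sequence), 3))


def reverse_complement(sequence):
    """Return the reverse complement of an uppercase A/C/G/T string."""
    return sequence.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def occurs_in(genome, candidate):
    """True if `candidate` matches either strand of `genome` exactly."""
    return candidate in genome or reverse_complement(candidate) in genome


identifier = encode_name("ACME")
print(identifier)               # GCTTGTATGGAA
print(decode_name(identifier))  # ACME
```

A candidate for which `occurs_in` returns True against the organellar or total DNA sequence of the germplasm would not be unique and would be rejected. An exact-match search is stricter than hybridization, which can also detect partially matching sequences, so a laboratory confirmation of uniqueness would still be needed.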
Although naked DNA can be used to transform host cells, it is preferred to use vectors which contain the identifier nucleotide sequence. The vectors may also contain selectable marker sequences under control of appropriate host recognizable promoters for use in selecting transformed cells. Suitable markers and promoters are well known in the art. Alternatively, the presence of the integrated identifier nucleotide sequence can be confirmed by assaying for the presence of the identifier nucleotide sequence in transformed tissue. In a preferred embodiment, the vector for transferring the identifier nucleotide sequence into the desired organelles, or the identifier nucleotide sequence itself if it is to be used directly, also contains means for inserting the identifier nucleotide sequence and selectable marker into the organelle. The insertion means preferably comprises regions of homology to the target organellar genome which flank the identifier nucleotide sequence and/or selectable marker sequence. Since it is preferred that the identifier nucleotide sequence not be expressed, the identifier nucleotide sequence does not need to be placed under the control of a promoter.
The identifier nucleotide sequence can be introduced into the host cell by well-known methods, e.g., by injection (see Kubo et al., 1988), or vectors containing the identifier nucleotide sequence can be introduced directly into host cells by methods well known in the art, which vary depending on the type of cellular host, including electroporation; transfection employing calcium chloride, rubidium chloride, calcium phosphate, DEAE-dextran, or other substances; microprojectile bombardment; lipofection; infection (where the vector is an infectious agent, such as a retroviral genome or Agrobacterium tumefaciens); and other methods. See generally, Sambrook et al., 1989; Ausubel et al., 1992 and Gelvin et al., 1990. The introduction of the polynucleotides into the host cell or organelles of a host cell by any method known in the art, including, inter alia, those described herein will be referred to herein as “transformation.” The cells into which the nucleic acids described above have been introduced are meant to also include the progeny of such cells. Thus, the present invention contemplates plant or animal organelles containing the identifier nucleotide sequence as well as plants, plant seeds, plant cells, animals or animal cells containing organelles with the identifier nucleotide sequence.
In accordance with the present invention, host cells are transformed such that the identifier nucleotide sequence is integrated into the genome of the host cell's organelles. The transformation of plant plastids with the identifier nucleotide sequence can be carried out as described in U.S. Pat. Nos. 5,451,513; 5,545,817; 5,545,818; and 5,576,198; Svab et al., 1990; Svab and Maliga, 1993; Daniell et al., 1990; and Staub et al., 1992. The transformation of mitochondria can be carried out as described in Johnston et al. (1988) and Fox et al. (1988). It is preferred to use microprojectile bombardment, such as described in these references, for the transformation of plant or animal organelles. Generally, plant tissue is bombarded with particles coated with either the identifier nucleotide sequence or a vector containing the identifier nucleotide sequence. The bombarded tissue is cultured on a cell-division-promoting medium, after which transformed tissue is selected. If a selectable marker is used, transformed tissue is selected by culturing plant tissue on selective media. If a selectable marker is not used, transformed tissue is selected by assaying for the presence of the identifier nucleotide sequence. Plants are then obtained from the transformed tissue. This general procedure is adapted according to the general transformation and regeneration methods which have been developed for a given plant species.
General transformation and regeneration methods are well known in the art for most plant species. Plants which have been transformed and regenerated include the following: corn (U.S. Pat. Nos. 5,484,956; 5,489,520; 5,177,010; 5,641,664; and 5,350,689); soybeans (U.S. Pat. Nos. 5,015,580; 5,416,011; 5,563,055; and 5,024,944); alfalfa (U.S. Pat. Nos. 5,324,646 and 5,736,627; McKersie et al., 1993); sorghum (U.S. Pat. No. 5,723,764); wheat (U.S. Pat. Nos. 5,610,042; 5,631,152; and 5,589,617); rice (U.S. Pat. Nos. 5,679,558 and 5,736,369); oats and barley (U.S. Pat. Nos. 5,723,764 and 5,187,073); rye, millet and triticale (U.S. Pat. No. 5,723,764); orchardgrass (Denchev et al., 1997); tall fescue (Dalton et al., 1995); monocot temperate grasses (Horikawa et al., 1993); red clover (Poerba et al., 1997); white clover (Ealing et al., 1994); subterranean clover (Khan et al., 1996); cotton (U.S. Pat. Nos. 5,004,863; 5,159,135; and 5,597,718); rapeseed (U.S. Pat. Nos. 5,530,192 and 5,188,958); potato (U.S. Pat. Nos. 5,510,523 and 5,589,612); and tomato (U.S. Pat. Nos. 5,569,831 and 5,569,829).
General methods for producing transgenic animals are well known in the art. See for example, U.S. Pat. Nos. 5,476,995; 5,175,385; 5,573,933; 5,489,742; 5,741,957 and 5,589,604.
The cells of the germplasm, plant or animal, produced in accordance with the present invention contain organelles which contain an identifier nucleotide sequence, preferably integrated into the organellar genome. As is well known in the art, organelles are only transferred to future generations through the female parent and are not transferred to progeny by pollen or sperm. Thus, organelles containing an identifier nucleotide sequence in accordance with the present invention will only be passed on to offspring by the female parent. The germplasm of the present invention can be used in any conventional manner, for example to produce progeny of the same variety or to produce new varieties in a breeding program. Since the transgenic organelle is transferred by the female parent, it is possible to identify progeny, whether of the same variety or of a new variety, which have been produced using transgenic germplasm containing an identifier nucleotide sequence. Thus, a further embodiment of the present invention is a method to determine the source of new germplasm, i.e., germplasm suspected of being or having been derived from the transgenic germplasm. Consequently, the source or parentage of the tested germplasm with respect to germplasm containing the identifier nucleotide sequence can be established by the present invention.
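The consequence of transmission through the female parent only can be illustrated with a small pedigree model in Python. This is a sketch under the strict maternal-inheritance assumption stated above; the pedigree layout, the function names, and the line names ("TaggedLine", "PlainLine" and the crosses) are hypothetical and serve only to show which progeny would be expected to carry the identifier nucleotide sequence.

```python
def maternal_founder(individual, pedigree):
    """Follow mothers back through `pedigree` to the founding line.

    `pedigree` maps each bred individual to a (mother, father) pair;
    any name that is not a key is treated as a founder.
    """
    seen = set()
    current = individual
    while current in pedigree:
        if current in seen:
            raise ValueError("pedigree contains a cycle")
        seen.add(current)
        mother, _father = pedigree[current]
        current = mother  # organelles come from the female parent only
    return current


def carries_identifier(individual, pedigree, tagged_founders):
    """True if the maternal line of `individual` starts at a tagged founder."""
    return maternal_founder(individual, pedigree) in tagged_founders


# Hypothetical breeding program: each entry is (female parent, male parent).
pedigree = {
    "CrossA": ("TaggedLine", "PlainLine"),  # tagged line used as female
    "CrossB": ("PlainLine", "TaggedLine"),  # tagged line used as pollen parent
    "CrossC": ("CrossA", "CrossB"),
}
tagged = {"TaggedLine"}

for name in ("CrossA", "CrossB", "CrossC"):
    print(name, carries_identifier(name, pedigree, tagged))
```

In this model CrossA and CrossC trace back to the tagged line through their mothers and would test positive, while CrossB, in which the tagged line contributed only pollen, would not.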
In order to detect the presence of the identifier nucleotide sequence in germplasm to be tested, e.g. germplasm suspected of being the same variety as the transgenic germplasm or having been derived from the transgenic germplasm, a sample from the suspected germplasm is prepared and analyzed for the presence or absence of the identifier nucleotide sequence. Generally, the sample is prepared so that the total nucleic acid or the nucleic acid of the organelles can be analyzed.
The detection of the identifier nucleotide sequence in the germplasm to be tested can be accomplished by any technique known in the art. Such techniques include, but are not limited to, the use of probes for the identifier nucleotide sequence or the amplification of the identifier nucleotide sequence. Thus, the present invention contemplates the use of both PCR and non-PCR based screening strategies to detect target sequences with a high level of sensitivity. Such screening methods include two-step label amplification methodologies that are well known in the art.
When a probe (an oligonucleotide or an analog, such as one in which a methyl phosphonate backbone replaces the normal phosphodiester) is used to detect the presence of the target sequences, the biological sample to be analyzed may be treated, if desired, to extract the nucleic acids. If the identifier nucleotide sequence is homologous to nuclear DNA, the organelles are first isolated before the organellar nucleic acid is extracted. If the identifier nucleotide sequence is not homologous to any of the nuclear or organellar DNA, the nucleic acid can be extracted without isolation of the organelles, although their isolation prior to extraction is preferred. The sample nucleic acid may be prepared in various ways to facilitate detection of the target sequence, e.g., denaturation, restriction digestion, electrophoresis or dot blotting. The targeted region of the analyte nucleic acid usually must be at least partially single-stranded to form hybrids with the targeting sequence of the probe. If the sequence is naturally single-stranded, denaturation will not be required. However, if the sequence is double-stranded, the sequence will probably need to be denatured. Denaturation can be carried out by various techniques known in the art.
Analyte nucleic acid and probe are incubated under conditions which promote stable hybrid formation of the target sequence in the probe with the putative targeted sequence in the analyte. The region of the probes which is used to bind to the analyte can be made completely complementary to the identifier nucleotide sequence. Therefore, high stringency conditions are desirable in order to prevent false positives. The stringency of hybridization is determined by a number of factors during hybridization and during the washing procedure, including temperature, ionic strength, base composition, probe length, and concentration of formamide. These factors are outlined in, for example, Maniatis et al., 1982 and Sambrook et al., 1989. Under certain circumstances, the formation of higher order hybrids, such as triplexes, quadruplexes, etc., may be desired to provide the means of detecting target sequences.
Detection, if any, of the resulting hybrid is usually accomplished by the use of labeled probes. Alternatively, the probe may be unlabeled, but may be detectable by specific binding with a ligand which is labeled, either directly or indirectly. Suitable labels, and methods for labeling probes and ligands are known in the art, and include, for example, radioactive labels which may be incorporated by known methods (e.g., nick translation, random priming or kinasing), biotin, fluorescent groups, chemiluminescent groups (e.g., dioxetanes, particularly triggered dioxetanes), enzymes, antibodies, gold nanoparticles and the like. Variations of this basic scheme are known in the art, and include those variations that facilitate separation of the hybrids to be detected from extraneous materials and/or that amplify the signal from the labeled moiety. A number of these variations are reviewed in, e.g., Matthews and Kricka, 1988; Landegren et al., 1988; U.S. Pat. No. 4,868,105; and in EP 225,807A. For example, a probe may have an enzyme covalently linked to the probe, such that the covalent linkage does not interfere with the specificity of the hybridization. This enzyme-probe-conjugate-target nucleic acid complex can then be isolated away from the free probe-enzyme conjugate and a substrate is added for enzyme detection. Enzymatic activity is observed as a change in color development or luminescent output resulting in a 10^3- to 10^6-fold increase in sensitivity. For an example relating to the preparation of oligodeoxynucleotide-alkaline phosphatase conjugates and their use as hybridization probes, see Jablonski et al., 1986.
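As a rough illustration of how base composition and probe length bear on hybrid stability, the following Python sketch applies a widely used rule of thumb for short oligonucleotides, often called the Wallace rule, which estimates the melting temperature as 2 °C per A or T plus 4 °C per G or C. The rule is not taken from the present specification, it is only a coarse approximation for short perfectly matched probes, and it ignores ionic strength and formamide concentration; the function name and the example sequence are hypothetical.

```python
def wallace_tm(probe):
    """Estimate the melting temperature (deg C) of a short DNA probe.

    Uses the rule of thumb Tm = 2 * (A + T) + 4 * (G + C), which is only
    a coarse guide for short, perfectly matched oligonucleotides.
    """
    probe = probe.upper()
    if not probe or set(probe) - set("ACGT"):
        raise ValueError("probe must be a non-empty A/C/G/T string")
    at = probe.count("A") + probe.count("T")
    gc = probe.count("G") + probe.count("C")
    return 2 * at + 4 * gc


# Hypothetical 12-nucleotide probe: 7 A/T and 5 G/C bases.
print(wallace_tm("GCTTGTATGGAA"))  # 34
```

Under this approximation a longer or more G/C-rich probe gives a higher estimate, which is consistent with the statement above that both probe length and base composition influence the stringency that can be applied.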
Alternatively, the screening method can involve amplification of the relevant identifier nucleotide sequence. The most popular method used today is target amplification. Here, the target nucleic acid sequence is amplified with polymerases. One particularly preferred method using polymerase-driven amplification is the polymerase chain reaction (PCR). The polymerase chain reaction and other polymerase-driven amplification assays (as described above) can achieve over a million-fold increase in copy number through the use of polymerase-driven amplification cycles. The DNA isolated from the suspected germplasm is amplified using conventional techniques, which may include automation. The primers for DNA amplification are selected so that only the identifier nucleotide sequence, if present, is amplified. Suitable primers may include unique sequences derived from the 5′ and 3′ ends of the identifier nucleotide sequence or from unique sequences 5′ of and 3′ of the identifier nucleotide sequence from the construct used to introduce the identifier nucleotide sequence into the organelles. In addition, combinations of these primers can be used. Once amplified, the resulting nucleic acid can be detected using conventional techniques, including the use of probes, such as described above. Alternatively, the presence of the identifier nucleotide sequence can be determined by identifying the presence of amplified nucleic acid.
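The primer selection described above, in which primers are taken from the two ends of the identifier nucleotide sequence so that a product forms only when that sequence is present, can be sketched in Python as a simplified in silico amplification. This is an illustration only: the function names, the 8-nucleotide primer length, and the example sequences are hypothetical; matching is exact, and only one orientation of the template is searched, whereas a real amplification depends on annealing conditions and tolerates some mismatch.

```python
def reverse_complement(sequence):
    """Return the reverse complement of an uppercase A/C/G/T string."""
    return sequence.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def design_end_primers(identifier, length=8):
    """Derive a primer pair from the two ends of the identifier sequence."""
    if len(identifier) < 2 * length:
        raise ValueError("identifier is too short for two primers")
    forward = identifier[:length]                       # matches the 5' end
    reverse = reverse_complement(identifier[-length:])  # anneals at the 3' end
    return forward, reverse


def in_silico_pcr(template, forward, reverse):
    """Return the predicted product, or None if either primer has no site."""
    start = template.find(forward)
    if start < 0:
        return None
    reverse_site = reverse_complement(reverse)
    end = template.find(reverse_site, start + len(forward))
    if end < 0:
        return None
    return template[start:end + len(reverse_site)]


# Hypothetical identifier and organellar templates, for illustration only.
identifier = "GCTTGTATGGAACCTCGTGAT"
forward, reverse = design_end_primers(identifier)
tagged_template = "AAAA" + identifier + "TTTT"
untagged_template = "AAAATTTTCCCCGGGG"

print(in_silico_pcr(tagged_template, forward, reverse))    # the identifier
print(in_silico_pcr(untagged_template, forward, reverse))  # None
```

A product is predicted only for the template that contains the identifier, which mirrors the presence or absence readout described above. Primers taken from the flanking construct or organellar sequence could be modeled the same way by passing different primer strings.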
Two-step label amplification methodologies are known in the art. These assays work on the principle that a small ligand (such as digoxigenin, biotin, or the like) is attached to a nucleic acid probe capable of specifically binding to the identifier nucleotide sequence.
In one example, the small ligand attached to the nucleic acid probe is specifically recognized by an antibody-enzyme conjugate. In one embodiment of this example, digoxigenin is attached to the nucleic acid probe. Hybridization is detected by an antibody-alkaline phosphatase conjugate which turns over a chemiluminescent substrate. For methods for labeling nucleic acid probes according to this embodiment see Martin et al., 1990. In a second example, the small ligand is recognized by a second ligand-enzyme conjugate that is capable of specifically complexing to the first ligand. A well known embodiment of this example is the biotin-avidin type of interactions. For methods for labeling nucleic acid probes and their use in biotin-avidin based assays see Rigby et al., 1977 and Nguyen et al., 1992.