Recent progress in genome base sequence analysis technologies has rapidly improved analysis speed and reduced analysis cost, so that analysis of the genome structures of various organisms is proceeding at a dramatic pace. Current methods for inferring the function of a gene from its deciphered base sequence depend on whether a similar sequence is found by searching DNA or protein sequence data registered in an international database, such as GenBank or DDBJ, using, for example, the PSI-BLAST algorithm (Altschul et al., Nucleic Acids Res. 25:3389-3402 (1997)) or the like. In these methods, the function of a sequence is estimated from its similarity to the sequence of a known nucleic acid or protein. In order to estimate the amino acid sequence of a protein encoded by a base sequence, various gene modeling programs for predicting the positions of exons and introns from the genome base sequence have been developed. Attempts to fully automate this process are under way; however, accurate gene modeling still essentially requires manual editing of the prediction results, and accuracy problems remain unsolved.
Some databases and programs for predicting gene expression sites based on a genome base sequence have been developed (Ghosh, Nucleic Acids Res., 21:3117-3118 (1993); Ghosh, Nucleic Acids Res., 26:360-361 (1998); Heinemeyer et al., Nucleic Acids Res., 26:362-367 (1998)), though accuracy problems remain unsolved.
For plants, a database has been constructed that compiles about 400 cis-element motifs, to which transcription regulatory elements bind, found in the expression control regions of plant genes (Higo et al., Nucleic Acids Res. 26:358-359 (1998); Nucleic Acids Res. 27:297-300 (1999)). When analysis is carried out using a base sequence inferred to be a promoter as a query, each cis-element motif present in the base sequence is displayed. However, although these motifs may function as cis elements, there is no evidence that they actually do so. Therefore, there is a demand for the development of a method for inferring a gene expression site (expression tissue/expression organ) using a genome base sequence.
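The kind of motif lookup described above can be illustrated with a minimal Python sketch. It is not the implementation of the cited database: the motif table, the labels "TATA-like" and "G-box-like", and the function names are assumptions chosen for illustration only. The sketch expands IUPAC ambiguity codes into regular expressions and reports every occurrence on both strands of a query sequence.

```python
import re

# IUPAC nucleotide codes mapped to regular-expression character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}

# Illustrative motifs only; the labels are hypothetical and are not
# entries copied from any real cis-element database.
MOTIFS = {
    "TATA-like": "TATAWA",
    "G-box-like": "CACGTG",
}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def motif_to_regex(motif):
    """Compile a motif into a regex; the lookahead reports overlapping hits."""
    body = "".join(IUPAC[base] for base in motif.upper())
    return re.compile("(?=(" + body + "))")


def scan(sequence, motifs=MOTIFS):
    """List (name, strand, start, matched_text) for each motif occurrence.

    `start` is a 0-based coordinate on the forward strand for both strands.
    """
    seq = sequence.upper()
    hits = []
    for name, motif in motifs.items():
        pattern = motif_to_regex(motif)
        for strand, text in (("+", seq), ("-", reverse_complement(seq))):
            for match in pattern.finditer(text):
                start = match.start()
                if strand == "-":
                    # Convert the reverse-strand offset to forward coordinates.
                    start = len(seq) - start - len(motif)
                hits.append((name, strand, start, match.group(1)))
    return sorted(hits, key=lambda h: (h[2], h[0], h[1]))


if __name__ == "__main__":
    for hit in scan("GGCACGTGAATATAAAGC"):
        print(hit)
```

A hit produced by such a scan indicates only that the motif sequence is present in the query; as noted above, it does not show that the site actually functions as a cis element.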
Clarification of gene expression sites would help reveal the functions of individual genes and could make it possible to isolate and utilize the promoter portion of each gene. In the field of plants, the development of tissue-specific promoters would make it possible, using transformation technologies, to express a gene, or to inhibit gene expression, specifically in individual tissues. For example, if an anther-specific promoter were developed, the following applications would be expected.
It is known that an F1 hybrid (first filial generation) generated by crossing between varieties may have properties superior to those of its parents. Such inter-variety crossing has conventionally attracted attention as a method for breeding crops. For self-pollinating crops, such as rice, methods for producing a male-sterile strain have been studied as a technology required for utilizing this property. Conventionally, male-sterile strains have been searched for among plant genetic resources, or mutagenesis has been used to select a male-sterile strain. However, with these methods it is difficult to introduce a male sterility gene into a commercial variety, and their use is therefore limited.
A recent promising approach is a method of utilizing biotechnology to link a promoter, which is expressed in an anther and/or pollen, with a gene having a function to inhibit formation of an anther and/or pollen (e.g., a gene for a nuclease, protease, or glucanase) and to introduce the linked genes into a plant so as to prevent formation of fertile pollen. An alternative promising approach is a method of using a promoter, which is expressed in an anther and/or pollen, to transcribe antisense RNA for a gene which is expressed upon formation of an anther and/or pollen, or a method of introducing into a plant a ribozyme which decomposes mRNA for the gene.
There are several known promoters for genes which are expressed in an anther and/or pollen. However, unfortunately, the activities of the promoters are too low for practical use, or the expression time thereof is limited. It would be very useful to isolate a promoter which functions at each developmental stage of an anther or pollen, clarify features of each promoter, and produce a promoter cassette having a high activity so as to artificially control formation of an anther and/or pollen.
Therefore, for example, if a promoter which has a high activity, is practically usable, and is directed to a desired site (e.g., an anther or pollen) could be obtained from a gene of rice, such a promoter would contribute greatly to the breeding of crops, such as rice. Further, in order to modify a component of each tissue of a flower, such as a petal pigment or a protein involved in adhesion of pollen to a pistil, it is necessary to obtain a gene which is expressed in a flower.
To this end, what is required is a method for efficiently searching a DNA database, in which a vast number of genome base sequences are stored, for a gene which is expressed in a flower, or a method for efficiently screening a genomic DNA library for a gene which is expressed in a desired site (e.g., a flower).