1. Field of the Invention
The present invention relates to a method of designing primer and probe sets for identification of a target sequence, primer and probe sets designed by the method, a kit comprising the sets, a computer readable medium having recorded thereon a program to execute the method, and a method of identifying a target sequence using the sets.
2. Description of the Related Art
A microarray is a substrate on which polynucleotides are immobilized at fixed locations. Such microarrays are well known in the art, and examples can be found in, for example, U.S. Pat. Nos. 5,445,934 and 5,744,305. It is also known that microarrays are generally manufactured using photolithography. When photolithography is used, a polynucleotide microarray can be manufactured by repeatedly exposing a discrete known region of a substrate, which is coated with a monomer protected by a removable group, to an energy source to remove the protecting group, and then coupling the deprotected monomer with another monomer protected by the removable group. In this case, the polynucleotide immobilized on the microarray is synthesized by extending the polynucleotide one monomer at a time. Alternatively, when a spotting method is used, a microarray is formed by immobilizing previously synthesized polynucleotides at fixed locations. Such methods of manufacturing a microarray are disclosed in, for example, U.S. Pat. Nos. 5,744,305, 5,143,854, and 5,424,186. These documents relating to microarrays and methods of manufacturing the same are incorporated herein in their entirety by reference.
A typical method of searching for or identifying the genotype of a target sequence using a microarray includes amplifying the target sequence using a certain primer, applying the amplified target sequence to the microarray, hybridizing the amplified target sequence with a polynucleotide (also called “a probe”, “a probe nucleic acid”, or “a probe polynucleotide”) which is immobilized on the microarray, washing the microarray to remove non-specifically bound material, and detecting a fluorescent signal resulting from the formation of a target sequence-probe hybrid. In this method of identifying a target sequence, the results obtained vary according to how the primer and the probe are designed. Thus, a method of efficiently designing a primer and a probe for identification of a target sequence is urgently required.
The conventional method of designing a primer and a probe is based on multiple sequence alignment. Multiple sequence alignment aligns three or more sequences, which may include mutations such as substitutions, additions, or deletions, such that the number of identically aligned bases is greatest. FIG. 2 is a schematic diagram of conventional multiple sequence alignment. Referring to FIG. 2, after the target sequences are aligned such that the number of identically aligned bases is at a maximum, common regions (a/c and e/g) are selected as primers and unique regions (b, d, h, f, i) are selected as probes to design primer and probe sets.
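The selection step described above can be illustrated with a minimal Python sketch. It assumes the target sequences have already been aligned to equal length (with `-` marking gaps); the function names `conserved_runs` and `unique_windows`, the toy sequences, and the length thresholds are hypothetical and are not taken from FIG. 2. Uniqueness is approximated here by exact substring matching, which is a simplification of a real cross-hybridization check.

```python
def conserved_runs(aligned, min_len):
    """Return (start, end) column ranges in which every aligned sequence
    carries the same base; these are candidate primer (common) regions."""
    length = len(aligned[0])
    runs, start = [], None
    for col in range(length + 1):
        same = (
            col < length
            and aligned[0][col] != "-"
            and all(seq[col] == aligned[0][col] for seq in aligned)
        )
        if same and start is None:
            start = col                      # a conserved run begins
        elif not same and start is not None:
            if col - start >= min_len:       # keep only sufficiently long runs
                runs.append((start, col))
            start = None
    return runs


def unique_windows(aligned, width):
    """For each sequence index, list the start columns of gap-free windows
    that occur in no other sequence; these are candidate probe regions."""
    length = len(aligned[0])
    ungapped = [seq.replace("-", "") for seq in aligned]
    result = {}
    for i, seq in enumerate(aligned):
        hits = []
        for col in range(length - width + 1):
            window = seq[col:col + width]
            if "-" in window:
                continue
            if all(window not in other
                   for j, other in enumerate(ungapped) if j != i):
                hits.append(col)
        result[i] = hits
    return result


# Hypothetical, already-aligned toy targets: shared flanks, variable middle.
aligned = [
    "ACGTACGTTTGACCAGGT",
    "ACGTACGTATGACCAGGT",
    "ACGTACGTCAGACCAGGT",
]
primer_regions = conserved_runs(aligned, min_len=6)
probe_regions = unique_windows(aligned, width=4)
```

In this toy input the two flanks are reported as common regions, and only windows overlapping the variable middle columns are reported as unique. If the targets were identical over every window, `unique_windows` would return empty lists, which is the situation in which no specific probe can be selected.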
In the conventional method of designing primer and probe sets, only a specific sequence which hybridizes with its own target sequence but does not cross-hybridize with other sequences is selected as a probe. Consequently, when the sequence homology between target sequences is high or the number of target sequences to be identified is large, a specific probe cannot be designed. Moreover, when a universal primer cannot be designed, an optimum primer for a subgroup cannot be directly proposed and a separate design method is required. Even when a primer and a probe can be designed, additional information is required for the design and the processing rate is reduced.
For example, to identify species of bacteria in a sample, a primer and a probe have been designed using a 16S rRNA site, which is one of the consensus sequences, on the basis of multiple sequence alignment. That is, after the 16S rRNA sites are aligned by multiple sequence alignment, sequences common to the species are used as primers and sequences unique to each species are used as probes. Such a method can be used to identify several species of bacteria, but it is limited in its ability to identify many species of bacteria since the 16S rRNA site is highly conserved.
The inventors of the present invention found that primer and probe sets capable of rapidly and accurately identifying a number of target sequences can be readily designed by repeating an operation of preparing subsequence sets of target sequences and an operation of selecting two subsequence sets in which the number of subsequences having a high homology is greatest, and thus completed the present invention.
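One round of the two operations named above might be sketched in Python as follows. This is only an illustrative sketch under stated assumptions: subsequences are taken to be all overlapping substrings of a fixed length `k`, "high homology" is simplified to exact identity, and the names `subsequence_set` and `best_pair` as well as the toy target sequences are hypothetical. How the operations are repeated is not specified in this section and is therefore not shown.

```python
from itertools import combinations


def subsequence_set(seq, k):
    """Return the set of all overlapping length-k subsequences of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def best_pair(targets, k):
    """Select the two subsequence sets sharing the most subsequences.

    Assumption: 'high homology' is approximated by exact identity, so the
    score of a pair is the size of the intersection of its two sets.
    """
    sets = {name: subsequence_set(seq, k) for name, seq in targets.items()}
    pair = max(
        combinations(sorted(sets), 2),
        key=lambda p: len(sets[p[0]] & sets[p[1]]),
    )
    return pair, sets[pair[0]] & sets[pair[1]]


# Hypothetical toy targets: A and B are closely related, C is not.
targets = {
    "A": "ACGTACGTTT",
    "B": "ACGTACGTTA",
    "C": "GGGCCCAAAT",
}
pair, shared = best_pair(targets, k=4)
```

For these toy targets the selected pair is A and B, which share five 4-base subsequences, while C shares none with either.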