A widely used method of decoding a DNA base sequence is DNA sequencing that combines four technologies: fragmentation of a nucleic acid by the Sanger method, fluorescence labeling of the nucleic acid fragments, high-resolution electrophoresis, and sensitive fluorescence detection.
In this DNA sequencing, the DNA whose base sequence is to be decoded (template DNA) is first prepared, and a replication reaction of the template DNA is carried out using a primer having a sequence complementary to a portion of the template DNA. If dideoxynucleotides, which act as chain-terminating nucleotides, are mixed into the reaction solution in a predetermined proportion together with deoxynucleotides, synthesis stops at the position where a dideoxynucleotide is incorporated, so that nucleic acid fragments of various lengths are generated. If the primer or the dideoxynucleotides are labeled with fluorescent dyes of a different color for each base type, each nucleic acid fragment is labeled with the dye corresponding to its terminal base. The nucleic acid fragments created in this manner are separated by base length by electrophoresis, using a capillary electrophoresis device or the like. At the end of electrophoresis, each nucleic acid fragment is irradiated with a laser, and the fluorescence emitted by each fragment is measured by a detector. Because a shorter nucleic acid fragment migrates faster in electrophoresis, the fluorescence measured over time yields fluorescence intensity waveform data corresponding to the base sequence.
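The relationship between fragment length and read order described above can be illustrated with a minimal sketch. It is an idealized model, not an actual sequencer implementation: it assumes that exactly one fragment terminates at every position after the primer, and the function names, the primer sequence, and the extension sequence are all hypothetical.

```python
def chain_terminated_fragments(synthesized_strand, primer_length):
    """Return (length, dye_label) for every fragment that chain termination
    could produce. Idealized assumption: one fragment ends at each position
    after the primer, and its dye label names its terminal base."""
    fragments = []
    for end in range(primer_length + 1, len(synthesized_strand) + 1):
        fragments.append((end, synthesized_strand[end - 1]))
    return fragments


def read_in_migration_order(fragments):
    """Shorter fragments migrate faster, so they reach the detector first.
    Sorting by length reproduces the order in which labels are observed."""
    ordered = sorted(fragments, key=lambda fragment: fragment[0])
    return "".join(label for _, label in ordered)


# Hypothetical 8-base primer followed by a 7-base extension.
strand = "ACGTTGCA" + "GATTACA"
fragments = chain_terminated_fragments(strand, primer_length=8)

# The fragments are mixed in solution; their input order does not matter.
print(read_in_migration_order(reversed(fragments)))  # GATTACA
```

Reading the dye labels in order of arrival recovers the bases added after the primer, which is why the chronological fluorescence data corresponds to the base sequence.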
A DNA sequencer based on this DNA sequencing is an apparatus that determines the base sequence by comparing the intensities of the four types of fluorescent signals at each peak position of the fluorescence intensity waveform data.
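As a rough illustration of this comparison, the following sketch calls one base per peak by picking the channel with the strongest signal. It assumes that the peak positions have already been found and that the waveform data is held as one list of intensities per base; the data layout, the function name, and the sample values are hypothetical and are not taken from any particular sequencer.

```python
BASES = "ACGT"


def call_bases(trace, peak_positions):
    """Call one base per peak by comparing the four channel intensities.

    trace: dict mapping each base to a list of intensities over time
           (assumed layout, one list per fluorescent channel).
    peak_positions: time indices at which peaks were detected (assumed given).
    """
    called = []
    for position in peak_positions:
        # The base whose channel is strongest at this peak is the call.
        strongest = max(BASES, key=lambda base: trace[base][position])
        called.append(strongest)
    return "".join(called)


# Made-up waveform data with peaks at time indices 1, 3 and 5.
trace = {
    "A": [0, 90, 5, 3, 2, 4, 1],
    "C": [0, 4, 2, 80, 3, 2, 1],
    "G": [0, 3, 1, 5, 2, 70, 2],
    "T": [0, 2, 1, 4, 1, 3, 0],
}
print(call_bases(trace, [1, 3, 5]))  # ACG
```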
A gene mutation called a single nucleotide polymorphism is known to exist in the base sequences of the genomes of human beings and other organisms. A congenital gene mutation that can be inherited from parents by children is called a germline mutation. The genomes of many living beings, including human beings, are diploid, and thus, for a germline mutation, two bases can exist in an individual or its cells in a proportion of 50% each. When a region containing such a single nucleotide polymorphism is analyzed by the DNA sequencer, fluorescent signal peaks corresponding to the two bases are detected simultaneously at the position of the fluorescence intensity waveform data that corresponds to the single nucleotide polymorphism.
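One simple way to recognize such a double peak is to compare the second-strongest signal with the strongest one, as in the sketch below. The ratio threshold of 0.5, the function name, and the sample intensities are assumptions chosen only for illustration; the two-base result is written with the standard IUPAC ambiguity codes.

```python
# Standard IUPAC ambiguity codes for two-base mixtures.
IUPAC = {
    frozenset("AG"): "R",
    frozenset("CT"): "Y",
    frozenset("CG"): "S",
    frozenset("AT"): "W",
    frozenset("GT"): "K",
    frozenset("AC"): "M",
}


def call_with_heterozygotes(intensities, min_ratio=0.5):
    """Call a single base, or a two-base code when two peaks coincide.

    intensities: dict mapping each of A, C, G, T to its signal at one peak.
    min_ratio: assumed threshold; the second signal must reach this
               fraction of the strongest signal to count as a second base.
    """
    ranked = sorted(intensities, key=intensities.get, reverse=True)
    first, second = ranked[0], ranked[1]
    top = intensities[first]
    if top > 0 and intensities[second] / top >= min_ratio:
        return IUPAC[frozenset((first, second))]
    return first


# Two bases present at about 50% each give two peaks of similar height.
print(call_with_heterozygotes({"A": 48, "C": 3, "G": 45, "T": 2}))  # R
# A single dominant peak gives an ordinary base call.
print(call_with_heterozygotes({"A": 90, "C": 4, "G": 3, "T": 2}))   # A
```

A fixed ratio like this is fragile when signals are weak or distorted, so the threshold would need tuning for real data.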
With the aforementioned sequencing, however, fluorescence intensity waveform data may be obtained that makes the determination of the polymorphism or of the base sequence difficult. Possible causes include a small amount of nucleic acid sample and thus weak signal intensity, excessive signal components generated by a higher-order structure of the nucleic acid fragments, and signal distortion caused by conditions during chemical treatment or electrophoresis.
When the base sequence of an actual nucleic acid sample is determined, for example when a certain gene is examined for a gene mutation, at least a portion of the base sequence of the nucleic acid sample is often already known. When such a known base sequence exists, as disclosed in PTL 1 and PTL 2, newly acquired fluorescence intensity waveform data can be interpreted by referring, by some method, to information on fluorescence intensity waveform data obtained from the known base sequence.