1. Technical Field
The following embodiments relate to genome analysis. More particularly, the following embodiments disclose a method and an apparatus for detecting chromosomal translocation.
2. Description of the Related Art
DNA, a molecule that carries most of the genetic instructions in organisms including humans, is composed of nucleotides, each containing one of four nucleobases: adenine, cytosine, guanine, or thymine, abbreviated by the single-letter codes A, C, G, and T, respectively.
Genome analysis is a process for analyzing the difference between two DNA sequences. For example, DNA sequences to be compared may be derived from different persons. Thus, genome analysis is also called genome variation analysis.
With the development of next-generation sequencing technology, active research into genome analysis has been conducted.
Among genome variations, structural variation (SV) is variation in the structure of an organism's chromosome, and typically affects a sequence length of about 1 kb or greater. Since many structural variants are associated with genetic diseases, structural variations have recently been under intensive study.
Structural variation usually includes deletions, duplications, inversions and translocations. Thus far, extensive research has focused on finding deletions and duplications. In recent years, detection of inversions and translocations has been studied. However, there still exist many false positives in detecting inversions and translocations. The existence of such false positives makes it difficult for biologists to utilize their research results in the field.
For the analysis of structural variations, detection may be carried out using information on read depth (RD), paired ends (PE), and split reads (SR).
RD refers to the number of times a nucleotide sequence is read at each locus of a genome.
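By way of a non-limiting illustration, the definition of RD above can be sketched as follows. The sketch assumes that each aligned read is represented as a half-open interval (start, end) of 0-based coordinates on a single reference sequence; the function name read_depth and this representation are hypothetical and are not taken from the embodiments.

```python
def read_depth(reads, genome_length):
    """Return the per-locus read depth for a single reference sequence.

    reads: iterable of (start, end) half-open, 0-based intervals (assumed format).
    genome_length: number of loci in the reference sequence.
    """
    # Difference array: +1 where a read begins, -1 just past where it ends.
    diff = [0] * (genome_length + 1)
    for start, end in reads:
        s = max(start, 0)
        e = min(end, genome_length)
        if s >= e:
            continue  # read lies entirely outside the reference
        diff[s] += 1
        diff[e] -= 1

    # A running sum of the difference array gives the depth at each locus.
    depth = []
    running = 0
    for i in range(genome_length):
        running += diff[i]
        depth.append(running)
    return depth
```

For example, two reads covering loci 0 to 2 and loci 2 to 4 of a six-locus reference give a depth of 2 only at locus 2, where the reads overlap.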
Traditionally, a method utilizing RD is widely applied to the analysis of copy number variation (CNV). However, methods utilizing RD are limited in detecting copy number-neutral SVs including inversions and translocations.
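The limitation noted above can be illustrated with a simplified sketch, which is not the method of any cited reference. It assumes a per-locus depth list, a fixed window size, and a known baseline depth corresponding to the normal copy number; the function name estimate_copy_number and all of its parameters are hypothetical.

```python
def estimate_copy_number(depth, window, baseline_depth, ploidy=2):
    """Estimate the copy number of each fixed-size window from read depth.

    depth: per-locus read depth values (assumed input).
    window: number of loci per window (assumed, must be positive).
    baseline_depth: mean depth expected at the normal copy number (assumed known).
    ploidy: normal copy number, 2 for a diploid genome.
    """
    if window <= 0 or baseline_depth <= 0:
        raise ValueError("window and baseline_depth must be positive")
    estimates = []
    for i in range(0, len(depth), window):
        chunk = depth[i:i + window]
        mean_depth = sum(chunk) / len(chunk)
        # Depth is assumed to scale linearly with copy number.
        estimates.append(round(ploidy * mean_depth / baseline_depth))
    return estimates
```

Under these assumptions, a window whose depth is halved is estimated at one copy and a window whose depth is 1.5 times the baseline at three copies. A balanced translocation or an inversion relocates sequence without changing its depth, so every window remains at the baseline estimate and the variation goes undetected.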
Methods using PE and/or SR are effective in detecting the position of an SV break point (BP), irrespective of whether the copy number is neutral or not. With regard to a genome characterized by diploidy, however, methods using PE and/or SR are unable to provide information on whether an SV is present on one or both of the paired chromosomes.
Attempts have been made to integrate information on RD, PE and SR in order to overcome the disadvantages that arise when each is utilized separately. Nonetheless, the limitations of conventional methods for detecting translocations remain unsolved due to the complexity of translocations.
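As a non-limiting illustration of how PE information can point to a break point, the following simplified sketch classifies a read pair by where its two mates align. The ReadPair record, the function name classify_pair, and the insert-size bounds of 200 and 600 bases are assumptions made for this example only; practical tools also consider mate orientation and mapping quality.

```python
from typing import NamedTuple


class ReadPair(NamedTuple):
    """Alignment positions of the two mates of a paired-end read (assumed format)."""
    chrom1: str
    pos1: int
    chrom2: str
    pos2: int


def classify_pair(pair, min_insert=200, max_insert=600):
    """Classify a read pair as concordant or discordant.

    min_insert and max_insert are hypothetical bounds on the expected
    distance between mates for the sequencing library.
    """
    # Mates aligned to different chromosomes suggest a translocation break point.
    if pair.chrom1 != pair.chrom2:
        return "interchromosomal"
    # Mates on the same chromosome but at an unexpected distance suggest
    # another structural variation, such as a deletion or an insertion.
    insert = abs(pair.pos2 - pair.pos1)
    if insert < min_insert or insert > max_insert:
        return "abnormal_insert"
    return "concordant"
```

A pair classified as interchromosomal indicates where a candidate break point lies, but, as noted above, it does not indicate whether the variation is present on one or both of the paired chromosomes.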
With regard to translocations on genomes, reference may be made to Korean Patent Unexamined Application Publication No. 2014-0061223, U.S. Patent Application No. 20130158885, and U.S. Pat. No. 7,948,564.