Data generated by DNA sequencing machines are growing at an unprecedented rate. Extracting knowledge from these data is extremely tedious and usually requires very powerful computing machines. The main reason is that much of the data generated for an experiment is usually redundant. Accordingly, extracting useful information and removing redundant information become an important part of the processing step. As an example, when the entire human genome is sequenced with 100× coverage, each base appears, on average, in 100 sequencing reads, which means that about 99 percent of the data is redundant. Hence, there is a need for systems and methods that significantly reduce the number of sequenced bases in a process for sequencing a string of oligonucleotides, for example, a DNA sequencing process. There is also a need to decrease the amount of data generated from reading oligonucleotides in order to determine a nucleic acid sequence, for example, by avoiding reading and processing redundant or repetitive fragments.
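The 99 percent figure in the 100× coverage example above follows from simple arithmetic: if each position is read c times on average and one read per position would in principle suffice, the redundant fraction is 1 - 1/c. The following is a minimal sketch of that calculation only. The function name `redundant_fraction` and the genome length used here (roughly 3.1 billion bases) are illustrative assumptions, not values taken from this document, and the estimate ignores sequencing errors and uneven coverage.

```python
def redundant_fraction(coverage: float) -> float:
    """Fraction of sequenced bases beyond a single read per position.

    Assumes uniform average coverage and that one error-free read of
    each base would be sufficient (an idealization).
    """
    if coverage < 1:
        raise ValueError("coverage must be at least 1 for this estimate")
    return 1.0 - 1.0 / coverage


# Assumed, approximate human genome length in bases (illustrative only).
GENOME_SIZE = 3_100_000_000
COVERAGE = 100

total_bases = GENOME_SIZE * COVERAGE          # bases actually sequenced
redundant_bases = total_bases - GENOME_SIZE   # bases beyond one read each

print(f"redundant fraction: {redundant_fraction(COVERAGE):.2%}")
print(f"redundant bases:    {redundant_bases:,} of {total_bases:,}")
```

Under these assumptions, lowering the coverage lowers the redundant fraction only slowly (for instance, 10× coverage still leaves 90 percent of the data redundant), which is why avoiding the redundant reads at the source is attractive.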