While there is currently a desire to provide a relatively inexpensive (e.g., about $1000) genome sequencing technology of acceptable read length (e.g., about 100 bps), accuracy (e.g., one base error in about 10,000 bps) and speed (e.g., a turn-around time of less than about a day), it can be even more important to build the infrastructure that allows the resulting data, most of which can be applied in a clinical setting, to be transmitted, stored, queried and accessed as accurately, efficiently, securely and effortlessly as possible. One limiting factor in using next-generation sequencing technology for clinical purposes can be that the currently dominant genomics technologies produce data having either low accuracy or short read length, which has required additional post-processing by a remote supercomputer with large storage space, for example, a cloud computer. This process can carry significant risks, not only of inaccurate data interpretation resulting in unnecessary or even disastrous clinical interventions, but also of loss of privacy of the patient data. Exacerbating these problems, the current process incurs significant costs in transmission and storage.
Next-generation clinical sequencing is undergoing a period of rapid growth. Its applications span nearly all fields of medicine, from the prediction of drug allergies to the diagnosis of childhood diseases and the guidance of cancer treatments. It has become an important tool for basic biomedical research and is seeing significant adoption in the clinical diagnosis of inherited monogenic disorders, and in the profiling of acquired and somatic mutations to guide therapeutic choice and inform prognosis in cancer. (See e.g., References 1 and 2). Emerging clinical applications of next-generation sequencing include monitoring transplant rejection and non-invasively diagnosing a variety of prenatal diseases and conditions. (See e.g., References 1 and 2). Accompanying this rapid expansion are large-scale bioinformatics challenges. The data generated by sequencers are currently processed and stored inefficiently, both in the short term and over the long term. This situation translates into greater error rates, higher costs and longer wait times for actionable medical information.
This problem is particularly acute for clinical sequencing laboratories, as the changing regulatory landscape for healthcare, combined with variation in federal and state laws regarding medical record storage requirements (see e.g., Reference 3), results in most DNA sequencing labs storing data indefinitely. With sequence data generation forecast to increase exponentially in the near future, many practitioners are concerned that a data storage crisis is looming. Surprisingly, most clinical sequencing centers do not compress the sequencing data they store, primarily due to the lack of a data-secure, scalable and easy-to-use tool for sequence compression.
Thus, it may be beneficial to provide an exemplary system, method and computer-accessible medium for secure and compressed transmission of genomic data, which can overcome at least some of the deficiencies described herein above.