The present invention relates to minimizing epigenetic surprisal data, and more specifically to minimizing epigenetic surprisal data within a time series or time period.
Epigenetics includes the study of heritable changes in gene expression that are not due to changes in DNA sequence, in other words, all modifications to genes other than changes to the DNA sequence itself. Examples of modifications are DNA methylation, histone modification, chromatin accessibility, acetylation, phosphorylation, ubiquitination, ADP-ribosylation and others. The modifications alter the chromatin structure of the DNA and its accessibility, and therefore the regulation of gene expression patterns. The pattern of gene expression can also be modified by exogenous influences, such as environmental influences including nutrition. These modifications can persist throughout an organism's lifetime and be passed on to future generations.
Epigenetic maps include a map or display of what modifications have been made to specific chromosomes and/or the entire genome of an organism. Epigenetic maps are produced by massively parallel sequencing of a portion of an organism's genome or the entire genome and mapping the sequence to a reference genome assembly to infer genomic coordinates of modifications. Within the study of epigenetics it is beneficial to compare an epigenetic map taken at one point in time to an epigenetic map generated at another point in time to determine what changes have taken place in a specific time period. For an entire genome of an organism, the amount of data associated with these changes can be extremely large. Furthermore, the storage and transfer of such information can take up significant space and time in a network data processing system.
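By way of illustration only, the comparison of two epigenetic maps taken at different points in time can be sketched as follows. The sketch assumes, hypothetically, that each map is held as a dictionary keyed by genomic coordinate (chromosome, position) with a modification label as the value. The name `diff_maps`, the data layout, and the sample values are assumptions made for this example and are not taken from the disclosure.

```python
from typing import Dict, Optional, Tuple

# Hypothetical layout: (chromosome, position) -> modification label.
Locus = Tuple[str, int]
EpigeneticMap = Dict[Locus, str]
Change = Tuple[Optional[str], Optional[str]]


def diff_maps(earlier: EpigeneticMap, later: EpigeneticMap) -> Dict[Locus, Change]:
    """Return only the loci whose modification differs between the two maps.

    Each value is a (before, after) pair; None means no modification
    was recorded at that locus in the corresponding map.
    """
    changes: Dict[Locus, Change] = {}
    for locus in earlier.keys() | later.keys():
        before = earlier.get(locus)
        after = later.get(locus)
        if before != after:
            changes[locus] = (before, after)
    return changes


# Illustrative (made-up) maps at two points in time.
map_t0: EpigeneticMap = {
    ("chr1", 1050): "methylated",
    ("chr1", 2300): "acetylated",
    ("chr2", 480): "methylated",
}
map_t1: EpigeneticMap = {
    ("chr1", 1050): "methylated",
    ("chr2", 480): "phosphorylated",
    ("chr3", 77): "methylated",
}

changes = diff_maps(map_t0, map_t1)
for locus, (before, after) in sorted(changes.items()):
    print(locus, before, "->", after)
```

In this sketch only the loci that differ between the two maps are retained, so an unchanged locus such as ("chr1", 1050) is omitted from the result. For a whole genome, the retained set may still be very large, which is the problem described above.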