1. Technical Field
The present invention relates to methods for the manipulation, storage, modeling, visualization, and quantification of datasets.
2. Background Art
The standard techniques currently employed to analyze large datasets are Cluster Analysis and Self-Organizing Maps. These approaches can be effective in identifying broad groupings of genes connected with well-understood phenotypes, but they fall short in identifying more complex gene interactions and phenotypes that are less well defined. They do not allow for the fingerprinting and visualization of an entire dataset, and they do not easily accommodate missing values. The computational requirements of these techniques are high, and the mapping time increases exponentially with the size of the dataset. Furthermore, the existing data must be reanalyzed whenever new datasets are added to the analysis, and vastly different results can occur for each new dataset or group of datasets added.
To take full advantage of the information in multiple large sets of data, new and innovative tools are needed. There is a need for methods that more easily enable the identification and visualization of potentially significant similarities and differences between multiple datasets in their entirety. There is also a need for methods to intelligently store and model large datasets.
Recent studies have revealed genome-wide gene expression patterns in relation to many diseases and physiological processes. These patterns indicate a complex network interaction involving many genes and gene pathways over varying periods of time. On a parallel track, recent studies involving mathematical models and biophysical analysis have shown evidence of an efficient, robust network structure for information transmission when these networks are examined as large-scale gene groups. The difficulty lies in analyzing information transmission and network structure on the scale of individual genes and genetic pathways. Fractal Genomics Modeling (FGM) solves this problem by taking advantage of universal principles of organization. From the Internet, to social relations, to biochemical pathways, the fundamental patterns are similar. The natural relationship among many different types of networks, when mathematically represented, enables the extrapolation of vast quantities of data in a form amenable to computerized analysis. FGM is computationally efficient because the method is performed incrementally, is almost perfectly parallel, and is substantially linear. Consequently, there is no scaling problem with FGM. Furthermore, and of significant interest, FGM can be used to identify biomarkers and to develop systems for the diagnosis or prognosis of disease by exploiting the map of interactions and causality (pathway conjecture) rendered by this technology.