Molecular imaging is seeing rapid growth in the number of available technologies, particularly mass spectrometry imaging (MSI), magnetic resonance imaging (MRI), and Raman spectroscopy. These technologies are used to study the biodistribution of targeted endogenous or exogenous compounds, such as pesticides, drugs, proteins, or lipids, in order to understand their roles in biological systems. With the aid of software, they are also applied to the unsupervised search for potential biomarkers.
However, the growing volume of data to be analyzed and interpreted in molecular imaging limits synthetic analyses that span several data sets, each associated with an image. The data are becoming difficult to exploit for synthetic representations or for statistical analysis.
Indeed, the volume of information in a data set often exceeds the random access memory of conventional computers (as opposed to supercomputers). In mass spectrometry imaging, for example, the raw information for an image of 50,000 positions can no longer be held in memory in its entirety, so calculations must be performed to reduce the memory footprint and make the image exploitable. These calculations reduce the data and consequently introduce bias, since not all of the available information is taken into account. In addition, current analysis tools offer no means of standardizing several data sets relative to each other.
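The order of magnitude behind this memory limit can be illustrated with a rough calculation. The following is only a sketch under stated assumptions: the figure of 50,000 positions comes from the text above, whereas the number of intensity values per spectrum (100,000), the 4-byte value size, and the 16 GiB of memory are hypothetical values chosen for illustration and do not describe any particular instrument or computer.

```python
def raw_image_bytes(n_positions: int, n_channels: int, bytes_per_value: int = 4) -> int:
    """Size in bytes of an image that stores one full, dense spectrum per position."""
    return n_positions * n_channels * bytes_per_value


N_POSITIONS = 50_000        # number of positions, as cited in the text
N_CHANNELS = 100_000        # ASSUMPTION: intensity values per spectrum
RAM_BYTES = 16 * 1024**3    # ASSUMPTION: 16 GiB of RAM on a conventional computer

size = raw_image_bytes(N_POSITIONS, N_CHANNELS)
print(f"raw size: {size / 1024**3:.1f} GiB")
print("fits in RAM:", size <= RAM_BYTES)
```

Under these assumptions, a single image already exceeds the available memory before any second data set is loaded for comparison, which is why existing tools resort to data reduction.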
Many storage formats exist, most of them tied to the manufacturers of the instruments used to acquire the data sets. However, none of these formats is adapted and optimized for fine-grained, rapid querying. Some fields of molecular imaging have favored particular storage formats, such as Analyze 7.5, which is used in MRI and in mass spectrometry imaging. Mass spectrometry imaging also has the imzML format, which however does not constitute a storage system for handling imaging data.
Formats of the HDF5 type (“Hierarchical Data Format 5”), based on a hierarchical organization of data, have recently been used to store high-volume data in mass spectrometry imaging. These formats make it possible to build remote query interfaces accessible through web browsers, and they enable faster statistical calculations than non-hierarchical files. However, they do not support overall comparative and searchable analysis of data in studies covering several data sets simultaneously.
There is therefore a need for a process for analyzing, comparing, and querying several data sets relative to each other without information loss, so as to avoid introducing analysis bias.