Biomedical research workflows currently lack tools for capturing experimental context. Because that context is multivariate, it is difficult to continuously record how every variable evolves over the course of an experiment. As a result, many variables are never recorded, turning the debugging of a failed protocol into a guessing game. This not only makes it difficult to discover which experimental variables are significant, but is also the main barrier to training new individuals on these workflows.
The lack of contextual data is a key reason why data analytics approaches cannot currently be applied to the biomedical experimental workflow. At present, the only way to generate enough data for such an approach is to use expensive and inflexible automation platforms. These platforms require significant adaptation of user workflows, so the investment is worthwhile only if large numbers of samples are processed or if experiments are composed of multiple repeated operations.
Experimental protocols nominally describe the steps necessary to perform an experiment, but they often fail to provide all the accessory information crucial to understanding the experiment's context. Protocols are generally written as text documents optimized for human readability and comprehension. This format is not easily machine readable, which makes it very difficult to programmatically extract the context of an experiment. While experimental description schemas have been proposed in the past [e.g. Systems Biology Markup Language, ExptML: A Markup Language for Science], they are optimized for machine readability, making it very hard for experimenters, who often have little programming experience, to either read or produce them.