This invention relates to systems for the collection and manipulation of biodata from diverse sources and the processing of such biodata to identify potential therapeutic targets.
The Human Genome Project, with its goal of complete genome sequencing, has examined approximately three gigabases of human genomic DNA and predicts that approximately 30,000 genes reside in the human genome. However, identifying and sequencing a gene are only the first steps in its characterization. The challenge is to determine the function of the gene as well as its relationship to other genes. With this information, directed experimentation to identify genes that are likely targets for therapeutic intervention becomes feasible and, ultimately, the drug discovery timeline will be shortened.
Genes contain genetic information that is transcribed into messenger RNA and then translated into protein. Proteins play a critical role in cellular processes. Functional proteomics seeks to identify a protein's function and its roles in related pathways through large-scale, high-throughput experiments. Protein functional analysis systematically determines protein-protein interactions. These interactions mediate cellular signaling cascades that are typically not linear but are better represented as a complex branched network. When an uncharacterized protein interacts with previously characterized proteins, information about the uncharacterized protein's function and its role in the same or a related cellular process may be obtained.
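By way of illustration only, the inference described above can be sketched in Python: interactions are stored as an undirected, branched network, and candidate functions for an uncharacterized protein are tallied from its characterized interaction partners. The protein names, the annotations, and the helpers `build_network` and `infer_functions` are hypothetical and are not taken from this disclosure; a practical system would likely weight evidence rather than simply count partners.

```python
from collections import Counter, defaultdict

# Hypothetical interaction pairs (illustrative only, not real data).
INTERACTIONS = [
    ("P1", "P2"),
    ("P1", "P3"),
    ("P3", "UNKNOWN_A"),
    ("P2", "UNKNOWN_A"),
    ("P4", "UNKNOWN_A"),
]

# Hypothetical annotations for previously characterized proteins.
KNOWN_FUNCTIONS = {
    "P1": "signal transduction",
    "P2": "signal transduction",
    "P3": "signal transduction",
    "P4": "transcription regulation",
}


def build_network(pairs):
    """Build an undirected adjacency map from interaction pairs."""
    network = defaultdict(set)
    for a, b in pairs:
        network[a].add(b)
        network[b].add(a)
    return network


def infer_functions(protein, network, known):
    """Count the annotations of the characterized partners of `protein`.

    Returns (function, count) pairs, most frequent first.
    """
    counts = Counter(
        known[partner]
        for partner in network.get(protein, ())
        if partner in known
    )
    return counts.most_common()


network = build_network(INTERACTIONS)
candidates = infer_functions("UNKNOWN_A", network, KNOWN_FUNCTIONS)
print(candidates)
```

In this toy network the uncharacterized protein has two partners annotated with one function and one partner annotated with another, so the first function ranks highest as a candidate.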
Most commercially available bioinformatics systems perform functional analysis using a single information source, such as a traditional relational database optimized for transactional processing. Such systems do not integrate collections of data from various sources. In contrast, an intelligent system that integrates data from multiple operational databases would enhance research efforts that focus on specific therapeutic targets.
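As a minimal sketch of what such integration could look like, assuming each source keys its records by a shared gene identifier, the following Python example merges records from several sources into one view per gene. The source names, fields, values, and the helper `integrate` are hypothetical and are shown only to make the idea concrete; they do not describe the system of this disclosure.

```python
from collections import defaultdict

# Hypothetical records from three separate sources, keyed by a shared gene ID.
SEQUENCE_DB = {"GENE1": {"length_bp": 1200}, "GENE2": {"length_bp": 950}}
EXPRESSION_DB = {"GENE1": {"tissue": "liver"}}
INTERACTION_DB = {"GENE1": {"partners": ["GENE2"]}, "GENE3": {"partners": []}}


def integrate(sources):
    """Merge per-source records into one record per gene.

    Each field is prefixed with its source name so its origin stays traceable.
    """
    merged = defaultdict(dict)
    for source_name, records in sources.items():
        for gene_id, fields in records.items():
            for key, value in fields.items():
                merged[gene_id][f"{source_name}.{key}"] = value
    return dict(merged)


integrated = integrate(
    {
        "sequence": SEQUENCE_DB,
        "expression": EXPRESSION_DB,
        "interaction": INTERACTION_DB,
    }
)
print(integrated["GENE1"])
```

Prefixing each field with its source name is one possible design choice; it keeps provenance visible when a researcher reviews the combined record for a candidate target.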