1. Technical Field
The present disclosure relates to the field of biomedical informatics. In particular, the present disclosure relates to a system and method for personalized biomedical information research analytics and knowledge discovery.
References
Aspects of this disclosure relate to the teachings of the following references, which are referred to throughout:
[1] Hagglund M, Scandurra I, Mostrom D, Koch S: Integration architecture of a mobile virtual health record for shared home care. Stud Health Technol Inform 2005, 116:340-345.
[2] Hanss S, Schaaf T, Wetzel T, Hahn C, Schrader T, Tolxdorff T: Integration of decentralized clinical data in a data warehouse: a service-oriented design and realization. Methods Inf Med 2009, 48(5):414-418.
[3] Fayyad U, Piatetsky-Shapiro G, Smyth P, Uthurusamy R (editors): Advances in Knowledge Discovery and Data Mining. AAAI/MIT Press; 1996.
[4] Leach S M, Tipney H, Feng W, Baumgartner W A, Kasliwal P, Schuyler R P, Williams T, Spritz R A, Hunter L: Biomedical discovery acceleration, with applications to craniofacial development. PLoS Comput Biol 2009, 5(3):e1000215.
[5] Schonbach C, Kowalski-Saunders P, Brusic V: Data warehousing in molecular biology. Brief Bioinform 2000, 1(2):190-198.
[6] Castellano M, Mastronardi G, Bellotti R, Tarricone G: A bioinformatics knowledge discovery in text application for grid computing. BMC Bioinformatics 2009, 10 Suppl 6:S23.
[7] Xiong J, Rayner S, Luo K, Li Y, Chen S: Genome wide prediction of protein function via a generic knowledge discovery approach based on evidence integration. BMC Bioinformatics 2006, 7:268.
[8] Viksna J, Celms E, Opmanis M, Podnieks K, Rucevskis P, Zarins A, Barrett A, Neogi S G, Krestyaninova M, McCarthy M I et al: PASSIM - an open source software system for managing information in biomedical studies. BMC Bioinformatics 2007, 8:52.
[9] Tsai Y S, King P H, Higgins M S, Pierce D, Patel N P: An expert-guided decision tree construction strategy: an application in knowledge discovery with medical databases. Proc AMIA Annu Fall Symp 1997:208-212.
[10] Weeber M, Klein H, Aronson A R, Mork J G, de Jong-van den Berg L T, Vos R: Text-based discovery in biomedicine: the architecture of the DAD-system. Proc AMIA Symp 2000:903-907.
[11] Friedrich C M, Dach H, Gattermayer T, Engelbrecht G, Benkner S, Hofmann-Apitius M: @neuLink: a service-oriented application for biomedical knowledge discovery. Stud Health Technol Inform 2008, 138:165-172.
[12] Parmee I C: Human-centric intelligent systems for exploration and knowledge discovery. Analyst 2005, 130(1):29-34.
[13] Brandt C A, Deshpande A M, Lu C, Ananth G, Sun K, Gadagkar R, Morse R, Rodriguez C, Miller P L, Nadkarni P M: TrialDB: A web-based Clinical Study Data Management System. AMIA Annu Symp Proc 2003:794.
[14] Katehakis D G, Sfakianakis S G, Kavlentakis G, Anthoulakis D N, Tsiknakis M: Delivering a lifelong integrated electronic health record based on a service oriented architecture. IEEE Trans Inf Technol Biomed 2007, 11(6):639-650.
[15] Blobel B G, Engel K, Pharow P: Semantic interoperability - HL7 Version 3 compared to advanced architecture standards. Methods Inf Med 2006, 45(4):343-353.
[16] Brandt C A, Gadagkar R, Rodriguez C, Nadkarni P M: Managing complex change in clinical study metadata. J Am Med Inform Assoc 2004, 11(5):380-391.
[17] Munro R E, Guo Y: Solutions for complex, multi data type and multi tool analysis: principles and applications of using workflow and pipelining methods. Methods Mol Biol 2009, 563:259-271.
[18] Wozak F, Ammenwerth E, Horbst A, Sogner P, Mair R, Schabetsberger T: IHE based interoperability - benefits and challenges. Stud Health Technol Inform 2008, 136:771-776.
[19] Sarkar I N, Cantor M N, Gelman R, Hartel F, Lussier Y A: Linking biomedical language information and knowledge resources: GO and UMLS. Pac Symp Biocomput 2003:439-450.
[20] Bodenreider O: Biomedical ontologies in action: role in knowledge management, data integration and decision support. Yearb Med Inform 2008:67-79.
[21] U.S. Department of Health and Human Services. “Glossary of Terms for Personalized Health Care Website.” 22 May 2013. <http://www.hhs.gov/myhealthcare/glossary/glossary.html>.
[22] Benner S A, Hoshika S, Sukeda M, Hutter D, Leal N, Yang Z, Chen F: Synthetic biology for improved personalized medicine. Nucleic Acids Symp Ser (Oxf) 2008(52):243-244.
[23] Hoffman M A: The genome-enabled electronic medical record. J Biomed Inform 2007, 40(1):44-46.
[24] Rindfleisch T C, Brutlag D L: Directions for clinical research and genomic research into the next decade: implications for informatics. J Am Med Inform Assoc 1998, 5(5):404-411.
[25] Shah R, Dame B, Atar D, Abadie E, Adams K F, Zannad F: Pharmacogenomics in cardiovascular clinical trials. Fundam Clin Pharmacol 2004, 18(6):705-708.
[26] Scheuner M T, de Vries H, Kim B, Meili R C, Olmstead S H, Teleki S: Are electronic health records ready for genomic medicine? Genet Med 2009, 11(7):510-517.
[27] Brown S H, Lincoln M J, Groen P J, Kolodner R M: VistA - U.S. Department of Veterans Affairs national-scale HIS. Int J Med Inform 2003, 69(2-3):135-156.
[28] McGuire A L, Fisher R, Cusenza P, Hudson K, Rothstein M A, McGraw D, Matteson S, Glaser J, Henley D E: Confidentiality, privacy, and security of genetic and genomic test information in electronic health records: points to consider. Genet Med 2008, 10(7):495-499.
[29] Field D, Garrity G, Gray T, Morrison N, Selengut J, Sterk P, Tatusova T, Thomson N, Allen M J, Angiuoli S V et al: The minimum information about a genome sequence (MIGS) specification. Nat Biotechnol 2008, 26(5):541-547.
[30] Mailman M D, Feolo M, Jin Y, Kimura M, Tryka K, Bagoutdinov R, Hao L, Kiang A, Paschall J, Phan L et al: The NCBI dbGaP database of genotypes and phenotypes. Nat Genet 2007, 39(10):1181-1186.
[31] National Center for Biotechnology Information, U.S. National Library of Medicine. “dbGaP.” 22 May 2013. <http://www.ncbi.nlm.nih.gov/gap>.
[32] National Institutes of Health. “Genetic Sequence Data Bank.” 15 Jun. 2013. <http://www.ncbi.nlm.nih.gov/genbank/statistics>.
[33] Amanda C: Integration of Genomic and Phenotypic Data. In: Data Analysis and Visualization in Genomics and Proteomics. Edited by Francisco Azuaje J D; 2005: 83-97.
[34] El-Ghatta S B, Clade T, Snyder J C: Integrating Clinical Trial Imaging Data Resources Using Service-Oriented Architecture and Grid Computing. Neuroinformatics.
[35] Rademacher J D, Lippke S: Dynamic online surveys and experiments with the free open-source software dynQuest. Behav Res Methods 2007, 39(3):415-426.
[36] Fegan G W, Lang T A: Could an open-source clinical trial data-management system be what we have all been looking for? PLoS Med 2008, 5(3):e6.
[37] National Center for Biotechnology Information, U.S. National Library of Medicine. “EGFR epidermal growth factor receptor [Homo sapiens (human)].”
[38] National Center for Biotechnology Information. “Activating mutations in the epidermal growth factor receptor underlying responsiveness of non-small-cell lung cancer to gefitinib.” 20 May 2004. <http://www.ncbi.nlm.nih.gov/pubmed/15118073>.
[39] Couzin J: Pharmacogenomics. Cancer sharpshooters rely on DNA tests for a better aim. Science 2004, 305(5688):1222-1223.
[40] Mirhaji P, Zhu M, Vagnoni M, Bernstam E V, Zhang J, Smith J W: Ontology driven integration platform for clinical and translational research. BMC Bioinformatics 2009, 10 Suppl 2:S2.
[41] Murphy S N, Mendis M, Hackett K, Kuttan R, Pan W, Phillips L C, Gainer V, Berkowicz D, Glaser J P, Kohane I et al: Architecture of the open-source clinical research chart from Informatics for Integrating Biology and the Bedside. AMIA Annu Symp Proc 2007:548-552.
[42] Murphy S N, Mendis M E, Berkowitz D A, Kohane I, Chueh H C: Integration of clinical and genetic data in the i2b2 architecture. AMIA Annu Symp Proc 2006:1040.
[43] Weber G M, Murphy S N, McMurry A J, Macfadden D, Nigrin D J, Churchill S, Kohane I S: The Shared Health Research Information Network (SHRINE): a prototype federated query tool for clinical data repositories. J Am Med Inform Assoc 2009, 16(5):624-630.
[44] Cesareni G, Ceol A, Gavrila C, Palazzi L M, Persico M, Schneider M V: Comparative interactomics. FEBS Lett 2005, 579(8):1828-1833.
[45] Dennis G, Jr., Sherman B T, Hosack D A, Yang J, Gao W, Lane H C, Lempicki R A: DAVID: Database for Annotation, Visualization, and Integrated Discovery. Genome Biol 2003, 4(5):P3.
[46] Zhang J D, Wiemann S: KEGGgraph: a graph approach to KEGG PATHWAY in R and bioconductor. Bioinformatics 2009, 25(11):1470-1471.
[47] Cerami E G, Bader G D, Gross B E, Sander C: cPath: open source software for collecting, storing, and querying biological pathways. BMC Bioinformatics 2006, 7:497.
[48] Joshi-Tope G, Gillespie M, Vastrik I, D'Eustachio P, Schmidt E, de Bono B, Jassal B, Gopinath G R, Wu G R, Matthews L et al: Reactome: a knowledgebase of biological pathways. Nucleic Acids Res 2005, 33(Database issue):D428-432.
[49] Cerami E G, Gross B E, Demir E, Rodchenkov I, Babur O, Anwar N, Schultz N, Bader G D, Sander C: Pathway Commons, a web resource for biological pathway data. Nucleic Acids Res 2011, 39(Database issue):D685-690.
[50] Siest G, Marteau J B, Visvikis-Siest S: Personalized therapy and pharmacogenomics: future perspective. Pharmacogenomics 2009, 10(6):927-930.
[51] Hewett M, Oliver D E, Rubin D L, Easton K L, Stuart J M, Altman R B, Klein T E: PharmGKB: the Pharmacogenetics Knowledge Base. Nucleic Acids Res 2002, 30(1):163-165.
[52] Nadkarni, Prakash M., Randolph A. Miller. “Service-oriented Architecture in Medical Software: Promises and Perils.” 22 May 2013. <http://jamia.bmj.com/content/14/2/244.extract>.
[53] International Business Machines. “Service Oriented Architecture.” 22 May 2013. <http://www.ibm.com/soa>.
[54] Sun Microsystems. Java Technologies and Web Services Platforms White Paper. August 2005. 22 May 2013. <http://www.slgroup.com/Portals/O/docs/sample_docs/web_service_platform.pdf>.
[55] Research and Markets. “Services Oriented Architecture (SOA) Middleware Market Shares, Strategies, and Forecasts, Worldwide, 2013 to 2019.” April 2013. 22 May 2013.
[56] Glaser J P: Too far ahead of the IT curve? Harv Bus Rev 2007, 85(7-8):29-33, 190; discussion 136-199.
[57] Kawamoto K, Honey A, Rubin K: The HL7-OMG Healthcare Services Specification Project: motivation, methodology, and deliverables for enabling a semantically interoperable service-oriented architecture for healthcare. J Am Med Inform Assoc 2009, 16(6):874-881.
[58] Daskalakis S, Mantas J: The impact of SOA for achieving healthcare interoperability. An empirical investigation based on a hypothetical adoption. Methods Inf Med 2009, 48(2):190-195.
2. Description of Related Art
Information complexity is a major problem in biomedical research. Data resources are fragmented and scattered across heterogeneous systems and separate repositories. Differing data formats and multiple access methods, combined with poor integration, make data access cumbersome for researchers. Equally important are the inability to cope with newly generated biomedical metadata and the inability to translate information into meaningful knowledge for discovery.
Continuing to conduct research in the current, traditional way, without a solution, has a disappointing impact. Neither researchers nor ordinary systems will be able to keep up with the mounting data (signals, sequences, imaging, etc.) generated every day from a single patient. Most existing architectures in biomedicine are neither service-oriented nor enabled for data exchange. Current services and models lack common standards and tools for data integration and exchange. Integrating data from various decentralized clinical parties into one data warehouse has been a major challenge for Service Oriented Architecture (SOA). Currently, mobile virtual patient health record systems use the services of an SOA integrative architecture [1]. However, a careful requirements analysis of the data integration process can produce the desired design [2].
Therefore, it is important to design a system, with the end goal in mind, that enables users to conduct personalized research in the context of a health information research exchange. The main goal should be to facilitate discovery in medical care practices. Without such a solution, the rate of knowledge discovery remains slow.
In brief, current healthcare research is fragmented and non-personalized, and it does not make use of patient data over time. The data sources are heterogeneous and operated by non-interoperable systems. Consequently, the cost of healthcare research rises and the quality of the research falls.
Discoveries in biomedicine remain modest given the amount of metadata (i.e., “big data”) generated worldwide. Readiness is an issue: most current biomedical knowledge is not technology-enabled for discovery. Knowledge discovery is defined as “the non-trivial process of identifying valid, novel, potentially useful, and ultimately understandable patterns in data” [3]. Available knowledge discovery applications in biomedicine are either solely literature-focused [4] or confined to biology [5], bioinformatics [6], genomics [7], or information management alone [8]. Further, current systems are often outdated [9], trivial in scope [10], non-scalable [7, 11], and/or not contextually oriented [12]. The lack of a proper research exchange architecture for knowledge discovery in biomedicine results from jumping to data analysis before thinking through the complexity of the design and the dimensions of information management in such a multidisciplinary environment.
Commercial medical informatics software and systems still have persistent problems: a high per-seat cost, proprietary architecture, limited built-in functionality, and limited or no support for binary data. In addition, current commercial software and systems cannot readily be customized for special purposes in the future [13].
Current systems also lack personalization, even though it is known that every patient is different. The practice of generalizing results from research conducted on a group of people who are thought to be similar in some respects is not good enough. Research practice should be transformed from inferring results from groups into drawing conclusions directly about the individual case, based on personalization. When personalized research is conducted on a patient, a more accurate judgment about the use of preventive, diagnostic, and therapeutic interventions can be achieved. There is a need to develop research-oriented applications that are patient-centered (personalized) and that use contextual analysis in data mining to make researchers more efficient in knowledge discovery.
Additionally, there is a need for an integrated lifelong (longitudinal) medical record that provides access to clinical data at any time in a patient's life [14]. Similarly, there is a need for a long-term personalized research application that can follow patient-related data over time. Today's research lacks applications that layer research data over time. Taking time-sensitive trends into consideration would add considerable value to the future of healthcare research and biomedical knowledge discovery.
Moreover, there is a need for a real-time research-based exchange. Often, a new therapy is discovered and yet, surprisingly, many patients do not know about it. Physicians and researchers should be informed about such discoveries so that they can recommend them to their patients, which creates the need for a real-time information exchange for discovery.