There are numerous areas of invention wherein multiple devices generate multiple data streams, and it is therefore desirable to utilize the data streams for the purpose of monitoring, analyzing and/or predicting behaviour that is time dependent. This time dependency across multiple data streams can be difficult to resolve, particularly for the purposes of analysis, using existing data mining environments.
Faced with an exponentially growing amount of data, many organizations are turning to data mining to translate data into information that can be utilized to generate subsequent knowledge. In particular, distributed data mining, which refers to the mining of distributed data sets that are often stored in local databases and hosted by local computers connected through a network, is utilized in the prior art. Notably, many environments have different distributed sources of capacious data, the analysis of which requires data mining technology specific to distributed applications. Medical data is often distributed due to concerns of security, privacy and confidentiality of patient information. For these reasons, medical data is likely to maintain its distributed nature in the future. In distributed data mining, the data mining occurs both at a local level and at a global level. At the global level, local data mining results are combined to discover global patterns or themes present in the data.
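The two-level arrangement described above can be illustrated with a minimal sketch. It assumes each site reduces its records to per-pattern counts and that only those counts leave the site; the function names `mine_local` and `mine_global`, the `pattern` field and the `min_support` threshold are hypothetical and are not taken from any cited system.

```python
from collections import Counter
from typing import Dict, Iterable, List


def mine_local(records: Iterable[dict]) -> Counter:
    """Local level: count occurrences of each pattern at one site.

    Only the aggregated counts are returned, so raw records stay local.
    The "pattern" key is a hypothetical field name used for illustration.
    """
    return Counter(record["pattern"] for record in records)


def mine_global(local_results: List[Counter], min_support: int) -> Dict[str, int]:
    """Global level: combine local counts and keep patterns that reach
    the (assumed) minimum support across all sites."""
    total: Counter = Counter()
    for counts in local_results:
        total.update(counts)
    return {pattern: n for pattern, n in total.items() if n >= min_support}


# Two hypothetical sites, each mined independently.
site_a = [{"pattern": "bradycardia"}, {"pattern": "apnea"}]
site_b = [{"pattern": "bradycardia"}]

global_patterns = mine_global([mine_local(site_a), mine_local(site_b)], min_support=2)
```

In this sketch a pattern that is rare at every individual site can still surface globally, because its support is summed before the threshold is applied.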
As an example of a situation wherein multiple data sources are utilized, intensive care units worldwide use a range of medical monitoring equipment, such as medical devices for life support and critical monitoring. Many of these devices have been operational for over 50 years. Although the devices themselves may have evolved over time, generally these devices enable critically ill medical and surgical patients to be observed and treated in a complex, specialized environment by physicians and nurses trained in restoring and/or maintaining the function of vital organs. A diverse range of such devices display physiological data and many have the ability to output this data via serial, USB or other ports.
In addition to the collection of this data for use in real-time by care providers, it is desirable to enable secondary analysis of the data for other related clinical research. For example, such secondary analysis can enable the discovery of previously unknown trends and patterns that may be indicative of the onset of some condition. The potential for such secondary use of health data is significant. In an American Medical Informatics Association White Paper published in the Journal of the American Medical Informatics Association in 2007, entitled “Toward a National Framework for the Secondary Use of Health Data”, the urgency for infrastructures to support the secondary use of data in today's data intensive healthcare environment is characterized as pivotal to the U.S. health system.
PCT Application No. PCT/CA2010/001148 discloses a multi-dimensional temporal abstraction and data mining technology, the method comprising: collecting and optionally cleaning multi-dimensional data, the multi-dimensional data including a plurality of data streams; temporally abstracting the multi-dimensional data; and relatively aligning the temporally abstracted multi-dimensional data based on an at least one time point of interest.
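The two steps named above, temporal abstraction and relative alignment, can be illustrated with a minimal sketch. It is not the method of the cited application: the threshold-based LOW/NORMAL/HIGH abstraction, the `Interval` type, the function names and the sample values are all assumptions chosen for illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Interval:
    start: float  # seconds
    end: float    # seconds
    label: str    # abstracted state, e.g. "LOW"


def abstract_stream(samples: List[Tuple[float, float]],
                    low: float, high: float) -> List[Interval]:
    """Collapse time-ordered (time, value) samples into state intervals.

    Consecutive samples with the same state are merged into one interval.
    """
    def state(value: float) -> str:
        if value < low:
            return "LOW"
        if value > high:
            return "HIGH"
        return "NORMAL"

    intervals: List[Interval] = []
    for t, value in samples:
        label = state(value)
        if intervals and intervals[-1].label == label:
            # Same state as before: extend the current interval.
            intervals[-1] = Interval(intervals[-1].start, t, label)
        else:
            intervals.append(Interval(t, t, label))
    return intervals


def align(intervals: List[Interval], time_of_interest: float) -> List[Interval]:
    """Re-express interval times relative to a time point of interest,
    so that streams from different entities share a common zero."""
    return [Interval(iv.start - time_of_interest, iv.end - time_of_interest, iv.label)
            for iv in intervals]


# Hypothetical heart-rate samples: (seconds since start, beats per minute).
heart_rate = [(0, 120), (60, 125), (120, 95), (180, 90), (240, 130)]
abstractions = abstract_stream(heart_rate, low=100, high=160)
aligned = align(abstractions, time_of_interest=300)
```

After alignment, negative times denote intervals that precede the time point of interest, which allows abstractions from different streams or entities to be compared on a common relative timeline.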
The work of Abdel-Rahman, Jeremic and Tan (2009) (cited below) tests two types of models, namely empirical Bayesian and autoregressive moving average, to determine the future state of the same stream. The present invention proposes a method to perform research to determine an association with a seemingly independent event from the streams and other entity data that is analyzed.
The work of Apiletti et al (2009) (cited below) proposes an approach for temporal analysis that does not support sub-classification based on entity attributes, such as patient characteristics in the case of healthcare. Further, the research does not propose a platform to perform multiple studies.
Studies such as Krueger et al (2010) (cited below) use traditional signal processing techniques on physiological data streams to perform statistical analysis of the heart rate variability temporal feature, as derived from the electrocardiogram (ECG) signal, to confirm a notion of different patterns when grouped by age. However, the resultant temporal features were not made explicit and are not translational for real-time observation.
A data warehouse model for healthcare to support data mining is proposed by Lyman (2008) (cited below); however, that model does not include the data model or data mining techniques for data streams such as physiological data streams or other data streams in healthcare.
Okascharoen et al (2007) (also cited below) propose a bedside prediction score for diagnosis of late-onset neonatal sepsis and in that work validate it with newly collected data. While the score incorporates the assessment of some clinical conditions (apnea/bradycardia), these conditions are deemed as present through traditional electronic health record charting of occurrence rather than real-time temporal abstraction profiling of the physiological streams to better understand the temporal behaviours in the streams.
In Sharek (2006) (cited below) a NICU-focused tool for adverse event detection is proposed and tested. The adverse events relate to drug dosages. However, the event detection is not via analysis of data streams but rather via chart review. The present invention enables the integration of drug information as data streams, for example from infusion pumps.
Verduijn et al (2007) (cited below) propose two temporal abstraction procedures for the extraction of meta features from medical data streams to enable the discovery of new abstractions or the use of abstractions from existing knowledge. However, the method of extraction is not part of an overall architecture to support multiple studies, and they focus on the proposition of specific approaches for both forms of temporal abstraction. In the present invention, temporal abstractions can be learned through exploratory mining for validation through explanatory mining, or they can be defined by a domain expert for explanatory mining testing only. In addition, in the present invention, the data representation of the abstractions directly correlates with the manner in which these abstractions could then be observed in real-time for future real-time condition/event onset detection.
Zhang (2007) and Zhang and Szolovits (2008) (cited below) propose a method for patient-specific real time adaptive monitoring in critical care. In that work 8 hours of training data is required to train the model on the current state of the patient from which deviations can be detected. There was no automated systemic approach to data collection. A trained observer annotated data and a laptop computer was connected during the study windows to collect the data. The stream data was not assessed based on temporal features correlating to a rule set. In the present invention, a systemic approach to longitudinal multi-dimensional data stream capture is proposed and the assessment of the data is based on the construction of temporal features either as simple or complex temporal features.
Griffin and Moorman (2001) (cited below) propose a method for the early diagnosis of neonatal sepsis and sepsis-like illness using novel heart rate analysis. That method uses the analysis of ECG only and performs feature extraction based on the presence of heart rate variability. The method of extraction is not part of an overall architecture to support multiple studies. The method does not support multi-dimensional data analysis. The present invention proposes a method to perform a study such as that detailed. It proposes an approach to define the temporal abstractions that are the results from this study. It enables the completion of this study together with other studies. It supports a systemic approach for the collection of data streams and other static data to support the research.
The following include references that may be pertinent to the present invention, including references referred to above.
    Abdel-Rahman, Y., Jeremic, A., & Tan, K. (2009). Neonate Heart Rate Prediction. 31st Annual International Conference of the IEEE EMBS (pp. 4695-4698). Minneapolis, Minn., USA: IEEE.
    Apiletti, D., Baralis, E., Bruno, G., & Cerquitelli, T. (May 2009). Real-Time Analysis of Physiological Data to Support Medical Applications. Information Technology in Biomedicine, Vol. 13, No. 3, pp. 313-321.
    Bjering, H., & McGregor, C. (2010). A Multi-dimensional Temporal Abstractive Data Mining Framework. Proc. 4th Australasian Workshop on Health Informatics and Knowledge Management (Conferences in Research and Practice in Information Technology, Vol. 108, pp. 29-38). Brisbane, Australia: Australian Computer Society, Inc.
    Blount, M., Ebling, M. R., Eklund, J. M., James, A. G., McGregor, C., Percival, N., et al. (2010). Real-Time Analysis for Intensive Care: Development and Deployment of the Artemis Analytic System. IEEE Engineering in Medicine and Biology Magazine, 110-118.
    Catley, C., Smith, K., McGregor, C., & Tracy, M. (2009). Extending CRISP-DM to incorporate temporal data mining of multi-dimensional medical data streams: A neonatal intensive care unit case study. 22nd IEEE International Symposium on Computer-Based Medical Systems, 2009 (pp. 1-5). Albuquerque, N. Mex.: IEEE.
    Catley, C., Smith, K., McGregor, C., James, A., & Eklund, J. M. (2010). A Framework to Model and Translate Clinical Rules to Support Complex Real-time Analysis of Physiological and Clinical Data. IHI '10. Arlington, Va., USA: ACM.
    Eklund, J. M., McGregor, C., & Smith, K. (2008). A Method for Physiological Data Transmission and Archiving to Support the Service of Critical Care Using DICOM and HL7. IEEE EMBS conference. Vancouver.
    Griffin, P., & Moorman, R. (2001). Toward the early diagnosis of neonatal sepsis and sepsis-like illness using novel heart rate analysis. Pediatrics, vol. 107, no. 1, pp. 97-104.
    Heath, J. (2006). A Framework for an Intelligent Decision Support System (IDSS) Including a Data Mining Methodology, for Fetal-Maternal Clinical Practice and Research. School of Computing and Mathematics. Sydney, University of Western Sydney, Australia.
    Ho, T., Kawasaki, S., Quang, L., Takabayashi, K., & Yokoi, H. (2004). Combining Temporal Abstraction and Data Mining to Study Hepatitis Data. SIG-KBS.
    Holmes, H. J. (2007). Intelligent data analysis in biomedicine. Journal of Biomedical Informatics, 40: 605-608.
    Kamaleswaran, R., McGregor, C., & Eklund, J. M. (2010). A Method for Clinical and Physiological Event Stream Processing. 32nd Annual International IEEE EMBS Conference (p. 4). Buenos Aires, Argentina: IEEE.
    Krueger, C., van Oostrom, J. H., & Shuster, J. (2010). A Longitudinal Description of Heart Rate Variability in 25-34-Week-Old Preterm Infants. Biological Research for Nursing, 11(3): 261-268.
    Lyman J., S. K. (2008). The Development of Health Care Data Warehouses to Support Data Mining. Clin Lab Med, 28: 55-71.
    McGregor, C. P. (July 2010). Patent No. 089705-0009. Canada, Gatineau Quebec.
    McGregor, C., Purdy, M., & Kneale, B. (2005). Compression of XML Physiological Data Streams to Support Neonatal Intensive Care Unit Web Services. IEEE International Conference on e-Technology, e-Commerce, and e-Service (pp. 486-489). Hong Kong: IEEE.
    Okascharoen, C., Hui, C., Caimie, J., Morris, A. M., & Kirpalani, H. (2007). External validation of bedside prediction score for diagnosis of late-onset neonatal sepsis. Journal of Perinatology, 496-501.
    Sharek, P. J., Horbar, J. D., Mason, W., Bisarya, H., Thurm, C. W., Suresh, G., et al. (2006). Adverse Events in the Neonatal Intensive Care Unit: Development, Testing, and Findings of an NICU-Focused Trigger Tool to Identify Harm in North American NICUs. Pediatrics: Official Journal of the American Academy of Pediatrics, 1332-1340.
    Stacey, M., McGregor, C., et al. (2007). An Architecture for Multi Dimensional Temporal Abstraction and its Application to Support Neonatal Intensive Care. Engineering in Medicine and Biology Society. IEEE/EMB.
    Tong, C., Sharma, D., & Shadabi, F. (2008). A Multi-Agents Approach to Knowledge Discovery. IEEE/WIC/ACM conference.
    Verduijn, M., Sacchi, L., Peek, N., Bellazzi, R., de Jonge, E., & de Mol, B. (2007). Temporal abstraction for feature extraction: A comparative case study in prediction from intensive care monitoring data. Artificial Intelligence in Medicine, 41: 1-12.
    Zhang, Y., & Szolovits, P. (2008). Patient-specific learning in real time for adaptive monitoring in critical care. Journal of Biomedical Informatics, 41: 452-460.
    Zhang, Y. (2007). Real-time Development of Patient-specific Alarm Algorithms for critical care. IEEE EMBS conference.
There is a need for computer systems, methods and computer programs for execution on computer systems, that address the requirements mentioned above.