There has been an increasing interest in analyzing large, longitudinal data sets, including data collected from the routine operation of systems with which people interact over time (“real world data”), especially in the field of health care. For example, real world data, in contrast to clinical data from controlled, randomized trials, is analyzed to help evaluate safety, effectiveness and/or value of pharmaceuticals and medical devices. Such real world data can include medical records, insurance claims, prescription data and the like.
There are established principles from the fields of epidemiology and health research for performing analytics on longitudinal, real world data. For example, there are well-known statistical methods that can be applied to data to assess the safety, effectiveness and/or value of medical treatments or interventions (“treatments”, which include, but are not limited to, prescription and over-the-counter drugs, medical devices, and procedures). Generally, assessments compare a treatment to no treatment, while comparative assessments compare a treatment to another treatment. The main challenge in dealing with real world data rather than randomized trial data is that patients generally receive treatments not randomly but rather because they require treatment for a medical condition. Thus, it can be difficult to assess whether a treatment reduces the risk of a medical outcome when the patients who receive it are, because of the underlying condition, already at elevated risk for that outcome. Statistical adjustment techniques can help reduce such confounding effects.
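As an illustration only, the following minimal sketch shows how confounding of this kind can arise and how one well-known adjustment, the Mantel-Haenszel stratified risk ratio, can reduce it. The patient counts, the two baseline-risk strata, and all names in the code are hypothetical and are chosen solely so that treated patients are concentrated in the high-risk stratum; stratification is only one of several possible adjustment techniques.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stratum:
    """Outcome counts for one baseline-risk stratum (hypothetical data)."""
    treated_events: int
    treated_total: int
    untreated_events: int
    untreated_total: int


# Hypothetical counts: treated patients are mostly high-risk, as would be
# expected when treatment is given because of a medical condition.
STRATA = [
    Stratum(treated_events=160, treated_total=800,
            untreated_events=60, untreated_total=200),   # high baseline risk
    Stratum(treated_events=10, treated_total=200,
            untreated_events=80, untreated_total=800),   # low baseline risk
]


def crude_risk_ratio(strata):
    """Risk ratio computed after pooling all strata (no adjustment)."""
    treated_events = sum(s.treated_events for s in strata)
    treated_total = sum(s.treated_total for s in strata)
    untreated_events = sum(s.untreated_events for s in strata)
    untreated_total = sum(s.untreated_total for s in strata)
    treated_risk = treated_events / treated_total
    untreated_risk = untreated_events / untreated_total
    return treated_risk / untreated_risk


def mantel_haenszel_risk_ratio(strata):
    """Mantel-Haenszel summary risk ratio across strata."""
    numerator = 0.0
    denominator = 0.0
    for s in strata:
        stratum_size = s.treated_total + s.untreated_total
        numerator += s.treated_events * s.untreated_total / stratum_size
        denominator += s.untreated_events * s.treated_total / stratum_size
    return numerator / denominator


if __name__ == "__main__":
    print("crude risk ratio:   ", round(crude_risk_ratio(STRATA), 3))
    print("adjusted risk ratio:", round(mantel_haenszel_risk_ratio(STRATA), 3))
```

With these hypothetical counts, the pooled comparison makes the treatment appear to increase risk, while within each stratum, and in the stratified summary, the treated patients have the lower risk.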
To use these statistical techniques, complex queries are applied to the data to define different groups of patients, and to extract the patients' detailed data, which are then submitted to analytical software to evaluate safety, effectiveness or value. One challenge with applying these statistical methods to real world data is the performance of the computer system in accessing the data.
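By way of a simplified, hypothetical sketch, a query of this kind might define two treatment groups by each patient's first dispensing of a drug and then extract each patient's records from that date onward. The record layout, the field names (`patient_id`, `date`, `kind`, `code`), the drug and diagnosis codes, and the rule of excluding patients who received both drugs are all assumptions made for illustration and are not drawn from any particular database.

```python
from datetime import date

# Hypothetical longitudinal records: prescriptions ("rx") and diagnoses ("dx").
CLAIMS = [
    {"patient_id": 1, "date": date(2020, 1, 5), "kind": "rx", "code": "DRUG_A"},
    {"patient_id": 1, "date": date(2020, 2, 5), "kind": "rx", "code": "DRUG_A"},
    {"patient_id": 1, "date": date(2020, 3, 1), "kind": "dx", "code": "MI"},
    {"patient_id": 2, "date": date(2019, 12, 1), "kind": "dx", "code": "MI"},
    {"patient_id": 2, "date": date(2020, 1, 10), "kind": "rx", "code": "DRUG_B"},
    {"patient_id": 3, "date": date(2020, 2, 1), "kind": "rx", "code": "DRUG_A"},
    {"patient_id": 3, "date": date(2020, 4, 1), "kind": "rx", "code": "DRUG_B"},
    {"patient_id": 4, "date": date(2020, 5, 1), "kind": "dx", "code": "MI"},
]


def first_dispensing(claims, drug_code):
    """Map each patient_id to the earliest date the given drug was dispensed."""
    index_dates = {}
    for claim in claims:
        if claim["kind"] == "rx" and claim["code"] == drug_code:
            pid = claim["patient_id"]
            if pid not in index_dates or claim["date"] < index_dates[pid]:
                index_dates[pid] = claim["date"]
    return index_dates


def build_cohorts(claims, drug_a, drug_b):
    """Define two treatment groups; patients dispensed both drugs are excluded."""
    a_dates = first_dispensing(claims, drug_a)
    b_dates = first_dispensing(claims, drug_b)
    group_a = {pid: d for pid, d in a_dates.items() if pid not in b_dates}
    group_b = {pid: d for pid, d in b_dates.items() if pid not in a_dates}
    return group_a, group_b


def follow_up_records(claims, cohort):
    """Extract each cohort member's records dated on or after the index date."""
    return {
        pid: [c for c in claims
              if c["patient_id"] == pid and c["date"] >= index_date]
        for pid, index_date in cohort.items()
    }


if __name__ == "__main__":
    group_a, group_b = build_cohorts(CLAIMS, "DRUG_A", "DRUG_B")
    print("group A:", group_a)
    print("group B:", group_b)
    print("follow-up, group A:", follow_up_records(CLAIMS, group_a))
```

Even in this small sketch, every step scans the full set of records, and the extraction step does so once per patient, which suggests why data access can dominate the cost of such queries on large data sets.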
Beyond statistical and performance issues, investigators wishing to characterize the safety, effectiveness or value of treatments face challenges in assembling data for the inquiry, in maintaining and preserving data to provide for reproducibility of results over time, in providing a clear audit trail of how the inquiry was carried out and with what specific data and methodologies, and in executing inquiries that span multiple databases. Issues of reproducibility and auditability are of particular importance to government regulators and to insurance carriers and others who pay for medical services.