Clinical trials require the collection, storage, analysis and reporting of large quantities of data. Clinical trial data includes not only the observations of disease progression and treatment effectiveness required to validate a new drug, but also data such as subject demographic information, operational data, and records of adverse side effects.
Clinical trial data is generally collected as a series of case report ‘forms’. These forms are designed specifically for each study, based on the particular protocol(s) to be followed during the study. The case report forms specify the type of information to be collected, such as subject identification, physical measurements, test results, and question and answer responses. These forms are typically filled out by medical doctors, nurses, technicians, or other personnel at each subject visit or interaction.
Typically, different forms are designed to record different types of information. For example, a study protocol may specify a series of regularly scheduled subject visits, and, accordingly, a particular form for entering the data recorded from each visit, for each subject. Similarly, demographic information, such as subject age, ethnicity, gender, etc., may be recorded on a specific demographics form. In another example, an adverse events form may be used to record data related to any time a subject experiences an adverse side effect over the course of a study.
Recently, electronic data capture (EDC) systems, such as Medidata Rave® and Oracle® InForm, have been developed to provide a way to collect this clinical trial information electronically, rather than via paper forms. These systems allow up-to-date forms for a particular study, referred to as electronic case report forms (eCRFs), to be accessed and data to be entered into them electronically. The collected clinical trial data is thereby automatically stored in a database associated with the EDC system.
Once collected, however, the raw clinical trial data still needs to be analyzed and processed by a variety of stakeholders within, or otherwise associated with, the clinical trial sponsor organization (e.g. a pharmaceutical company). These stakeholders include personnel such as medical doctors, statisticians, and managers who are responsible for monitoring, analyzing, and reporting data collected over the course of the clinical trial. For example, medical doctors responsible for clinical development may need to review clinical trial data daily or weekly to assess drug efficacy and/or safety. Additionally, a sponsor organization may employ data scientists to carry out biostatistical analysis of results. In another example, stakeholders associated with pharmacovigilance monitoring must assess and report adverse event information to drug regulatory authorities.
Thus, clinical trial data is used by a variety of different stakeholders who perform a variety of different functions and, accordingly, may interact with different subsets of clinical trial data in different ways. In particular, stakeholders may need to analyze different subsets of data originating from one or more different types of eCRFs from one or more different clinical studies.
Consequently, before beginning any advanced analysis of clinical trial data, stakeholders must perform a data consolidation process in order to extract and organize the particular clinical trial data that is relevant to their particular application. Portions of this clinical trial data may be spread across different eCRFs and/or different clinical studies. An integral part of this data consolidation process is the merging of different data sets retrieved from EDC, Clinical Trial Management System (CTMS), and Clinical Data Management System (CDMS) sources. Systems that retrieve clinical trial data from EDC, CTMS, and/or CDMS sources therefore require the functionality to add, merge, modify, and process data from different data sets, including data from one or more eCRFs from one or more clinical trial studies.
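By way of illustration only, the kind of merge described above can be sketched in a few lines of Python. The sketch assumes two hypothetical data sets exported from a demographics eCRF and an adverse events eCRF of the same study, joined on a hypothetical `subject_id` field. The field names, the sample rows, and the `left_merge` helper are all assumptions made for this example and do not correspond to the interface of any particular EDC, CTMS, or CDMS product.

```python
from collections import defaultdict

# Hypothetical rows exported from two eCRFs of the same study (illustrative data only).
demographics = [
    {"subject_id": "S001", "age": 54, "gender": "F"},
    {"subject_id": "S002", "age": 61, "gender": "M"},
    {"subject_id": "S003", "age": 47, "gender": "F"},
]
adverse_events = [
    {"subject_id": "S001", "event": "nausea", "severity": "mild"},
    {"subject_id": "S001", "event": "headache", "severity": "moderate"},
    {"subject_id": "S003", "event": "rash", "severity": "mild"},
]


def left_merge(left_rows, right_rows, key):
    """Left-join two lists of dicts on `key`.

    Every left row is kept. A left row with several matches on the right
    yields one output row per match; a left row with no match is padded
    with None for the right-hand fields. If both sides share a non-key
    field name, the right-hand value wins.
    """
    # Index the right-hand rows by the join key.
    right_by_key = defaultdict(list)
    for row in right_rows:
        right_by_key[row[key]].append(row)

    # All right-hand field names except the join key.
    right_fields = sorted({f for row in right_rows for f in row if f != key})

    merged = []
    for left in left_rows:
        matches = right_by_key.get(left[key])
        if not matches:
            merged.append({**left, **{f: None for f in right_fields}})
            continue
        for right in matches:
            merged.append({**left, **{f: right.get(f) for f in right_fields}})
    return merged


merged = left_merge(demographics, adverse_events, "subject_id")
for row in merged:
    print(row)
```

Even this small case requires decisions about the join key, the handling of subjects with no matching record, and one-to-many relationships between forms. These are the decisions that a stakeholder without programming experience would otherwise have to delegate to a programmer.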
Providing systems and methods capable of merging this particular type of data, obtained from different data sources in the clinical domain, is non-trivial. Not only must such systems and methods enable data set merging in order to perform advanced analysis, but they must do so in a way that satisfies the diverse needs of a variety of stakeholders who may utilize several different systems for collecting, storing, and analyzing clinical data. In particular, the systems and methods for merging data should enhance the capabilities of existing clinical data collection or analysis systems, without requiring a change in their core functionality. Moreover, it is important that the systems and methods for merging clinical trial data are portable, such that they can readily be integrated within different clinical systems (e.g. different software applications that may be used to collect, store, and analyze clinical trial data). Approaches that are platform-agnostic, and can therefore be implemented and used with different platforms and technologies (e.g. different database systems, different types of computing devices), are also highly desirable. Pluggable architectures are advantageous, as they allow the data merging functionality to be used within a variety of different client applications that may be familiar to different types of stakeholders.
Finally, there is a significant need for systems and methods for merging data sets that provide these capabilities through a user interface (UI) that does not require programming skills to use. Many stakeholders who work on or with clinical trial data do not have a background in programming and must either spend significant time and effort to accomplish data preparation tasks that require writing computer code, or rely on the support of programmers to prepare data before they can use it. Providing powerful functionality to merge and manipulate clinical trial data would enable the many stakeholders who add significant value in aspects of clinical development such as reporting, analysis, and decision making to perform their functions without facing a bottleneck in retrieving and preparing the data they use to accomplish their tasks.
Existing EDC systems and Clinical Data Management Systems (CDMS) do not provide these capabilities. For example, systems such as Medidata Rave® capture, manage, and provide patient data, but do not natively include functionality to merge data from two or more eCRFs. Similarly, systems such as Oracle® InForm also lack the functionality to merge two or more eCRFs (even if the eCRFs are from the same study). Finally, although commercial data integration systems provide functionalities to store and manage clinical data, including the ability to merge data from two data sets, they lack the requisite flexibility and user-friendliness described above. For example, current data integration systems cannot be plugged into other systems and must be operated as stand-alone solutions.
Moreover, current systems do not provide an interactive user interface for defining complex data merging operations. Instead, for example, data merging processes in SAS® Clinical Data Integration System are defined by SAS® code. Accordingly, performing data merging operations requires a skilled SAS® programmer to write code. This forces stakeholders to either learn a complex programming language, or to rely on programmers in order to accomplish their data consolidation needs.
There exists, therefore, a need for systems and methods that provide a portable, platform-agnostic, pluggable data merging technology that enables a user to merge data from different sources of clinical trial data in order to perform advanced analysis. Moreover, there is a need for systems and methods that provide this functionality, and enable a user to leverage it, without requiring the user to write complex computer code.