1. Field of the Invention
The present invention relates to a system for portable digital data capture and data distribution. More particularly, the present invention is related to an object-oriented computerized data capture and data distribution system that is employed to construct a portable digital data capture project from component objects and the like, to forward subsets of the project for the purpose of registering transactions, and to persistently save, recall, reconcile, and share the project. A real-world digital data capture and distribution project consists of four major components. First is the definition of the project: what data is to be captured, how input is to be made, what are the protocols for acceptable input, and how it is to be captured. Second is the logistics management of the project. This includes the functions and relationships for information such as: users with access to the project, user assignment details, forwarding project components to the field, managing changes and additions to the project, report distribution, and the like. Third is the capture of the data, which is conducted mostly by mobile personnel. Fourth is the design, development and distribution of reports on the data collected.
2. Background
A typical data capture and data distribution system employed in healthcare industry contexts uses a piecemeal software approach with many steps, each of which involves complex, redundant, and subjective human interaction on many trivial aspects of the process. In the definition stage of the project, text-based forms for data entry are manually designed, developed and distributed in the field. The content of the forms and criteria for each data field in the forms are designed and then developed into a template, using software (word processor, spreadsheet, form maker, etc.). These forms may or may not have individual field restrictions for data entry. The software determines the format of these templates. The software with which the template was developed and the template itself are e-mailed or hand delivered and then loaded onto a device with memory and an operating system. In the data capture stage, a human completes the work according to their interpretation of the data collection criteria and protocol. The human then enters input into the text-based forms as a response. The user saves the completed template under a file name and distributes it by hand delivery, mail, e-mail or fax machine. Responses are then separately gathered from the field workers, and the data is entered into a database for storage, analysis, and development into reports. Reports on these responses are then separately designed, developed and distributed. These reports are distributed by hand delivery, mail, e-mail or fax machine. The text-based forms and the resulting response data are generally stored in any combination of the following: word-processing documents, spreadsheet documents, e-mails, and paper files, which may or may not have links to external databases. These external databases supply some data that the human user refers to when entering data into the text-based form.
During each step of the data capture and distribution project using this system, many components of the project involve the management of logistics. Text-based forms are organized, copied and distributed, distribution of and changes to forms are tracked and the return of forms is managed. In addition, a manager checks for correctness of input by users; requests, manages, distributes and tracks what specific materials are to be reviewed or measured; distributes information on where these materials are located; etc. Even though some software and technology has been applied to data capture and data distribution, there are many opportunities for human error because the conventional system requires numerous human interactions with the project. Such errors include but are not limited to a loss of forms, distribution of out of date forms and then the collection of data with these forms, redundancies of assignments, misfiling, and data entry errors.
An important component of data capture and data distribution of the present invention is managing the logistics of data collection assignments. Logistics include but are not limited to the definition of an assignment, the tracking and managing of form creation/distribution, organizing the deployment of personnel and materials involved in a data capture project and distribution of assignments. Currently, a system for managing the logistics uses a separate piecemeal software approach to the problem as well. Logistics data is generally stored in any combination of the following: word-processing documents, spreadsheet documents, e-mails, and paper files, which may or may not have links to external databases. These external databases supply lists that are used for different components of logistics, such as assignments. The content of an assignment is predetermined. However, the format of assignments is determined by the software used (word processor, spreadsheet, form maker, etc.). E-mail or hand-delivery distributes the software with which the assignment format was developed, along with the assignment data. Managers who input and track assignment data access the software. Investigators in the field are distributed assignment data that pertains to them and refer to the assignment data, which is input in the data collection template. Any management or tracking of reports on the status of work assigned or the field workers assigned are separately developed, aggregated and then distributed by hand delivery, e-mail or fax machine as well.
To understand the conventional system and to visualize how a computerized system of the present invention has tremendous commercial value, an example of a pharmaceutical company's data capture and data distribution project is presented for the purposes of explanation. A pharmaceutical company (also referred to herein as “pharma”) designs the templates and the protocols for input on each item that is to be measured or reviewed by clinical trial investigators during real-world data capture and data distribution in a Phase IV clinical trial. A data collection template that requires design, development, and distribution consists of the following items: question text; spaces to input responses; blanks for entry of demographic information about the review assignment (investigator name, address of the review, medical record number of the record being reviewed, date of the review, etc.); and directions about how to complete and save the template. To create a template, all of these components are developed into a word processing software or spreadsheet software template for data entry. Investigators receive the platform-dependent software from which the template was developed by hand delivery or e-mail. This software is loaded onto a specific hardware platform with memory, such as a laptop computer. In addition to the software, investigators receive, by e-mail, mail, fax or hand delivery, the text-based template for input of responses. Investigators also receive a separate list of medical record numbers of patients whose data is collected for the trial, along with a list of the names and addresses of the doctors whose patient records are to be reviewed as a part of this clinical trial. The data on these lists, which comes from external databases, is entered by the investigator into the template as text responses. When the investigator has made all inputs, the template is saved with a file name, then mailed, e-mailed, hand-delivered or faxed to pharma headquarters. At pharma headquarters, the inputs are entered into an external database, and reports on these inputs are designed, developed and distributed by mail, e-mail, hand-delivery or fax.
In the example of a Phase IV clinical trial, assignment data consists of numerous items that all require definition, organization, management and tracking. These items include but are not limited to the name of the person requesting the review and the date of the request; the type of template to be used; the name of the doctor(s) and/or patient(s) record(s); and the complete demographic data, including suite number. Additional items include the investigator assigned and the date scheduled for the review; the completion date; and investigator or manager comments. To manage assignments, a template is developed for data entry of these items into word processing software or spreadsheet software. Investigators receive by hand-delivery or e-mail, the platform-dependent software from which the assignment templates were developed. This software is loaded onto a specific hardware platform with memory such as a laptop. Investigators refer to the assignment data that pertains to them and input part of the assignment data as a response in the data capture template. Investigators may also receive separate lists of medical record numbers along with separate lists of the doctors' names and addresses. The data on these lists are to be input into the templates as a separate response. When all inputs are made, the investigator inputs the date of completion into the assignment template, saves it with a file name, then e-mails, hand-delivers or faxes to pharma headquarters. At pharma headquarters, assignment data is removed from the template and entered into an external database. Mail, e-mail, hand-delivery or fax distributes management and tracking reports, prepared separately.
Such an approach has the advantage of supporting uniformity of data capture and assignment templates among users who are given the same templates. The software automates the scoring process, thus eliminating calculation errors. Also, additional data entry can be eliminated if responses can be imported from the software with which they were developed into an external database. In a system implemented with great attention to hardware and software version compatibility, data files created with one version of the software can be viewed by other versions of the software regardless of the hardware platform. This approach works well if questions, response choices, scoring, users, hardware, and assignments rarely change and quality assurance is performed on each of the data files. Quality assurance will ensure that data entered in responses are in accordance with predetermined criteria and that data entered from information on lists from external databases have been copied exactly, eliminating what appears as a duplicate entry but is actually not a duplicate entry.
A system such as described above lacks the speed, sophistication and flexibility to distribute, track, organize and manage changes or additions made to any or all of the components of the data capture/data distribution process at the same time without disturbing any aspect of the process. Components of a data capture and distribution process, such as logistics information management and data capture templates, are modified and added often, in order to fulfill complex data capture and data distribution needs. For example, during a clinical trial, investigators' inputs into templates may reveal that a change in the protocols for the templates is required. These changes to templates must be developed and then distributed quickly to all staff as required. Furthermore, multi-platform multi-software, and multi-version software support is a serious burden for the producer of the templates, the manager of the project logistics, the investigators making inputs, the staff creating and distributing reports, and the individuals to which the reports are distributed. All required reports are designed and developed as a separate component to the project.
To properly conduct data capture and data distribution, in healthcare or otherwise, a computerized data capture and data distribution system (CDCDS) must do more than present separate templates for data entry of text responses and text assignment requests, which can be imported into external databases. Developing templates for each change to a data capture and distribution project is costly, time consuming, and labor intensive. When done with the above system, managing the logistics of distributing changes to data capture templates and assignment data to the appropriate personnel is replete with errors and redundancies. Developing and distributing reports on the data collected is labor intensive, time intensive and programming intensive.
Data capture and distribution must be considered as a whole, from the design and development of project definition, to the data capture and the management of logistics for the project, and all the way through to distribution of reports. The data structures and the flow of information must support all of these components together and must remove redundant and trivial tasks, thereby streamlining and automating the process. There must be a complete representation of the relationships between all the components of the project. For example, the system used during a data capture and distribution project must incorporate the protocols for how input is made, what data is referenced as part of the inputs, what reports are needed, and what parts of the project are to be distributed to whom. In conjunction with these needs, there must be a complete representation and visualization of the relationships between the logistics components of the project. This would help prevent overlap in a situation where a site audit is being conducted. For example, if the manager can see that there are two doctors at the same site, then there is no need to do a separate site review for each of the doctors who share the site. In the conventional system, items are entered into a database separately without the ability to visualize other relationships between items in the database. Doctor A at 123 Main St. of the “Temple Medical Practice” will not be seen as grouped with Doctor B at 123 Main St. of “Temple Medicine.” All of the project components and any changes or additions to the project components during the completion of a data capture and distribution project need to be speedily communicated to all designated parties. However, the most efficient strategy for organizing and storing data for capture and distribution relates not to the text alone but also to the properties, relationships, and functions of each of the components, and to the messages passed between the components and any of the parties involved.
The data types expressed in data capture and data distribution with the conventional piecemeal software approach vary widely between software (spreadsheets, word processing, file maker, etc.). Therefore, it is not practical to express all of the possible combinations of data types within the software programs that are part of this system. Data from differing software may be simultaneously required in arbitrary combinations by a user. Therefore, multiple unrelated software-specific tools cannot be employed. Human interaction is required to manually review and match the project requirements with staffing, information, and reporting needs.
CDCDS Requirements
A CDCDS must solve the problems described above by providing flexible programming tools that allow a user having domain-specific expertise to develop programs and data structures into “schemas” relevant to the data capture and data distribution project requirements of any such domain. For example, in a phase IV clinical trial of a diabetic medication, certain programmed domain-specific components (“objects”) will be integrated in a schema to capture information on dates of medication orders, and the information on test results. If a phase IV clinical trial were to be conducted on a biomedical device, different objects would be developed whereby the objects representing barcode data capture or device specific data from biomedical hardware can be integrated into a project. A user of the CDCDS employs one or more such schemas that can be combined and integrated in arbitrary combinations in conjunction with a single project. The users must be able to customize the combination of objects and their relationships and functions without additional programming. A user with project expertise is responsible for the identification of the objects and the relationship to other objects in an environment. A CDCDS must provide the ability to mark objects with certain functions specific to the project and mark the messages that will be passed between objects. A CDCDS must also reveal to users a visual representation of relationships in the project, in order to fully manage the flow of information and automate the organization and management of the logistics of a project. For example, in a Phase IV trial, investigators will receive a project subset forwarded to them for data input. The input made by investigators in this project subset is then reconciled with the project. During reconciliation, the investigator's project subset will be changed, reflecting updates made to the project by other users. An example of an update would be a change in protocols for the clinical trial. This, in turn, will effect changes in the data capture project as a whole, and these changes need to be forwarded to other users.
To accomplish such goals, the CDCDS must address the following concerns:
a. Data Portability and Longevity
In large organizations, groups involved in portable data capture and data/report distribution often work on multiple different types of mixed hardware and operating system configurations (“platforms”). Moreover, the life cycles of a project will often exceed the lifetime of one or more of such platforms. Accordingly, it is essential that CDCDS data that originates on one platform be useable on any other platform without translation. As a result, the CDCDS does not constrain the otherwise natural progression to the most cost effective computer systems. Furthermore, a project defined by such CDCDS data can be archived and reactivated years later on a new platform without any loss of integrity. Similarly, the type, the meaning, and the flow of the information in a project can change dramatically throughout the project life cycle. For example, a project may be changed to include signature capture during the data capture and distribution project because changes in industry regulations now require this type of data capture. Or a question response type may need to be changed from a yes/no response choice to a yes/no/NA response choice because investigators reported the need for the additional response category after initial data capture in the field. It must therefore be possible to refine and revise the schemas that are used by the project (i.e., allow for “schema evolution”) without jeopardizing the integrity of the previously created CDCDS data.
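By way of illustration only, and not as a description of any actual implementation, the yes/no to yes/no/NA example above can be sketched in Python. The sketch assumes that every response records the schema version under which it was captured, so that earlier data remains valid after the schema is revised. The names QuestionSchema, Response, and SchemaRegistry are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QuestionSchema:
    """One version of a question definition (hypothetical structure)."""
    question_id: str
    version: int
    choices: tuple


@dataclass
class Response:
    question_id: str
    schema_version: int  # version in force when the data was captured
    value: str


class SchemaRegistry:
    """Keeps every schema version so that old responses stay interpretable."""

    def __init__(self):
        self._versions = {}

    def register(self, schema):
        self._versions[(schema.question_id, schema.version)] = schema

    def latest(self, question_id):
        candidates = [s for (qid, _), s in self._versions.items() if qid == question_id]
        return max(candidates, key=lambda s: s.version)

    def is_valid(self, response):
        # Validate against the version the response was captured under,
        # not against whatever version happens to be newest.
        schema = self._versions[(response.question_id, response.schema_version)]
        return response.value in schema.choices


registry = SchemaRegistry()
registry.register(QuestionSchema("q1", 1, ("yes", "no")))
old = Response("q1", 1, "yes")  # captured before the revision

registry.register(QuestionSchema("q1", 2, ("yes", "no", "NA")))
new = Response("q1", registry.latest("q1").version, "NA")
```

In this sketch the earlier response is never rewritten. A revision only adds a new schema version, which is one possible way to evolve a schema without disturbing previously created data.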
b. Data Integrity
A CDCDS stores valuable information. However, the value of the information can only be secure if the data capture and distribution project created by the program is standardized, reliable and accessible. To ensure that the data in a CDCDS project maintains internal consistency, it is necessary that such data always be accessed and modified by the same schemas that defined and created the project. It is therefore essential that schemas be easily accessed and ubiquitous with respect to the CDCDS project. Moreover, a CDCDS must minimize the need to produce and distribute copies of the CDCDS project. When multiple copies of the same project exist, any individual copy stands a greater chance of being rendered partially or wholly obsolete. For example, many investigators will access a data capture project, such as a phase IV clinical trial to input data at a remote site, during a review of medical records. In addition, managers will be adding assignment requests to the project and researchers will be modifying project protocols for the clinical trial. These changes need to be made without interrupting the workflow or the flow of information for any of the users. Another example of the need to ensure data integrity is when assignment requests from managers need to be forwarded to the appropriate investigators during the clinical trial. This data needs to be forwarded as part of a project subset. Simply supplying investigators with a blank field for data entry of assignment demographics and an assignment list does not guarantee that the data is consistent for the manager requesting assignments and for the investigator inputting data several different times at a remote site. Errors are rampant when a human user copies input between components of a project. When a report on this data is supplied, these data entry errors skew project results.
c. Data Accuracy
A CDCDS aids in the capture and distribution of data, whose accuracy is very important to an organization. For example, in a clinical trial, the Food and Drug Administration (FDA) monitors data very closely for incorrect or missing input. In order to reduce input error and thus ensure greater compliance with input protocols, a CDCDS must allow the researcher to incorporate a level of “intelligence”, including the complex logic of protocols, within a project. This intelligence will restrict the user from inputting data that is not in accordance with protocols, or will prompt the user to choose a correct input. The logic programmed into a project may even supply input in response to prior input. Such complex logic in a project must go beyond the conventional systems' ability to restrict data entry by programming a field-input mask. A CDCDS must allow users to customize the design and development of projects that will advance to, skip over and complete input according to the protocols that have been programmed in the project, without further investigator input. For example, in the clinical trial, when the patient birth date is entered, the CDCDS-generated project will automatically input N/A wherever input does not pertain to that age range. Multiple protocols must be able to be developed into a tool and changes to protocols must be distributed easily. In addition, a CDCDS must guard against users twice entering what appears to the user to be duplicate data, yet is not an exact duplicate. For example, 123 Main St. is not a duplicate of 123 Main St (no period after St). In a clinical trial, patient records from these two addresses will not be collated together. Thus, repetitive information will be included in the project, making the project results inaccurate.
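By way of illustration only, the two safeguards described above can be sketched in Python: protocol logic that supplies N/A based on the patient's birth date, and a comparison key that treats "123 Main St." and "123 Main St" as the same entry. The question names, the age ranges, and the normalization rule are assumptions made for this sketch. The normalization removes punctuation and case only and would not, for example, equate "St" with "Street".

```python
import re
from datetime import date


def normalize_address(text):
    """Reduce an address to a comparison key: lowercase, no punctuation, single spaces."""
    cleaned = re.sub(r"[^\w\s]", "", text.lower())
    return " ".join(cleaned.split())


def age_on(birth_date, today):
    """Age in whole years on the given day."""
    years = today.year - birth_date.year
    if (today.month, today.day) < (birth_date.month, birth_date.day):
        years -= 1
    return years


# Hypothetical protocol: each question lists the inclusive age range it applies to.
QUESTIONS = {
    "pediatric_dose": (0, 17),
    "adult_dose": (18, 200),
}


def prefill_not_applicable(birth_date, today):
    """Return the answers the protocol can supply without investigator input."""
    age = age_on(birth_date, today)
    return {q: "N/A" for q, (lo, hi) in QUESTIONS.items() if not lo <= age <= hi}


same_site = normalize_address("123 Main St.") == normalize_address("123 Main St")
prefilled = prefill_not_applicable(date(1980, 5, 1), date(2024, 1, 1))
```

Here the investigator enters only the birth date; the questions outside the patient's age range are completed from the protocol table rather than by hand.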
d. Large and Complex Data Sets
The size of a typical CDCDS project can be quite large and complex and the project is often accessed using mobile hardware, which may have limited memory capacity. For example, a clinical trial project may require hundreds of inputs. The protocols programmed into a project may be complicated. In addition, complex logic that streamlines the workflow during data capture must be developed into the project. Additional inputs may be required based on previous input; inputs may be automatically repeated based on prior inputs; or an entirely new set of inputs must be made because of the previous input. The CDCDS must handle such large and complex projects efficiently and forward to the investigators only that subset of the project the investigator is working with. Investigators depend on the ability to access the project quickly and input the data quickly. The amount of information in a data capture project cannot be limited in a preset manner.
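By way of illustration only, forwarding to an investigator only the subset of the project that the investigator is working with can be sketched in Python. The Assignment record, the template identifiers, and the investigator names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Assignment:
    assignment_id: str
    investigator: str
    template_id: str


def subset_for(investigator, assignments, templates):
    """Select only the assignments and templates one investigator needs."""
    mine = [a for a in assignments if a.investigator == investigator]
    needed = {a.template_id for a in mine}
    return {
        "assignments": mine,
        "templates": {tid: templates[tid] for tid in sorted(needed)},
    }


assignments = [
    Assignment("a1", "lee", "chart_review"),
    Assignment("a2", "kim", "site_audit"),
    Assignment("a3", "lee", "chart_review"),
]
templates = {"chart_review": ["q1", "q2"], "site_audit": ["q3"], "unused": ["q4"]}

subset = subset_for("lee", assignments, templates)
```

The subset carries one copy of each template the investigator needs and nothing else, which keeps the amount of data placed on a device with limited memory proportional to that investigator's work rather than to the size of the whole project.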
e. Aggregate Data Across Different Projects
Data captured by users on different real-world projects must be aggregated for the purpose of complex analysis of the data. For example, a tool that is used to capture data for a clinical trial of an asthma medication may contain data that must be referred to for a clinical trial of a cardiac medication. These clinical trial projects are often managed and performed by different users. In addition, the projects may refer to different schema programs. For example, one project captures specific data types, text, and bar code data while another project captures signatures. These very different projects need to be able to refer to each other. However, the user may not know the relationship between projects when a project is designed. The organization of the components of the projects and the data must allow for sharing between projects without the need for complex forensic analysis of the data tables and additional database programming to incorporate the two projects or share data between projects.
f. Many Simultaneous Users
A CDCDS project is typically shared simultaneously by many users within an organization. In a Phase IV clinical trial, the managers, staff development, investigators, and medical directors will be involved in a shared CDCDS project. Some users require access for querying and inspecting inputs only, but others need access to add to or modify the project. Accordingly, the CDCDS must ensure that changes to the project are properly coordinated and that the project is kept in a consistent state at all times.
g. Many Simultaneous Schemas
Data capture and data distribution projects typically involve collaboration among several disciplines, each being represented by one or more schemas. A CDCDS is expected to facilitate the integration of the information created by each of the departments to allow easy and consistent access to users in other departments. Therefore, a CDCDS must store and manipulate information defined by multiple schemas simultaneously. Further, it must be possible for one schema to reference information defined in and maintained by another schema within the data capture and distribution project.
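By way of illustration only, a reference from one schema to information maintained by another schema can be sketched in Python. The sketch assumes that every object is stored under a schema name and an object identifier, and that a reference is written as "schema:object_id". The reference format, the schema names, and the Project class are hypothetical.

```python
class Project:
    """Holds objects from several schemas, keyed by (schema name, object id)."""

    def __init__(self):
        self._objects = {}

    def add(self, schema, object_id, data):
        self._objects[(schema, object_id)] = data

    def resolve(self, ref):
        """Follow a reference written as 'schema:object_id'."""
        schema, object_id = ref.split(":", 1)
        return self._objects[(schema, object_id)]


project = Project()
# An object defined by a logistics schema.
project.add("logistics", "site-7", {"address": "123 Main Street"})
# An object defined by a clinical schema that refers to the logistics object.
project.add("clinical", "review-1", {"site": "logistics:site-7", "score": 4})

review = project.resolve("clinical:review-1")
site = project.resolve(review["site"])
```

Because the clinical object stores a reference rather than a copy of the address, a correction made under the logistics schema is seen by every object that refers to it.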
h. Flexibility and Extensibility
A programmer, with the help of a project designer, typically refines a CDCDS to meet the changing needs of the user. Additionally, a CDCDS is refined by the end user to include user-defined extensions. Since every user has different requirements, the ability to customize the system “in the field” is essential. Project components change often and rapidly. A CDCDS must accommodate the user in making rapid changes to the project while tracking and managing the new project information so as to make it immediately accessible to the users in the field. In addition, as project components are added and manipulated by the users, the CDCDS must allow for the cross-referencing between project components for the purpose of viewing the evolution of the project and for viewing relationships between data capture and distribution projects. For example, in a clinical trial, the scoring methods, protocols and the response choices may change from project to project. It is valuable to track statistical similarities and the validity of data captured as part of a particular project.
i. Performance
Data capture and data distribution projects are characterized by complicated data sets that are accessible by users who are away from the office. Yet users demand speed and convenience when accessing a project. A CDCDS must be able to organize and store data such that access time to the data is optimized. For example, users at remote locations need access to changes in protocols without returning to the office and without interruptions in their data capture.
j. Ease of Use
A CDCDS user is presumed to be expert in a particular type of data capture. For example, in a diabetes project for Phase IV clinical trials, the user is knowledgeable about the disease state of diabetes and the design and development of this type of project. However, she is not necessarily a sophisticated computer user and is not likely to be willing to invest valuable time in extensive training. Furthermore, since multiple users from different departments will employ the same CDCDS, the expertise of the users will vary widely. Accordingly, use of a CDCDS must be simple, intuitive and familiar.
CDCDS Implementation
A successful CDCDS must incorporate a robust environment for programmers to implement schemas, must provide an easy-to-use environment for users to employ those schemas on real-world data capture and distribution projects, and must be easy to use in the field. Accordingly, the CDCDS implementation must include at least the following elements:
a. Schema Environment
Schemas must contain all necessary information to display, manipulate, revise, and query any data capture and distribution project. There cannot be any application-specific expertise built into the schema itself. Schemas must be portable so they can execute on any platform on which the CDCDS can execute. Schemas must also be inseparable from the project, must be flexible and expandable without requiring the original source code for recompilation, and must execute efficiently. Due to the size of hand-held hardware (the optimal choice for users working in the field) and the complexity of a project, the routines that process this information must do so in an efficient manner. Schemas must also be able to evolve over time such that they can be revised and extended as new requirements arise.
b. Application Framework
In order to manipulate the schema objects for the development of a project, the objects must be presented to the user in a familiar and easy-to-use environment, or “application framework.” The user interface programs must be portable across all platforms on which the CDCDS runs so those users can choose among appropriate platforms. However, the application framework itself must interact with the Native Operating System on which the framework executes. Such interaction must be transparent to the user.
c. Visualization of Data Relationships
In order to get the efficiency, speed and standardization of a CDCDS and reduce the amount of data capture needed to accomplish the goals of a project, the user must be able to visualize the relationships between all components of a project. Users must be able to easily visualize the overlap, redundancies and duplication in the project. This will prevent error in a data capture project and thus increase the speed of the project. For example, in the phase IV clinical trial, a data capture tool has been accessed and input made during a medical record review for Doctor Marcus Welby at 123 Main Street, Small Town USA. A different staff member already completed a review for Marcus Welby MD at 123 Main Street, Small Town USA. Ordinarily, without visualization of the relationship between these two assignments, the second review for this doctor would be performed. There would be no way to visualize the redundancy because his name appears as a different name. The only way a user would verify a similarity is to look up the data by doctor and by site and compare these two entries. With relationships between data tables and queries organized to visually reveal all relationships in an assignment, a manager requesting the assignment will immediately see the redundancy and can take steps to correct it. Additional efficiencies, other than detecting the above error, can be experienced with the ability to visualize relationships. For example, in the same clinical trial, a review is completed for Doctor Welby that includes capturing data on compliance with facility safety regulations. On a different date, the same review is to be performed for a different doctor at the same facility. Ordinarily, without visualization of the relationship between these two assignments, which reveals overlap of the review of the facility, a new assignment would be requested and duplicate data would be captured. Additional problems may occur for the project. For example, a duplicate review for the same facility conducted by a different staff member may result in a different score for this facility. Although duplicate reviews are sometimes conducted for inter-rater reliability (work comparisons) between reviewers, an unknown duplicate review with a different score will foul the entire clinical trial. With the ability to visualize the duplication of assignments, a manager can choose to accept the duplication or not.
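By way of illustration only, the overlap described above can be surfaced by grouping assignments on a normalized site key, so that two differently written doctor names at the same address appear together. The Python sketch below assumes that assignments are simple (doctor, address) pairs. The normalization rule and the third assignment, for a "Doctor Jones", are hypothetical.

```python
from collections import defaultdict


def site_key(address):
    """Comparison key for a site: lowercase, punctuation removed, single spaces."""
    kept = "".join(ch for ch in address.lower() if ch.isalnum() or ch.isspace())
    return " ".join(kept.split())


def overlapping_sites(assignments):
    """Return the sites that appear in more than one assignment."""
    by_site = defaultdict(list)
    for doctor, address in assignments:
        by_site[site_key(address)].append(doctor)
    return {site: doctors for site, doctors in by_site.items() if len(doctors) > 1}


assignments = [
    ("Doctor Marcus Welby", "123 Main Street, Small Town USA"),
    ("Marcus Welby MD", "123 Main Street, Small Town USA"),
    ("Doctor Jones", "9 Elm Road, Small Town USA"),
]

overlaps = overlapping_sites(assignments)
```

The result lists both name variants under the single site they share, leaving the manager to decide whether the second review is an intended reliability check or an unwanted duplicate.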
d. Distributed Components
To help prevent data obsolescence, a CDCDS must allow for having a certain subset of the project distributed out to users in the field. At the same time, a “live connection” to the distributed portion must be maintained in each project subset where it is referenced.
e. Tool Persistence
State information for project components must be maintained across editing sessions. Accordingly, objects must be dynamically reinstated each time the CDCDS is used to forward, view or manipulate the project.
f. Synchronous Data Reconciliation
When an object in a project is changed, other objects in the project may change as a result. A CDCDS must reveal relationships between objects so that the resulting changes to downstream objects may be executed. If these changes result in an invalid or inconsistent project, the changes do not affect the data that has been captured. Throughout the life of a project, multiple users access a CDCDS project on a real-time basis. The requirement to reconcile differing data sets created by users must not cause time delays. Certain sets of users access the CDCDS project to modify design and integration or to change other components of the project, such as logistics data. Other users access the project to input data. Each set of users will need to be updated with only the specific changes that affect their aspect of work in the project. The project will have changed many times, and users in the field will need to receive these updates without having their workflow interrupted and without having to return to a central location for synchronization. This reconciliation of data in a CDCDS project must occur while other users access the project. Users who are accessing and changing the design of the project must be able to reconcile with users in the field so that all users have access to the latest changes without complex file naming conventions and without distribution of multiple versions of a CDCDS project. For example, in the clinical trial, certain users will be accessing the CDCDS project to input data, while other users access the CDCDS project to make changes in the criteria. Both sets of users must be able to perform their tasks without loss of data or version control problems.
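The requirement that each set of users receives only the changes affecting its own work can be sketched as a sequenced change log filtered by area. This is an illustrative assumption about one possible mechanism, not the described implementation; the class names, the area labels, and the sample change descriptions are hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class Change:
    seq: int      # position of the change in the project history
    area: str     # hypothetical area label, e.g. "criteria" or "logistics"
    detail: str

@dataclass
class ProjectLog:
    changes: list = field(default_factory=list)

    def record(self, area, detail):
        """Append a change and give it the next sequence number."""
        change = Change(len(self.changes) + 1, area, detail)
        self.changes.append(change)
        return change

    def updates_for(self, areas, last_seen):
        """Return only changes newer than last_seen that touch the given areas."""
        return [c for c in self.changes if c.seq > last_seen and c.area in areas]

log = ProjectLog()
log.record("criteria", "Adolescent age range set to 13-17")
log.record("logistics", "Site visit reassigned")
log.record("criteria", "Skipped input no longer accepted")

# A field worker who has already received change 1 and only needs criteria changes.
field_updates = log.updates_for({"criteria"}, last_seen=1)
# A manager who has seen nothing yet and only needs logistics changes.
manager_updates = log.updates_for({"logistics"}, last_seen=0)
```

Because each user only tracks the last sequence number received, no file naming convention or separate project version is needed in this sketch.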
g. CDCDS Logic
A CDCDS project needs to aid users in the capture of data that is very important to an organization. These users have domain-specific expertise, and they usually access a CDCDS project that pertains to their expertise. For example, in a clinical trial, nursing staff will input nursing data, while pharmacists will input pharmaceutical data. In certain cases, a machine, such as a medical device, will input body temperature readings. Users often need to remember input protocols, and protocols change often. For example, in a clinical trial, specific protocols require an input if a patient is an adolescent. In addition, other protocols will require an input if an adolescent is a patient between the ages of 13 and 17. In order for inputs to be made correctly, an investigator must view the birth date of the patient, calculate the age of the patient, recall the protocol requirements and then make the appropriate input. A CDCDS project must aid the user by automatically referencing the designated protocol for input, then making the input automatically (i.e., entering a value automatically), and then revealing only the remaining input requirements. This complex logic in a CDCDS project needs to be flexible enough to change as the protocol changes. Since the CDCDS project automatically enters input, the user moves more quickly through the work and incorrect inputs are reduced. In addition, a data capture project may contain specific process protocols. For example, in a clinical trial, skipped input is not accepted, and input must be chosen from a limited list of choices. These process protocols must be programmed as part of the CDCDS project, which must be flexible and extensible enough to include changes to the process protocols that are immediately accessible to the users.
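The protocol logic described above can be sketched as follows, under the assumption that a protocol is held as data so that it can change without reprogramming. The field names (`smoking_status`, `guardian_consent`), the choice lists, and the function names are hypothetical; only the 13 to 17 age range, the ban on skipped input, and the limited choice lists come from the text.

```python
from datetime import date

def age_on(birth_date, on_date):
    """Whole years between birth_date and on_date."""
    years = on_date.year - birth_date.year
    if (on_date.month, on_date.day) < (birth_date.month, birth_date.day):
        years -= 1
    return years

# Hypothetical protocol table: a protocol change edits this data, not the code.
PROTOCOL = {
    "adolescent_age_range": (13, 17),
    "fields": {
        "smoking_status": ["never", "former", "current"],
        "guardian_consent": ["yes", "no"],
    },
    "adolescent_only": {"guardian_consent"},
}

def prepare_form(birth_date, visit_date, protocol=PROTOCOL):
    """Enter the age-derived value automatically and list the inputs still required."""
    low, high = protocol["adolescent_age_range"]
    adolescent = low <= age_on(birth_date, visit_date) <= high
    auto = {"is_adolescent": adolescent}
    remaining = [name for name in protocol["fields"]
                 if adolescent or name not in protocol["adolescent_only"]]
    return auto, remaining

def validate(responses, remaining, protocol=PROTOCOL):
    """Reject skipped inputs and values outside the limited list of choices."""
    errors = []
    for name in remaining:
        value = responses.get(name)
        if value is None:
            errors.append(f"{name}: input may not be skipped")
        elif value not in protocol["fields"][name]:
            errors.append(f"{name}: not an allowed choice")
    return errors

auto, remaining = prepare_form(date(2010, 6, 15), date(2025, 6, 14))
errors = validate({"smoking_status": "sometimes"}, remaining)
```

Here the investigator never calculates the age by hand: the form derives it from the birth date and reveals only the inputs that the protocol still requires.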
h. Project Management
A CDCDS must maintain the integrity of all project components. Accordingly, mechanisms are required to: lock portions of the components to regulate multi-user access; control revision access; create and manage parallel development to the same project; and prompt users who access the project to follow the logical requirements of a data capture project. In addition, changes from multiple users on the same project need to merge. A permanent identification of specific versions of constituent projects as contributing to a particular state of the project is required, and access to the database according to graduated security levels needs to be regulated.
The present invention comprises a computerized data capture and distribution system (“CDCDS”) that electronically organizes all the components of a data capture and distribution project for design, analysis, manipulation, simulation, visualization, integration, decomposition, storage, retrieval and reporting. The present invention is highly suited for any environment where data is captured and distributed from or between remote locations or by mobile workers and reports are generated from the data. This invention would be useful in projects such as clinical trials and pharmaceutical “detailing”; sales management; auditing sites, records, or inventory; conducting surveys; enrollment; and inputting and surveying data in medical records.
To address the requirements discussed above, the preferred embodiment of the present invention includes an object-oriented schema implementation programming language, a compiler, a linker, a run-time system, an object-oriented data transport schema, and a project database with data tables in specific relationships. The programming language is based on C++ (although Java and XML objects can be used) and is employed to write schema programs that are compiled using the compiler. The output of the compiler is an object file. The linker combines a group of object files into an executable program that is portable across many platforms, where each platform has a run-time environment tailored to that platform. The run-time environment contains only the absolute necessities to execute the application on that platform. Each program may also be a shared library. If so, the program exports references to its classes, functions, and variables. Other programs can have references to these classes, functions, and variables resolved at run-time. A program may both import and export references. That is, the program may use other shared libraries and may serve as a shared library for other clients. The object-oriented data transport schema is based on the C++ programming language (although Java, XML or other object code can be used). The transport schema implements the various objects that are integrated into a data capture project, their functions, and how these objects are to function under specific transport circumstances. Specific relationships between tables in the project database are employed to allow visualization of data redundancy, overlap and errors. In addition, these table relationships are employed to visualize the shared commonality between items in any tables.
The present invention includes schemas for computerized creation, management, distribution and reporting of a portable data capture and data/report distribution project. The present invention also includes schemas for forwarding project subsets to workers in the field. During said transport, the versions of the project that are created or changed by users in the field are reconciled with the project database. In addition, specific data table relationships allow visualization of data that is entered and accessed for the creation, organization, tracking, management, and reporting of the logistics of a data capture and distribution project, along with any additional components.
The schema programs, the transport programs, and the specific data table relationships create, manage, distribute and report project components for a predetermined domain. Such domains include pharmaceutical, healthcare, insurance and other industries. The schema programs define multiple classes. Each class defines a data type that can be placed in a CDCDS project, and defines how an object of that type will interact with or affect other objects of the project. This includes specifying the data variables and the program code used to manipulate the variables. Objects or instances are created from each class as each object is placed in the project, marked for a specific use in the project, and marked for a specific type of reconciliation during data transport.
Objects are stored in one or more repositories or “stores.” Related stores are grouped into a “data capture and distribution” project which relates to a real-world project in healthcare or a real-world project in other industries. The CDCDS manages and stores any or all projects in a project database, on a networked server, with dial-up access so that multiple users both in the office at a desk and out in the field with mobile hardware can be given concurrent access.
First, all objects, their functions and how they will relate in a data capture and distribution project are added to the project database. The project database lists all the objects that are currently programmed in schemas, and which can be integrated as a project. The user then starts a session for the purpose of creating a data capture and data distribution project. The user will choose objects to integrate into a project and mark those objects with their functions and how these objects will relate to other objects. In addition, the user will mark how the functions of objects will relate to the functions of other objects. The following explanation describes how a user without programming skills, but who understands the nature of data capture and distribution projects in healthcare, creates a “project.” The user accesses a set of forms, queries and macros written in the Visual Basic language in the project database. These forms and queries restrict data entry in tables to a specific order and to certain parameters so that the project created in this manner meets the requirements of a real-world data capture and distribution project. Since a real-world project includes the management of users and differing levels of user access to the project, these forms, queries and macros help the user who is creating the project to set up the access requirements. The initial user, in addition to setting up access permissions, designates a second set of users (the field workers) to receive a forwarded project database. This allows the user who creates the CDCDS to mark objects for integration into a project and then to create entirely new projects by simply changing the relatedness of the objects. This eliminates the need to build entirely new projects from scratch.
Second, the users in the field begin a user session by executing a query of the project database to extract the subset of the project (for example, a number of related objects marked for their use) from the project database into a local database. The format of objects in the project database and in the local database is often different, so translation is necessary. This extraction is a long-term transaction against the project database. The user will have no further interaction with the project database during the user session. Changes or additions can be made to the project objects during an editing session by the first user, as well as by the field workers. These changes and additions can be posted to the project database at the end of a user session. Conflicts are reconciled by the transport schema, which has been given designated functions for this purpose; the reconciliation is carried out when users communicate with the project database at the end of a session.
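The extract, edit and post cycle described above can be sketched with a version number per object and a field-by-field merge on conflict. This is a simplified illustration under stated assumptions: the translation between database formats is omitted, and the class name, the version counter, the `merge_fields` rule (a locally changed field overrides the stored one), and the sample form object are all hypothetical rather than the behavior of the described transport schema.

```python
import copy

def merge_fields(base, theirs, mine):
    """Hypothetical reconciliation rule: keep stored values, apply locally changed fields."""
    merged = dict(theirs)
    for key, value in mine.items():
        if value != base.get(key):
            merged[key] = value
    return merged

class ProjectDatabase:
    def __init__(self):
        self.objects = {}  # object id -> {"version": int, "data": dict}

    def add(self, obj_id, data):
        self.objects[obj_id] = {"version": 1, "data": dict(data)}

    def extract(self, obj_ids):
        """Copy a subset of the project into a local database for an offline session."""
        local = {}
        for obj_id in obj_ids:
            stored = self.objects[obj_id]
            local[obj_id] = {
                "version": stored["version"],
                "base": copy.deepcopy(stored["data"]),  # snapshot at extraction time
                "data": copy.deepcopy(stored["data"]),  # working copy edited in the field
            }
        return local

    def post(self, local, resolve=merge_fields):
        """Post local edits at the end of a session and return the ids that conflicted."""
        conflicts = []
        for obj_id, mine in local.items():
            if mine["data"] == mine["base"]:
                continue  # nothing was edited locally
            stored = self.objects[obj_id]
            if mine["version"] != stored["version"]:
                conflicts.append(obj_id)
                data = resolve(mine["base"], stored["data"], mine["data"])
            else:
                data = copy.deepcopy(mine["data"])
            self.objects[obj_id] = {"version": stored["version"] + 1, "data": data}
        return conflicts

db = ProjectDatabase()
db.add("form-1", {"question": "Chart complete?", "choices": "yes/no"})

designer = db.extract(["form-1"])
field_worker = db.extract(["form-1"])

designer["form-1"]["data"]["question"] = "Is the chart complete?"
field_worker["form-1"]["data"]["choices"] = "yes/no/na"

designer_conflicts = db.post(designer)
field_conflicts = db.post(field_worker)
```

Neither user touches the project database between extraction and posting, which mirrors the long-term transaction described in the text.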
Objects in a project database are defined and interpreted by the combination of instance data and class methods. Therefore, instance data cannot be interpreted without the related schema that corresponds to it. To maintain integrity of the project data, it must never be possible to encounter any instance without the corresponding schema. Due to this constraint, the CDCDS treats the programs that comprise a schema as a component of the project database, as with the instance data and the project components. In this manner, whenever an instance of a class is created in a project database, the schema of that created instance is also copied into the database. Thus, whenever instances of the class are extracted in future sessions, the schema is loaded into memory from the project database. The architecture is modular so that new data types can be easily added by modifying the field type object. Because the CDCDS is object-based, information can also be shared with other object-based programs by publishing appropriate interfaces. These facts are important since many programs across an organization refer to data captured and distributed during a real-world project. In addition, specifically arranging data in tables in the project database visually reveals relationships between items in the database that ordinarily seem unrelated. This visualization allows the users to instantly see the components of a project in their relationship to other components. A user can see information that helps them make decisions about the project management. These specific relationships between data in tables can be used by any database format.
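The constraint that an instance is never encountered without its schema can be sketched as a store that copies the schema in whenever an instance is created. The `Store` class, the dictionary layout of a schema, and the `FacilityReview` example are hypothetical and serve only to illustrate the idea.

```python
class Store:
    """Keeps each schema alongside the instances that depend on it."""

    def __init__(self):
        self.schemas = {}    # class name -> schema definition
        self.instances = []  # list of (class name, instance data)

    def create(self, schema, data):
        """Store an instance and, on first use of its class, a copy of the schema."""
        unknown = set(data) - set(schema["fields"])
        if unknown:
            raise ValueError(f"fields not in schema: {sorted(unknown)}")
        self.schemas.setdefault(schema["name"], dict(schema))
        self.instances.append((schema["name"], dict(data)))

    def extract(self, class_name):
        """Return the schema together with its instances so they can be interpreted."""
        schema = self.schemas[class_name]
        rows = [data for name, data in self.instances if name == class_name]
        return schema, rows

# Hypothetical schema for a facility review object.
REVIEW = {"name": "FacilityReview", "fields": {"site": "text", "score": "number"}}

store = Store()
store.create(REVIEW, {"site": "123 Main Street", "score": 4})
schema, rows = store.extract("FacilityReview")
```

Since `create` is the only way to add an instance in this sketch, every extracted instance arrives with the schema needed to interpret it.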