1. Field of the Invention
The present invention generally relates to computer systems and databases. More particularly, the present invention relates to a system and method for the gathering and analysis of health-care related data, and specifically the gathering and analysis of information regarding the use of pharmaceuticals by individuals. The present invention also relates to techniques for de-identifying the individuals from such pharmaceutical data, in order to maintain privacy.
2. Description of the Related Art
In the medical information field, pharmaceutical claims are processed on large computer systems which receive claims data for patients who have been prescribed one or more medications and have filed claims with insurance companies (or government entities) in order to have the claim paid by the company or entity. The claims data includes very specific details and attributes about the individuals making the claims. For example, attributes can include name, gender, birth date, address, medical diagnosis, specific drug prescribed, and other drugs the patient is using. Consequently, this data is very useful in assisting marketing research relative to usage of a specific drug and identifying various attributes that impact the usage.
The claims data is typically received at a data "clearinghouse" which can be a database for a specific insurance company or a larger database providing the claim processing service for many insurance companies. Moreover, the claims data that are produced by claimants include a significant amount of data, with millions of new claims being entered into the system each month. Several of the claims data clearinghouses have systems handling many terabytes of claims data. Because of the large size of the data being produced and the large number of attributes, the data is in an inadequate format for efficient search, retrieval and analysis of specific attributes.
Recently, there have been laws passed that prevent the transmission of personal information associated with individuals, within health care claims data. This legislation particularly prohibits the transfer of specific personal data such as names, addresses and social security numbers. Thus, the claims data is no longer allowed to be transmitted from the clearinghouse to others in raw form with the personal data. Without the personal information to segregate the claims data, it becomes much harder to generate valuable research and market data based upon the unique attributes for specific individuals, such as age, gender and geographic distribution.
It is therefore desirable to provide the ability to efficiently gather information from the claims databases to allow research and analysis of the attributes that affect the pharmaceutical industry. Accordingly, the present invention is primarily directed to systems and methods for overcoming the problems discussed above, as well as related limitations of the prior art.
In one embodiment, the present invention is directed to a system and method for creating a unique alias associated with an individual identified in a health care database, that allows the aggregation of segregated data for marketing research. The system may include a first data store for storing at least one record where each record has a plurality of identification fields, such as name and birth date, which when concatenated uniquely identify an individual, and at least one health care field corresponding to health care data associated with the individual, such as a medication type. The system may also have a second data store and a processor that selects a record of the first data store, selects a subset of the plurality of identification fields within the selected record, concatenates the selected subset of identification fields, and stores the concatenated identification fields in a record in the second data store along with at least one health care field from the selected record of the first data store. The first data store and the second data store can either be located within the same database or in separate databases.
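The alias-creation steps described above can be illustrated with a minimal sketch. It is not the claimed implementation: the record layouts, the names ClaimRecord, AliasedRecord, make_alias and build_second_store, the "|" separator, and the upper-casing of field values are all assumptions introduced for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ClaimRecord:
    """Hypothetical record of the first data store (layout is assumed)."""
    last_name: str
    birth_date: str  # ISO date string, e.g. "1960-02-14"
    gender: str
    address: str
    medication: str  # the health care field carried to the second store


@dataclass(frozen=True)
class AliasedRecord:
    """Hypothetical record of the second data store."""
    alias: str
    medication: str


def make_alias(record, fields=("last_name", "birth_date", "gender"), sep="|"):
    """Concatenate the selected subset of identification fields.

    The separator and the upper-casing are assumptions; the text only
    states that the selected fields are concatenated.
    """
    return sep.join(str(getattr(record, name)).strip().upper() for name in fields)


def build_second_store(first_store):
    """Store each alias together with one health care field; other fields are dropped."""
    return [
        AliasedRecord(alias=make_alias(rec), medication=rec.medication)
        for rec in first_store
    ]


if __name__ == "__main__":
    first_store = [
        ClaimRecord("Smith", "1960-02-14", "F", "1 Main St", "DrugA"),
        ClaimRecord("Jones", "1975-11-02", "M", "9 Oak Ave", "DrugB"),
    ]
    for row in build_second_store(first_store):
        print(row)
```

In this sketch the address never reaches the second store, while two claims by the same person produce the same alias and can therefore be grouped later. Whether the concatenated value is transformed further before storage is not stated in this section, so the sketch leaves it as plain text.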
The health care data stored within the first data store may, in one embodiment, correspond to pharmaceutical claims data. The selected subset may correspond to a specific person in the healthcare database, and the person's last name, birthday, and gender are concatenated to form a unique identifier for that record. The processor may analyze longitudinal and historical records of individuals using individual-level linking methodologies based on the concatenated identification fields and the at least one health care field of each record of the second data store. The health care data also can have personal data removed from the various records such that only medically significant information remains, and the identifier allows the medical information to be segregated such that the individual records are still identifiable.
In order to more efficiently process the tremendous amount of data of the health care records, the processor may perform the further steps of selectively gathering the records from the first data store and selectively manipulating the records into a data cube. The records of the first data store are typically in tabular form, and the process of manipulating the records comprises selectively joining and projecting records from the various tabular records in the first data store to ultimately form a data cube comprised of a table of records. The data cube format allows the processor to more easily perform a search of the health care records, and also generate a report by displaying the records of a specific data cube.
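The joining and projecting of tabular records into a data cube can be sketched as follows. This is an illustrative assumption rather than the disclosed design: the table layouts, the column names (alias, region, drug, month), the function names join_and_project and build_cube, and the choice to summarize the resulting table as counts per combination of dimension values are all hypothetical.

```python
from collections import Counter


def join_and_project(claims, patients, dims):
    """Join claim rows to patient rows on the alias, then keep only `dims`.

    `claims` and `patients` are lists of dicts standing in for tabular
    records of the first data store (an assumed layout).
    """
    patient_by_alias = {p["alias"]: p for p in patients}
    rows = []
    for claim in claims:
        patient = patient_by_alias.get(claim["alias"])
        if patient is None:
            continue  # inner join: skip claims with no matching patient row
        merged = {**patient, **claim}
        rows.append(tuple(merged[d] for d in dims))  # projection
    return rows


def build_cube(claims, patients, dims):
    """Count records per combination of dimension values (one cube cell each)."""
    return Counter(join_and_project(claims, patients, dims))


if __name__ == "__main__":
    patients = [
        {"alias": "SMITH|1960-02-14|F", "gender": "F", "region": "NE"},
        {"alias": "JONES|1975-11-02|M", "gender": "M", "region": "SW"},
    ]
    claims = [
        {"alias": "SMITH|1960-02-14|F", "drug": "DrugA", "month": "2001-01"},
        {"alias": "SMITH|1960-02-14|F", "drug": "DrugA", "month": "2001-01"},
        {"alias": "JONES|1975-11-02|M", "drug": "DrugB", "month": "2001-01"},
    ]
    cube = build_cube(claims, patients, dims=("region", "drug", "month"))
    for cell, count in sorted(cube.items()):
        print(cell, count)
```

Because each cell is precomputed, a report for a given region, drug and month becomes a lookup rather than a scan of the raw claim tables, which is the efficiency the data cube format is intended to provide.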
The present invention thus provides a method for creating a unique alias associated with an individual identified in a health care database, wherein the health care database stores at least one record, and each record has a plurality of identification fields which when taken together uniquely identify an individual, and at least one health care field may correspond to health care data associated with the individual. The method includes the steps of selecting a record within the health care database, selecting a subset of the plurality of identification fields within the selected record, concatenating the selected subset of identification fields, and storing the concatenated identification fields in a record in a second database with the at least one health care field from the selected record of the first data store. The method preferably includes the step of analyzing longitudinal, historical records of individuals using individual-level linking methodologies based on the concatenated identification fields and the at least one health care field of each record of the second database.
The step of selecting a record within the health care database may comprise selecting a record from pharmaceutical claims data. Further, the step of concatenating the selected subset of identification fields may comprise, for example, concatenating, for a specific person in the healthcare database, that person's last name, birthday, and gender. Thus, based on the concatenated identification fields and the at least one health care field of each record of the second data store, the method may include the step of analyzing longitudinal, historical records of individuals using individual-level linking methodologies.
As discussed above, the method further may include the steps of selectively gathering the records from the first data store, and selectively manipulating the records into a data cube. The step of selecting a record within the health care database may comprise selecting records of the first data store that are in tabular form, and the step of selectively manipulating the records into a data cube may comprise selectively joining and projecting records from the first data store and creating a data cube comprising a table of records.
The data cube allows the present system to aggregate the records in an efficient format such that all new records can be viewed shortly after posting. Further, the unique population identifiers allow users to follow patients over time yielding important results unavailable in other databases, such as patient drug switching behavior. By linking medical and pharmacy transactions at the patient level, new insights such as indication specific use of drugs and patient comorbidities can be determined.
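Following patients over time by means of the unique identifier can be illustrated with a short sketch of switch detection. It is a simplified assumption, not the disclosed linking methodology: the (alias, fill date, drug) tuple layout, the function name find_switches, and the rule that any change of drug between consecutive fills counts as a switch are hypothetical. A practical analysis would likely restrict comparisons to drugs treating the same condition.

```python
from collections import defaultdict


def find_switches(records):
    """Return (alias, date, previous_drug, new_drug) for each change of drug.

    `records` is an iterable of (alias, fill_date, drug) tuples, where
    fill_date is an ISO date string so that string order is date order.
    """
    history = defaultdict(list)
    for alias, fill_date, drug in records:
        history[alias].append((fill_date, drug))

    switches = []
    for alias, fills in history.items():
        fills.sort()  # chronological order within one patient
        for (_, prev_drug), (date, drug) in zip(fills, fills[1:]):
            if drug != prev_drug:
                switches.append((alias, date, prev_drug, drug))
    return switches


if __name__ == "__main__":
    records = [
        ("A1", "2001-03-01", "DrugB"),
        ("A1", "2001-01-01", "DrugA"),
        ("A1", "2001-02-01", "DrugA"),
        ("B2", "2001-01-15", "DrugA"),
    ]
    for switch in find_switches(records):
        print(switch)
```

The sketch needs only the alias and the health care fields, so the longitudinal view is available even though names and addresses have been removed from the records.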
The report displayed by the system may contain several attributes, such as: market shares; geographic information at the national, regional, state and MSA levels; trends over time including annual, quarterly, monthly, and weekly periods; traditional measures such as total, new and refilled prescription counts; source of business such as new prescription starts, switches, and continuing patients; prescriber specialty; patient demographics for age and gender; indication specific use; and patient comorbidities. The system can therefore be used in a number of ways to help make business decisions, such as monitoring new drug launches and marketing campaigns, enhanced sales force targeting, and micro-marketing in select geographic areas or to select customers. Furthermore, the system can be used for forecasting and development of a pharmaceutical marketing strategy including indication-specific product positioning, early warning of market share shifts, clinical trial site selection, investigator recruiting, and accurate intelligence on market size and demand.