Field of the Invention
This invention relates to methods, systems, and apparatuses for data processing and management, and more particularly to a configurable framework for processing parallel analytics.
Summary of the Invention
The healthcare industry is under intense scrutiny to reduce overall costs and improve quality. Critical to these improvements is the automated evaluation of insurance coverage, conformity of services to best practices, opportunities for prevention, quality-of-care measurements, and managed care interventions, as well as the detection of fraudulent or incorrect claims. Unfortunately, the healthcare industry is faced with missing or incorrect clinical or demographic information as well as an enormous volume of data for patients, dependents, and the claims for their care.
A large health plan with 34 million members can typically have over two billion claims over a three-year period. Because of missing or incorrect information, this data must be reprocessed several times before it meets business requirements. Historical solutions have had difficulty completing several rounds of processing within a month. Reducing the processing time of this automation may (1) help reduce the accounts receivable backlogs of payers and providers, enabling them to be more financially secure, (2) more readily find incorrect or fraudulent claims in an environment of decreasing Medicare reimbursements, (3) enable earlier proactive managed care interventions to keep patients healthy, avoiding the much higher costs of an emergency room visit or costly complication later, and (4) increase the velocity of other business processes by providing quality results more quickly and more efficiently. Though the systems, methods, and apparatuses disclosed herein may be used to process healthcare claims and similar data, the disclosure is not so limited; any datasets, healthcare or otherwise, may be received and processed.
Prior art solutions have been developed to tackle this problem. For example, U.S. Pat. No. 7,650,331 to Dean et al., entitled “System and Method for Efficient Large-Scale Data Processing,” describes systems and methods based on a “map/reduce” programming model and an associated implementation for processing and generating large datasets. The Dean disclosure describes completely subdividing input data into several map tasks and subsequently assigning those tasks to various processes. Such initial subdivision of all work tasks before assignment to various processes can be both time consuming and resource intensive.
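By way of illustration only, the map/reduce pattern discussed above, in which all input data is subdivided into map tasks before any task is assigned to a process, may be sketched as follows. This sketch is not taken from the Dean disclosure and is not an implementation of the present invention; the function names, the (member identifier, amount) record layout, the task size, and the use of a thread pool are all hypothetical assumptions chosen for brevity.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def split_into_map_tasks(records, task_size):
    """Subdivide the entire input into map tasks before any work is assigned."""
    return [records[i:i + task_size] for i in range(0, len(records), task_size)]


def map_task(chunk):
    """Map step (hypothetical): count claims per member within one chunk."""
    return Counter(member_id for member_id, _amount in chunk)


def reduce_counts(partials):
    """Reduce step: merge the per-chunk counts into a single total."""
    total = Counter()
    for partial in partials:
        total.update(partial)
    return total


def claims_per_member(records, task_size=2, workers=2):
    # Every map task is created up front; only then are tasks handed to workers.
    tasks = split_into_map_tasks(records, task_size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(map_task, tasks))
    return reduce_counts(partials)


# Hypothetical sample data: (member identifier, claim amount) pairs.
claims = [("m1", 120.0), ("m2", 75.5), ("m1", 40.0), ("m3", 310.0), ("m1", 15.0)]
result = claims_per_member(claims)
```

In this sketch the call to split_into_map_tasks must finish over the whole input before the first worker receives a task, which is the up-front cost noted above when the input comprises billions of records.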