The performance of physicians in their daily clinical practice has become an area of intense public interest. Both patients and healthcare purchasers want more effective means of identifying good clinical care. In response, a variety of organizations, such as the National Committee for Quality Assurance (NCQA) and Bridges to Excellence®, have developed recognition and pay-for-performance programs that reward physicians, hospitals, medical groups, and other healthcare providers for meeting certain performance criteria for quality and efficiency. In diabetes care, for example, these types of programs reward participating physicians who are categorized as “top performers” either with a recognition award or with a fee paid to the physician for each diabetic patient covered by a participating health plan and/or employer. Physicians submit data for a sample of patients on specific performance measures, such as intermediate outcome measures (e.g., blood pressure levels), process measures (e.g., smoking cessation counseling), and patient experience measures (e.g., patient satisfaction). A physician's performance is calculated by awarding points for each measure achieved and summing the points to yield an overall score, which is then compared to specific levels of recognition and/or payment. While this score is easy to calculate, the process used to derive the performance criteria, specific point values, and overall performance benchmark by which physicians are assessed is not based on a rigorous methodology and may unintentionally misclassify physician performance.
To assess the performance of physicians, a set of evidence-based performance measures is used that pertains to the care of patients with a particular disease condition. For example, three intermediate outcome measures (hemoglobin A1c levels, lipid levels, and blood pressure at last visit) and four process measures (ophthalmologic examination, podiatry examination, nephropathy assessment, and smoking status or cessation advice/treatment) are used in many programs because they are supported by evidence-based guidelines established by the American Diabetes Association that describe ideal care for diabetic patients (American Diabetes Association, 2008). For each intermediate outcome measure, there is a minimally acceptable, evidence-based level of performance. For example, patients with low-density lipoprotein cholesterol (LDL) levels below 100 mg/dl are considered to have superior control of their LDL. The performance rate for each intermediate outcome measure is typically defined as the percent of a physician's patient panel that met the minimally acceptable level. For each process measure, the performance rate is defined as the percent of a physician's patient panel that received the test/exam or counseling. Some programs also provide an optional set of patient experience measures (collected from a patient survey) that may be included in the assessment.
A physician's performance is typically assessed by awarding a specific number of points (or scoring weight) for each measure if the measure's minimum performance criterion is met, but no specific methodology or system is used to determine how the points are derived. For example, physicians may receive 5 points if at least 80% of their panel of diabetes patients received a podiatry examination. If fewer than 80% of their patients received a podiatry examination, they are awarded 0 points. Some programs institute multiple benchmarks, or “tiers” of performance recognition (e.g., above average, very good, and exceptional performance), based on the total number of points earned, and these tiers are used to determine the amount of compensation paid to the participating physician.
These existing methods for assessing physicians' performance in practice do not use a rigorous process based on established measurement principles to determine (1) the minimum performance criteria for individual measures, (2) the importance of individual measures relative to one another (i.e., weighting or point value), and (3) an overall minimum performance standard or benchmark for managing a specific disease. The method used to compute overall performance scores, which awards physicians all of the points allocated to a measure if they satisfy the measure's minimum criterion and no points if they do not, also has limitations. First, the existing method awards the same number of points to physicians who just barely met the minimum performance criterion for an individual measure as to those who exceeded it by a significant amount. Second, when the minimum performance criteria are set very low, the distribution of total points that physicians earn is highly skewed, and distinguishing one physician from another is more difficult. Third, no measurement techniques are currently used consistently to assess whether the measurements obtained are meaningful in terms of their reliability and the decision consistency of the performance benchmark. High reliability is essential in determining whether the method of measurement is fair, consistent, and accurate enough to be credible to the public. Reliability is the proportion of a measurement that reflects true ability rather than measurement error, and it can be computed for an individual measure or for composite measures.
High decision consistency is also critical because it indicates how consistent the decisions made about physician performance at a particular performance standard are across many different samples of patients; the consistency of the standard (i.e., benchmark) should be high so that there are fewer false classifications (e.g., physicians who are incorrectly classified as providing good patient care).
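One simple way to make decision consistency concrete is to ask how often two independent patient samples drawn for the same physician would lead to the same pass/fail decision on a single measure. The sketch below is an assumption-laden illustration and not a method described in this document: it treats patients as independent, models the number of patients meeting the measure as binomial, and uses hypothetical function names and parameters.

```python
from math import comb


def pass_probability(true_rate: float, sample_size: int, min_met: int) -> float:
    """Probability that at least `min_met` of `sample_size` sampled patients meet
    the measure, assuming each patient meets it independently with `true_rate`."""
    return sum(
        comb(sample_size, k) * true_rate**k * (1.0 - true_rate) ** (sample_size - k)
        for k in range(min_met, sample_size + 1)
    )


def decision_consistency(true_rate: float, sample_size: int, min_met: int) -> float:
    """Probability that two independent samples yield the same pass/fail decision."""
    p = pass_probability(true_rate, sample_size, min_met)
    return p * p + (1.0 - p) * (1.0 - p)


# Hypothetical scenario: 25 sampled patients, criterion of at least 20 (80%) meeting
# the measure. Compare a physician whose true rate sits at the criterion with one
# whose true rate is well above it.
near_criterion = decision_consistency(0.80, sample_size=25, min_met=20)
well_above = decision_consistency(0.95, sample_size=25, min_met=20)
```

Under these assumptions, a physician whose true rate lies close to the criterion is classified far less consistently than one whose true rate is well above it, which is the kind of false classification a benchmark with low decision consistency produces.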
Therefore, what are needed are systems and methods that overcome challenges found in the present state of the art, some of which are described above.