Less Art, More Evidence – Challenging Physician Practice Variability

What some call “the art of medicine” I call an unacceptable level of physician practice variability. Electronic medical records now allow us to compare apples to apples and begin bringing over- and under-utilizers in line with the evidence.

In 1993, we developed a computer program that allowed all manner of clinical and administrative data to be entered and reported as part of ICD and CPT coding and hospital charge capture.

Through use of this program, we were able to measure all sorts of time-related metrics by provider and also their utilization of tests and drugs and their performance of procedures. Every chart was data-stripped and, as such, it wasn’t necessary to defend the sampling process; there wasn’t one.

What we found was that physician practice patterns varied markedly, despite all of the physicians being board-certified in emergency medicine. Showing the physicians their data compared to their colleagues was somewhat helpful in narrowing variation, but large variances persisted. And variances were not related to patient satisfaction, complaints or lawsuits.

Was this phenomenon unique to our physicians? Unfortunately, for decades, EDs did not have the wherewithal – or the desire – to measure physician behavior in this manner.

But now, with EMRs and CPOE, the secrets are out. ED managers can easily compare apples with apples across the spectrum of physician behavior – from door-to-provider times to the use of CT scans for headache patients and endless other questions relating to productivity and utilization.

What will be found when this is done? Exactly what we found at our hospital: huge variations in care. We found that certain doctors were “testers” – no surprise to many of us – but the magnitude of the variability was shocking. They ordered lots of all types of tests: imaging, lab, cultures. You name it. Others were much more frugal and, in the process, were more efficient. Turnaround times for discharged patients were substantially shorter for the frugal testers and substantially longer for the heavy testers.

We found substantial variations in admission rates after adjusting for shifts worked and other potential confounders. Clearly, some of the physicians were substantially more risk averse than others. We don’t really worry about the patients we admit. We are more likely to worry about the patients we send home.

As anticipated, the variability in behavior (whether for discharged or admitted patients) was reflected in their bills. Here is where the perversity comes into play. The more tests ordered (or patients admitted), the higher the bills and the more the hospital would make (assuming the patients were not capitated or case-rate payors). So, depending on who the payor was, heavy utilizers and heavy admitters either made money for the hospital or lost money for the hospital.

Well, “the times they are a-changin’.” With the rise of “Accountable Care Organizations,” the heavy testers and heavy admitters will be a millstone around the necks of the organizations for which they practice. It will also become essential for managers to scrutinize physician behavior and attempt to narrow the huge variability that now exists.

Up until now, the measuring of individual physician behavior was viewed as an intrusion into what we called “the art of medicine.” Opponents would likely protest that they were being asked to do “cookbook medicine” by following selected practice guidelines, bristling at the idea that “one size fits all.”

But the examples of medicine being practiced without any supporting evidence are so extensive that the cottage industry of medicine, in which 750,000 physicians in the U.S. do it “their way,” has to change.

I would urge you to watch “Money & Medicine” at PBS.org. It is a compelling story comparing expenditures at UCLA vs. Intermountain Health in Salt Lake. It shows just what can be done when physicians take responsibility for the care being provided and make an organized effort to provide evidence-based care.

And now for some examples.

Here’s a great and courageous paper (congratulations to Dr. Prevedello and colleagues) pointing out the variability in utilization of head CTs at Brigham and Women’s Hospital in Boston. First of all, 8.9% (about one in eleven) of all patients entering their ED got a head CT – unbelievable! The unadjusted rates of head CT scanning ranged from 4.4% to 16.9% (almost a four-fold difference) among the 38 physicians tracked. Even after adjusting for potential confounders, the variation was two-fold.

Check out the utilization of head CTs for headache patients: 15% to 62%. Remarkable! Even after adjustment, the rate varied nearly three-fold. And most discouraging was that the level of physician experience (which ranged up to 30 years) did not correlate with head CT usage.

What are the poor residents learning from the behavior of their faculty? Probably that being thorough is very important, not missing anything is very important, you don’t want to get sued, patient bills really don’t matter and it’s “better to be safe than sorry.”

Are any residents assessed regarding the quality of their test utilization? It would be impossible given the behavior of the faculty. I am 100% confident that similar findings would occur at any other “teaching” hospital. Being a “teaching hospital” appears to be an excuse for ordering pretty much whatever you want, and what do the residents learn? Emulate the risk-averse faculty.

VARIATION IN USE OF HEAD COMPUTED TOMOGRAPHY BY EMERGENCY PHYSICIANS
Prevedello, L.M., et al, Am J Med 125(4):356, April 2012

BACKGROUND: Medicare expenditures for high-cost imaging increased by an average of 17% per year from 2000 to 2006. Such imaging has been identified as one of the key drivers of increasing healthcare costs. Rates of head CT scanning have been noted to vary substantially between individual facilities, but the degree of variation between physicians within a single institution has not been explored.

METHODS: This study, from Brigham & Women’s Hospital in Boston, examined variability between 38 emergency physicians (29% female, post-residency practice 0-30 years) in rates of requests for head CT scanning in 2009.

RESULTS: Head CT scanning was performed in 8.9% of the ED visits overall (4,919 of 55,281 visits), and tended to be more frequent in males, older patients, those presenting with head trauma, and patients with more urgent presentations. Unadjusted rates of scanning varied between physicians from 4.4% to 16.9%. After adjustment for potential confounders, requests for head CT scanning varied between physicians by about two-fold (6.5-13.5%). Similar patterns were observed in the subgroup of patients with atraumatic headaches, in whom rates of head CT scanning ordered by physicians ranged from 15.2% to 61.7%. After adjustment for confounders, there was a nearly three-fold variation between physicians in head CT scanning in this patient subgroup (21.2-60.1%). Differences in rates of head CT scanning were not influenced by physician gender or level of experience.

CONCLUSIONS: This study demonstrates substantial variability between emergency physicians in a single institution in the use of head CT scanning, and the importance of identifying methods to decrease this variability and promote the appropriate use of imaging.
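The arithmetic behind these figures is easy to check. What follows is only a minimal sketch in Python that uses the numbers reported in the abstract above; the function names are my own illustrations, not anything from the paper, and "fold variation" is taken here simply as the highest physician rate divided by the lowest.

```python
def rate_pct(events, visits):
    """Percentage of visits in which the event (here, a head CT) occurred."""
    return 100.0 * events / visits


def fold_variation(low_pct, high_pct):
    """Highest physician rate divided by lowest physician rate."""
    return high_pct / low_pct


# Figures exactly as reported in the abstract (lowest %, highest %).
reported_ranges = {
    "all visits, unadjusted": (4.4, 16.9),
    "all visits, adjusted": (6.5, 13.5),
    "atraumatic headache, unadjusted": (15.2, 61.7),
    "atraumatic headache, adjusted": (21.2, 60.1),
}

overall = rate_pct(4919, 55281)
print(f"overall head CT rate: {overall:.1f}%")
for label, (low, high) in reported_ranges.items():
    print(f"{label}: {fold_variation(low, high):.1f}-fold")
```

Run as written, this reproduces the 8.9% overall rate and the roughly two-fold and three-fold adjusted variations quoted in the abstract.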

Here’s another paper describing the variability in two tertiary pediatric EDs in the same city. At one of the EDs, admission rates varied three-fold among the full-time providers, while at the other they varied eight-fold. Imaging rates and the use of IV fluids and antibiotics varied two- to three-fold depending on the ED. Bottom line – even in the narrow field of pediatric EM, variation in practice is distressingly large.

PHYSICIAN PRACTICE VARIATION IN THE PEDIATRIC EMERGENCY DEPARTMENT AND ITS IMPACT ON RESOURCE USE AND QUALITY OF CARE
Jain, S., et al, Ped Emerg Care 26(12):902, December 2010

BACKGROUND: Previous studies of variation in medical practice have generally compared practice patterns in different geographical regions and between physicians with different levels of training.

METHODS: This study, from Emory University in Atlanta, compared the care provided in three resource categories by attending physicians practicing in the EDs of two tertiary care children’s hospitals in the same city (one academic and one nonacademic). The study included 163,669 visits managed by 36 physicians in ED1 and 289,199 visits managed by 45 physicians in ED2 in 2003 through 2006.

RESULTS: The hospital admission rate was 13.3% in ED1 and 12.1% in ED2, but there was a nearly three-fold variation in admission rates between physicians practicing in ED1 and an eight-fold variation in ED2. For patients discharged from the ED, utilization of laboratory tests was relatively similar. Although overall imaging rates were similar (14.4% vs. 15.4%), there was a two-fold variation between physicians in ED2. The overall use of IV fluids and antibiotics for discharged patients was about 7% in both EDs but the interindividual rates varied two-fold in ED1 and three-fold in ED2. The mean length of stay was longer at ED2 by 10 minutes, and there was more variation between physicians in this facility than in ED1. Rates of return ED visits at 72 hours were similar in the two EDs. In ED1, there was a two- to three-fold difference between physicians in the minimum and maximum observed-to-expected ratios in the utilization of imaging, lab testing and IV therapy, and higher than expected utilization was associated with a longer than expected ED length of stay.

CONCLUSIONS: This study demonstrates substantial differences in resource utilization between ED physicians in similar practice settings.
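For readers unfamiliar with the "observed-to-expected ratio" mentioned in the results, here is one common way such a ratio can be built. This is a sketch under stated assumptions, not the method of Jain and colleagues, which the abstract does not describe: the visit records are invented, the acuity strata are hypothetical, and "expected" is taken to be what a physician would have ordered had he or she imaged each stratum at the group-wide rate.

```python
from collections import defaultdict

# Hypothetical visit records: (physician, acuity stratum, imaging ordered?)
# These are made-up numbers for illustration only.
visits = [
    ("A", "high", True), ("A", "high", True),
    ("A", "low", True), ("A", "low", False),
    ("B", "high", True), ("B", "high", False),
    ("B", "low", False), ("B", "low", False),
]


def observed_to_expected(records):
    """Return {physician: observed imaging count / expected imaging count}."""
    # Group-wide imaging rate within each acuity stratum.
    totals = defaultdict(int)
    imaged = defaultdict(int)
    for _, stratum, ordered in records:
        totals[stratum] += 1
        imaged[stratum] += int(ordered)
    stratum_rate = {s: imaged[s] / totals[s] for s in totals}

    # Observed count per physician, and the count expected if that
    # physician had imaged each stratum at the group-wide rate.
    observed = defaultdict(int)
    expected = defaultdict(float)
    for doc, stratum, ordered in records:
        observed[doc] += int(ordered)
        expected[doc] += stratum_rate[stratum]

    # Physicians with an expected count of zero are skipped to avoid
    # dividing by zero.
    return {d: observed[d] / expected[d] for d in observed if expected[d] > 0}


ratios = observed_to_expected(visits)
print(ratios)
```

A ratio above 1 marks a physician who orders more than the case mix would predict, and a ratio below 1 marks one who orders less. That is the "apples to apples" comparison: the heavy tester cannot claim to have simply seen sicker patients.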

So, it is time to bite the bullet. Physicians’ practices need to be accurately and fairly measured, and variations in care need to be defended or rectified. Studies show that the best way to narrow variability is to engage the physicians in developing or adopting the practice guidelines they are expected to follow. They don’t want to be handed a protocol in which they had no say. And start out by adopting practices that everyone can agree upon. Introduce new practices by providing authoritative sources for recommendations. Don’t create a system to find the “bad apples,” but rather move the group as a whole. Measure performance periodically and give authority-based feedback (there must be consequences for those who choose not to or cannot comply), and finally, consider rewards and incentives to help narrow variability.

(lead photo by Mark Chadwick)

2 Comments

OneFootOutTheDoor on November 28, 2012 8:50 am

I fully support the use of good criteria that help physicians make good decisions. The NEXUS study has been very helpful. The New Orleans and Canadian Head CT criteria are as well. The PERC rule, while not perfect, is also a good tool.

The problem is that medical practice brings with it a wide variability in patients and practice environments. A physician is not going to interact with an overweight patient on disability who has a track record of noncompliance, workmen’s compensation, and litigiousness in the same way that he will interact with a pillar of society who takes care of his health.

Similarly, a physician practicing in Pennsylvania or the hometown of Boston Legal cannot be expected to react – or order things – in the same way that someone in Texas would.

In all this is the matter of flow. In the ideal world of the ABEM oral boards we all see patients, perform physical exams, and place our orders. In the real world, we see rooms 1 and 2 while simultaneously ordering the chest x-ray, CBC and lytes on the patient with a productive cough in room 3, the UA and UPT on the female of childbearing age with abdominal pain, all the while watching more bodies pile up in the waiting room.

The problem with our health care system is not so much that it is expensive, but that it is responsibility-free and tort driven. One lawsuit tells us more about how to practice in the real world than the whole body of decision rules and medical school professors combined.

Now, we will have the worst of all worlds – the national health system of England and the legal system of America waiting to prey upon us when this system delivers substandard care.

Anyone care to go diving with sharks without a cage?

John B. Sullivan, Jr, MD on November 28, 2012 2:36 pm

This has all been tried in the past with failure. Art is far more important to patient care than evidence. The electronic medical record will fail to constrain costs because it produces no individuality of the person and will lead to more and successful malpractice litigation. Teach evidence and teach art. All the evidence in the world will not cure a patient who has no confidence in a physician who cannot relate or empathize.
