Go Big (Data) or Go Home


In the rush to measure and regulate every aspect of the ED experience, it is tempting to use benchmarking data to compare individual EDs. But variations in practice make these kinds of granular comparisons troublesome. Fortunately, the EDBA’s big data has your back.


The Emergency Department Benchmarking Alliance (EDBA) is a not-for-profit group whose mission is to support professionals who manage emergency departments. We facilitate networking and research and run webinars and conferences, but our main contribution is to collect comparative performance data from EDs nationwide. Currently more than 1,100 EDs contribute data, and the number is growing.

Since we are dealing with a large collection of EDs, all of different sizes and with varying levels of IT expertise, the information we collect needs to be easily obtained, operationally relevant, helpful, and above all, based on clear, common-sense data definitions so that the quality of the information is as high as possible.


We feel strongly about data quality. Every four years we sponsor a Performance Metrics and Benchmarking Consensus Conference, where we bring together thought leaders in EM and representatives of related organizations such as ACEP, AHRQ, ENA, AMA, AARP, and others. At our most recent conference earlier this year, one of the key outputs was a revision of data definitions for EM Process Metrics. The proceedings will be published, and will hopefully have an impact on CMS and the Joint Commission.

As we all know, CMS has mandated that all hospitals collect several operational metrics related to ED performance, and that these be publicly reported. These are, loosely speaking: door-to-doctor, decision-to-admit-to-left-ED, and LOS for discharged patients. Though this information is somewhat painful to report, the intent is a good one, and who among us would not agree that the sooner a patient is seen, or the less time they spend in the ED after admission, the better?

The issue, then, is not the data itself, but how it is used and interpreted. Because big data can be so powerful, it must not be used carelessly, and it must be used by people who understand its strengths and weaknesses. Major problems can arise when less informed decision makers accept the numbers as “truth”, without an appreciation of the limitations of the information.

The devil is in the details. The bigger the “cut” – the larger the dataset – the more likely our conclusions will be useful. At higher levels of magnification, however, especially at the individual ED level, comparisons become far more prone to error, and adjustment for these errors becomes more important.


Let’s think about door-to-doctor times for a moment. We have more than 4000 EDs in the country. Each one of them must report this metric. It consists of the difference between two time stamps: Arrival Time, and Provider Contact Time. Seems simple enough, right? But like most simple ideas, the closer you get to the real world, the more complicated and ambiguous things become.
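
As a minimal sketch of the arithmetic involved, the Python below computes door-to-doctor as the difference between the two time stamps and summarizes a handful of visits with a median. The time stamp format, the sample visits, and the choice of the median as the summary are illustrative assumptions, not the CMS specification.

```python
from datetime import datetime
from statistics import median

# Hypothetical visits: (arrival time, provider contact time).
# These time stamps are made up for illustration only.
VISITS = [
    ("2014-03-01 08:00", "2014-03-01 08:25"),
    ("2014-03-01 09:10", "2014-03-01 09:22"),
    ("2014-03-01 23:50", "2014-03-02 00:40"),  # crosses midnight
]

FMT = "%Y-%m-%d %H:%M"


def door_to_doctor_minutes(arrival: str, contact: str) -> float:
    """Minutes between arrival and first provider contact."""
    delta = datetime.strptime(contact, FMT) - datetime.strptime(arrival, FMT)
    return delta.total_seconds() / 60


def median_door_to_doctor(visits) -> float:
    """Median door-to-doctor interval, in minutes, across visits."""
    return median(door_to_doctor_minutes(a, c) for a, c in visits)


if __name__ == "__main__":
    print(median_door_to_doctor(VISITS))
```

The subtraction itself is trivial. The hard part is deciding which real-world events are allowed to produce each of the two time stamps.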

If we pick two hospitals at random (say, a rural ED with 20,000 visits that uses paper charts and a Level I trauma center with 110,000 visits that uses an enterprise EMR) and look at how door-to-doctor is measured, will we see the same thing? We know for sure that their operating characteristics are different and that their performance will likely differ, but how do we know the comparison is a fair test? That is, does the comparison have “face validity” as well as quantitative significance?

This is important in so many ways. It’s important for patients, who shouldn’t wait too long to be seen. It’s important for groups who want to keep their contracts. It’s important for hospitals that want to keep market share, to have a good reputation and please their board of directors. And eventually, most likely, there will be money attached to this and it will be important to the bottom line.

CMS provides regulatory guidance for how individual hospitals must measure these time stamps. Hospitals mostly report this on their own, but CMS will audit perhaps 5-10% to make sure people are in compliance, so there is some discipline to the system. But even within those regulations, substantial variability can exist.

Reporting can be electronic, via ADT/registration, the EMR, or another electronic tracking mechanism, in which case the sample is very large. Or it can be by chart review, based on humans writing on physical charts, in which case the sample size is much smaller, though hopefully randomized. Is this a potential confounder? Absolutely.
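
To put a rough number on the sample size concern, the sketch below uses the textbook standard error of a mean, SD / sqrt(n). The 30-minute standard deviation and the two sample sizes are assumed values chosen for illustration, not EDBA or CMS figures, and a reported median would behave similarly but not identically.

```python
import math


def standard_error(sd_minutes: float, n_visits: int) -> float:
    """Standard error of a mean interval estimated from n_visits charts."""
    if n_visits <= 0:
        raise ValueError("n_visits must be positive")
    return sd_minutes / math.sqrt(n_visits)


# Assumed spread of door-to-doctor times within one ED (illustrative).
SD_MINUTES = 30.0

chart_review_se = standard_error(SD_MINUTES, 100)    # manual chart sample
electronic_se = standard_error(SD_MINUTES, 10_000)   # full electronic capture

if __name__ == "__main__":
    print(f"chart review: +/- {chart_review_se:.1f} min")
    print(f"electronic:   +/- {electronic_se:.1f} min")
```

Under these assumptions the chart-review estimate is ten times noisier than the electronic one, so a difference of a few minutes between two hospitals may reflect nothing more than how each one sampled.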

Let’s go to another level of detail. How do different hospitals measure/define “Arrival Time”? Is a quick registration done by a clerk? Or by a nurse during the triage process? Do patients wait in line for some time before that registration, or do they sign themselves in either electronically or with paper and pen? Or is the time stamp created during full registration? Are these sources of potential error? Absolutely.

To go to yet another level, let’s look at the “Provider Contact” time stamp. Just what do we mean by “Provider”? And further, not to be too Clintonian, what do we mean by “Contact”? We all know the spirit of the metric would read something like this: how long is it before some sort of meaningful evaluation and interaction occurs between the patient and a physician or LIP who can do something to help?

But let’s try to define that idea (and regulate it, as CMS must do).

Is it a Medical Screening Exam? Is it a regular full workup? Or is it a quick walk-by, “Hi, are you sick?” kind of thing? And once this has occurred, how is it documented? By hand, on paper, whenever you remember to do it? By assigning yourself in the EMR, whenever you remember to do it? If the provider is strongly incentivized to optimize this metric, is there a temptation to sometimes “optimize” that data entry? In some systems, assignment is made automatically by a tracking system. Is that accurate all the time? Are all these potential confounders? Of course they are. Big ones.

So if all this variability exists within this fairly simple concept, does that render the metric useless? Absolutely not. Within a very large dataset, individual differences fade to randomness and lose power. Trends across time in “populations” of EDs are fascinating and important for us as a specialty, and for the design of health policy.

But if we’re comparing your hospital to the big dataset, or even trickier, to another individual hospital, suddenly the situation becomes much more complicated. Imagine there is a relatively unsophisticated decision maker coming to you and asking why your performance is different from the competitor across town.

Once your own or your group’s reputation is on the line, you want as fair a comparison as possible. You want to find organizations that are as similar to yours as possible. This means adjusting for confounders such as size, severity mix, and teaching versus non-teaching status. This is one of EDBA’s key functions; our job is to allow apples-to-apples comparisons as much as possible, and time and again this function has proven invaluable to our members.
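
One way to picture this kind of like-for-like comparison is to filter the pool down to a peer cohort before comparing anything. The sketch below is hypothetical: the `ED` record, the 25% volume band, the teaching flag, and all of the numbers are invented for illustration and do not describe EDBA's actual cohort definitions or data.

```python
from dataclasses import dataclass
from statistics import median


@dataclass(frozen=True)
class ED:
    name: str
    annual_visits: int
    teaching: bool
    door_to_doctor_min: float


# Invented example pool; not real benchmarking data.
POOL = [
    ED("A", 18_000, False, 22.0),
    ED("B", 21_000, False, 30.0),
    ED("C", 24_000, False, 26.0),
    ED("D", 95_000, True, 48.0),
    ED("E", 110_000, True, 55.0),
]


def peer_cohort(target: ED, pool, volume_tolerance: float = 0.25):
    """EDs with the same teaching status and similar annual volume."""
    low = target.annual_visits * (1 - volume_tolerance)
    high = target.annual_visits * (1 + volume_tolerance)
    return [
        ed for ed in pool
        if ed.name != target.name
        and ed.teaching == target.teaching
        and low <= ed.annual_visits <= high
    ]


def gap_to_peers(target: ED, pool) -> float:
    """Target's door-to-doctor minus the peer cohort median (minutes)."""
    peers = peer_cohort(target, pool)
    if not peers:
        raise ValueError("no comparable EDs in the pool")
    return target.door_to_doctor_min - median(
        ed.door_to_doctor_min for ed in peers
    )


if __name__ == "__main__":
    rural = ED("Rural", 20_000, False, 28.0)
    print(gap_to_peers(rural, POOL))
```

With these made-up numbers, the 20,000-visit ED is 2 minutes slower than its peer median, whereas a comparison against the median of the whole pool (30 minutes) would make it look 2 minutes faster.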

Has an individual, a group, or a system ever been injured by “apples-to-oranges” comparisons, or by some other misuse of data? No question about it.

Even in the face of these issues, quality and process metrics continue to proliferate, and they will be more and more important in defining our success as health care providers. But regulation as a methodology to improve quality is suspect. And once big money is attached to performance (e.g., pneumonia, STEMI), the metrics’ original benevolent purpose is obscured, and the process of improvement becomes dire and dogmatic. Despite little evidence that compliance actually improves outcomes, millions of dollars are wasted, unexpected side effects abound, and harm is even done.

To date, there is no financial reward attached to ED flow metrics, and the situation is still malleable. It’s our job, as representatives of our specialty, to recognize the strengths and weaknesses of publicly reported metrics, and to make sure that others understand the subtleties involved. If we can do this effectively, in groups and as individuals, we can hope to limit the spread of health care “regulopathy”, and the misuse of performance data, even further into the world of patient care.

Charles Reese, MD, is the president of the ED Benchmarking Alliance.
