LRG demonstrated a receiver operating characteristic (ROC) area under the curve of 0.97 (95% confidence interval 0.93 to 1.0). An ROC curve plots sensitivity against 1 minus specificity across all possible cut-offs; an area under the curve of 1.0 indicates perfect discrimination, while 0.5 is no better than chance. Therefore, an ROC area under the curve of 0.97 means that LRG was excellent at discriminating cases of appendicitis from non-cases.
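One way to build intuition for the area under the curve: it equals the probability that a randomly chosen case has a higher marker level than a randomly chosen control. The short Python sketch below computes it that way. The marker values are invented purely for illustration and are not data from the study, and the function name `roc_auc` is our own.

```python
def roc_auc(case_scores, control_scores):
    """AUC as the fraction of case/control pairs ranked correctly (ties count half)."""
    pairs = 0
    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            pairs += 1
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / pairs


# Hypothetical marker levels (assumed for illustration, NOT study data).
cases = [8.1, 7.4, 9.0, 6.2, 5.1]
controls = [1.2, 2.5, 5.6, 0.9, 3.3]

auc = roc_auc(cases, controls)
print(f"AUC = {auc:.2f}")
```

In this toy data set only one of the 25 case/control pairs is ranked the wrong way, so the area comes out close to 1.0.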
Shortly following the online publication of the article on June 23, 2009, several news outlets released stories about the study. Most notably, Time Magazine reported that, “…if these promising results can be replicated, it could mean a dramatic improvement in patient care and a reduction in medical costs.” So how should frontline emergency physicians react to this potential advance in the evaluation of appendicitis? The answer is that we should be excited, but not yet ecstatic.
Cause for Enthusiasm
There are many reasons for enthusiasm. If LRG is ultimately validated in subsequent studies of suspected pediatric appendicitis, it may have the potential to change clinical practice. When appendicitis is the only potentially serious diagnosis considered, a negative urine LRG test may mean no labs, no ultrasound or CT, and no call to a surgeon. A valid and reliable urine test for appendicitis may reduce patients' exposure to unnecessary testing such as CT scans, which in and of themselves may be associated with higher cancer risk. It may also reduce the rate of false positive appendectomy if a negative LRG test can reliably exclude disease in equivocal cases. Because delays in operative management are associated with a higher rate of perforation, it may also reduce the rupture rate, as a simple test may rapidly identify true positives.
Another benefit may be a reduction in ED visits for children with potential appendicitis, especially if the test can be adapted for use in pediatric office practice. It may even reduce ED length of stay for patients and the overall costs of caring for populations of children with suspected appendicitis, as many of the usual steps in taking care of these patients can be avoided.
Another source of enthusiasm is the potential to use the same biomarker discovery method to find new, non-invasive ways to improve diagnostic accuracy for a variety of diseases in emergency care. The discovery phase was a fishing expedition for candidate biomarkers, requiring samples from only six cases and six controls. Because so few cases were needed to find novel proteins that differentiate disease from non-disease, replicating this methodology in other disease entities, even rare ones, seems very feasible.
The authors' approach to the validation phase was methodologically sound. They tested a population of patients in whom the test would be applied in clinical practice: those with suspected appendicitis. For the outcome measure, all samples were assessed by a clinical pathologist who was blinded to the outcome. Tests in urine specimens in the prospective cohort were also blinded to the final outcome. They performed 6- to 8-week follow-up in children who did not have appendectomies to ensure that no children ultimately went on to require appendectomies. Importantly, they were actually able to contact 100% of the enrolled patient population. According to the criteria from the Centre for Health Evidence on diagnostic testing, they did everything right.
Words of Caution
It almost seems too good to be true. Based on these results, can we move forward and start requesting LRG tests on pediatric patients with suspected appendicitis? Before we break out the champagne, we must approach any potential change in the standard of emergency care with caution.
The authors listed several limitations. First, LRG has only been validated in a small cohort of children in one hospital. Further testing in larger cohorts in a variety of settings will be needed before this test can be used in clinical practice. The median duration of symptoms was two days. How this test will perform in cases of early or late appendicitis is unknown. In addition, because mass spectrometry automatically has an internal correction for urine concentration, it may be a challenge to develop immunoassays or dipstick tests for practical use that demonstrate the same test characteristics with urine of variable concentrations.
Finally, with any test, it is important to determine appropriate cut-offs for what constitutes a positive and negative test. If you dust off your epidemiology book from medical school, you will remember that for “rule-out” tests, maximizing sensitivity (true positives/[true positives + false negatives]) is desired (SN-OUT). By comparison, for “rule-in” tests, it is important to maximize specificity (true negatives/[true negatives + false positives]) (SP-IN). Since LRG would most likely be used as a “rule-out” test, a cut-off would likely be chosen where sensitivity is maximized.
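For readers who want to check the arithmetic, here is a minimal Python sketch using the standard textbook definitions: sensitivity is true positives divided by all patients with disease, and specificity is true negatives divided by all patients without disease. The counts are hypothetical, chosen only to make the numbers easy to follow, and are not taken from the study.

```python
def sensitivity(true_pos, false_neg):
    """Proportion of patients WITH disease who test positive."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg, false_pos):
    """Proportion of patients WITHOUT disease who test negative."""
    return true_neg / (true_neg + false_pos)


# Hypothetical 2x2 table (assumed for illustration, NOT study data):
# 20 children with appendicitis, 50 without.
tp, fn = 20, 0    # every case tests positive
tn, fp = 35, 15   # 15 of 50 non-cases test positive

sens = sensitivity(tp, fn)
spec = specificity(tn, fp)
print(f"sensitivity = {sens:.0%}, specificity = {spec:.0%}")
```

A cut-off chosen to miss no cases drives sensitivity to 100%, but the price is paid in the bottom row of the table, where false positives accumulate among children without disease.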
The authors did not directly report sensitivity in their paper, appropriately so because they did not propose a specific cut-off for a positive test. However, it can be determined from the shape of their ROC curve that a sensitivity of 100% was reached at a specificity of 70%. At that particular cut-off, 3 in 10 children without appendicitis had false positive test results. A specificity of 70% could potentially lead to additional testing that may not have been needed, especially if the test is used liberally in a population with low disease prevalence. A similar phenomenon can be seen in D-Dimer testing for pulmonary embolism (PE), which has similar test characteristics (very high sensitivity, low specificity) and has bred the liberal use of PE-protocol chest CT in low-risk populations with positive D-Dimers.
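The effect of low prevalence can be made concrete with Bayes' rule. The Python sketch below holds sensitivity at 100% and specificity at 70%, the values read from the ROC curve, and varies the prevalence. The two prevalence figures (25% and 5%) are assumptions chosen for illustration, not estimates from the study.

```python
def positive_predictive_value(sens, spec, prevalence):
    """Probability of disease given a positive test (Bayes' rule)."""
    true_pos = sens * prevalence
    false_pos = (1 - spec) * (1 - prevalence)
    return true_pos / (true_pos + false_pos)


def negative_predictive_value(sens, spec, prevalence):
    """Probability of no disease given a negative test (Bayes' rule)."""
    true_neg = spec * (1 - prevalence)
    false_neg = (1 - sens) * prevalence
    return true_neg / (true_neg + false_neg)


SENS, SPEC = 1.0, 0.70  # cut-off values read from the ROC curve

# Prevalence values are assumed for illustration only.
ppv_selected = positive_predictive_value(SENS, SPEC, 0.25)
ppv_liberal = positive_predictive_value(SENS, SPEC, 0.05)
npv_liberal = negative_predictive_value(SENS, SPEC, 0.05)

print(f"PPV at 25% prevalence: {ppv_selected:.0%}")
print(f"PPV at 5% prevalence:  {ppv_liberal:.0%}")
print(f"NPV at 5% prevalence:  {npv_liberal:.0%}")
```

Under these assumptions, roughly half of positive tests reflect true appendicitis when one in four tested children has the disease, but only about one in seven does when the test is applied to a group where one in twenty has it. A negative result remains reassuring in both settings because sensitivity is 100%.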
In particular, they found that LRG is also present in the urine of children with pyelonephritis. This may make differentiating appendicitis from pyelonephritis even more of a challenge, especially in young febrile females who have trouble communicating their symptoms and who have both a positive LRG and a positive leukocyte esterase or nitrite.
But regardless of the potential limitations of this new test for pediatric appendicitis, we should view the proteomic discovery approach for non-invasive tests as an important advance in the science of emergency care. It has the potential to provide us with highly sensitive tests that may obviate advanced radiography in large populations of ED patients, especially if highly sensitive, highly specific tests can be developed.
For LRG, the authors estimated that a commercially available test is up to three years away, and studies are planned in larger cohorts of both pediatric and adult patients. So ultimately, how LRG will change ED practice for suspected appendicitis remains to be seen. But despite potential limitations, emergency physicians should be cautiously optimistic that new biomarkers such as LRG may revolutionize the practice of emergency care.
Jesse M. Pines, MD, MBA, MSCE is a practicing emergency physician and an Assistant Professor of Emergency Medicine and Epidemiology at the University of Pennsylvania. He is an author of a recent book, “Evidence-Based Emergency Care: Diagnostic Testing and Clinical Decision Rules.”