Quality Talk
As you read this month’s Quality Talk, you may get the sense that Timothy Hofer, MD, MSc, has been reading your mind. A research investigator at the Veterans Affairs (VA) health care system in Ann Arbor, MI, Hofer argues that many of the statistics health care organizations are forced to compile have dubious value. Much of his work takes place at the policy level with regulatory groups and payers.
If taken to heart, Hofer’s work could lead to less reporting by health care providers and, in the future, enable them to shift resources toward designing fewer and better measures of patient care practices. Meanwhile, his insights below can help you simplify the data you collect in your QI initiatives and improve their relevance throughout the coming year.
Q. Some of your research challenges methods of quality measurement that many health care organizations rely on. Are there better ways to evaluate inpatient care or outpatient services?
A. If performance measurement is clumsily designed, it can drive providers to do the wrong thing. A nice case study of how you fall into this trap is diabetic screening for retinopathy. It shows how careful you need to be about structuring performance measurements.
For instance, with diabetics, almost the easiest way for providers to improve their performance is to get the most compliant patients to have eye exams every 10 to 11 months instead of every 13 to 15 months.
That is the group whose blood sugar and blood pressure are well controlled and who may have a slightly too long interval between exams. This strategy will dramatically improve your rate of annual eye exams but have almost no effect on outcomes, because people with good blood pressure control have extremely low rates of developing retinopathy. Under that strategy, providers have almost no incentive to go after the patients with terrible glucose control who have never made it to an eye exam. Yet that’s where the morbidity is. You want performance measures to guide providers in that direction.
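The arithmetic behind this incentive problem can be sketched with invented numbers. The patient counts and incidence rates below are hypothetical, chosen only to illustrate the pattern Hofer describes, not taken from his data:

```python
# Hypothetical illustration of the incentive problem described above.
# All numbers are invented for the sketch, not drawn from the interview.

low_risk = 1000   # well-controlled patients, exams every 13-15 months
high_risk = 200   # poorly controlled patients, never screened

# Assumed annual retinopathy incidence: very low when blood pressure and
# glucose are controlled, much higher when control is poor.
p_low, p_high = 0.002, 0.05

def exam_rate(low_in_window, high_in_window):
    """Share of all diabetics with an eye exam in the past 12 months."""
    return (low_in_window + high_in_window) / (low_risk + high_risk)

strategy_a = exam_rate(low_risk, 0)    # shorten intervals for the compliant
strategy_b = exam_rate(0, high_risk)   # reach the never-screened group

# Expected cases of retinopathy detected under each strategy
cases_a = low_risk * p_low    # 2.0 expected cases
cases_b = high_risk * p_high  # 10.0 expected cases

print(f"metric A={strategy_a:.2f} detects {cases_a:.0f} cases; "
      f"metric B={strategy_b:.2f} detects {cases_b:.0f} cases")
```

On these assumed numbers, strategy A lifts the reported annual-exam rate to 0.83 while detecting only 2 expected cases; strategy B lifts the rate to just 0.17 but detects 10. The measure rewards the strategy with the smaller health effect.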
Follow-up frequency needs to be increased
Even more ironic, we have some reason to believe that most blindness in diabetics occurs in patients who have already been diagnosed with retinopathy. These patients don’t need to be screened to make a diagnosis, but their frequency of follow-up is too low. Either they’re not getting to the ophthalmologist often enough, or we don’t really know the optimal follow-up frequency once retinopathy develops.
Paradoxically, efforts to dramatically ramp up screening of people without retinopathy could increase blindness because swamped ophthalmology clinics could be less able to follow those with retinopathy. I should stress, however, that we’re not challenging the benefits of annual eye exams, but rather arguing for a targeted strategy of achieving those benefits.
Q. If health care organizations really want to measure their impact on people’s health instead of settling for measures of volume, how can they do it?
A. There are several points to make here:
- You need to stick to measures with clearly accepted and, preferably, experimentally established causal links between structure or process and outcome. That’s going to limit the comprehensiveness and number of potential measures. But I’d argue that’s a good thing because it’s going to enhance your ability to achieve your goals.
- Not all people get the same level of benefit, even from interventions that meet the first criterion. Benefits differ dramatically across levels of risk. So performance monitoring really has to be structured so the incentives reward provider efforts that result in the biggest benefit.
- You need to establish where the variability is in order to decide who to profile — provider groups, clinics, or hospitals. If there’s no variability at the physician level, for example, and all the variability is elsewhere, then you don’t need to profile individual physicians.
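The last point can be sketched numerically. The fragment below is a minimal, hypothetical illustration using an ANOVA-style variance split, not the hierarchical models such analyses actually use; the HbA1c panels are invented:

```python
# A minimal sketch of the "where is the variability?" question, assuming
# a control measure (HbA1c) for patients grouped by physician.
# Data are invented; a real analysis would fit a hierarchical model.
from statistics import mean, pvariance

panels = {  # physician -> patients' HbA1c values (hypothetical)
    "dr_a": [7.1, 7.3, 6.9, 7.2],
    "dr_b": [7.0, 7.4, 7.1, 7.3],
    "dr_c": [7.2, 7.0, 7.3, 7.1],
}

between = pvariance([mean(p) for p in panels.values()])  # physician-level spread
within = mean(pvariance(p) for p in panels.values())     # patient-level spread

# If 'between' is tiny relative to 'within', profiling individual
# physicians on this measure mostly ranks noise.
icc = between / (between + within)
print(f"share of variance at the physician level: {icc:.1%}")
```

Here almost all of the spread sits within panels rather than between physicians, which is exactly the situation in which profiling individual physicians tells you little.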
Q. What are some examples of causal links between what providers do for a patient and resulting outcomes?
A. Let’s go back to diabetes again. There are very clear causal links between glycemic control and blindness. There’s a body of experimental evidence that actually establishes a causal link between the two. However, we still have the issue of targeting the benefit, because it’s often presented as a mean benefit. But, in fact, the absolute benefit often varies dramatically across different levels of risk in populations. So when you’re trying to change care practices, you want to put your largest resources into getting the service to people who will get the greatest benefit from it.
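The distinction between a mean benefit and a targeted absolute benefit is easy to show with arithmetic. The relative risk reduction and baseline risks below are assumptions for illustration only:

```python
# Hypothetical numbers illustrating mean vs. targeted benefit.
rrr = 0.30  # assumed relative risk reduction from the intervention

def arr(baseline_risk):
    """Absolute risk reduction for a patient at a given baseline risk."""
    return baseline_risk * rrr

low = arr(0.01)    # well-controlled patient:   0.003 events averted per patient
high = arr(0.20)   # poorly controlled patient: 0.060 events averted per patient

# Number needed to treat: patients treated per adverse event avoided
nnt_low, nnt_high = 1 / low, 1 / high
print(f"NNT low-risk={nnt_low:.0f}, NNT high-risk={nnt_high:.0f}")
```

The same intervention, with the same relative effect, averts twenty times as many events per patient in the high-risk group; a mean benefit averaged over both groups hides that gap.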
Q. Does anybody in the health care industry measure service delivery according to targeted benefit as you describe?
A. Nobody is, frankly. But I would say HEDIS (the Health Plan Employer Data and Information Set, published by the National Committee for Quality Assurance in Washington, DC) measures come closest. At least in their original form, the measures were limited to a few interventions in which cause and effect were clearly established.
Q. If the National Committee for Quality Assurance confined its measures to procedures with cause and effect links, wouldn’t it give us better data and lighten the reporting burden?
A. The market pressures for performance indicators have really forced them to overextend. But they’ve also started to reflect the targeting concept in their proposed measures of glycemic control for diabetes. They’re also considering some of the recommendations we made about eye exams. Nobody has really looked at identifying variations at the organization level. We’re trying to figure out that piece in our current studies.
Q. Are you saying that providers are caught between the dearth of valid outcome measures and the demand for performance measurement?
A. It’s my impression that, at this point, the big push to broaden profiling comes from health care payers who need to know that they’re not buying low cost, low quality care. In order for a competitive model of health care to work, they absolutely have to be able to do this.
They’re pushing the health care organizations and others in the business of profiling to provide models that are broader — and often more conjectural — than they should be. The focus on physician and provider profiling reflects an assumption by everybody that physicians are rugged individuals who have a large independent effect on what happens in health care.
Profiling is an effort to establish accountability at the "atomic" level where you can’t go any further. I think that’s what’s gotten us where we are right now.
Q. What classes of practice guidelines and quality measurements fail to measure what they’re supposed to measure?
A. That’s difficult to answer, in part because there are a lot of them out there. Among those that don’t really begin to measure what they’re supposed to measure are admission and mortality rates. They are still very commonly used, and they are completely and utterly meaningless in virtually all applications; a possible exception is bypass surgery.
Again, however, another problem in answering that question is that researchers and other data collectors do not always clearly express what a profile is supposed to measure. In our studies at the VA, we have principally focused on profiles of diabetes care. Take glycemic control as an example: Is poor control a measure of physician inaction? Is it a measure of disease severity? Or of a patient with more important priorities in their life?
Let’s suppose that a guideline is supposed to measure physician inaction. In our data, we found essentially no variation in physician performance, which suggests that physician inaction is not what’s causing poor control. The variation, and the solution, are likely to lie in interventions across groups of patients or physicians. What do pharmacy drug costs measure? Are they supposed to be high or low? It’s hard to tell.
Q. Let’s look at the other side of this coin: What guidelines and quality measures are useful?
A. To a certain extent, usefulness is defined by the eye of the beholder. The current measures do allow employers to say to their employees that they offer insurance packages that only include HMOs whose HEDIS measures lie in the top 50% of all HMOs. That may be very useful as long as their employees accept that the measure is meaningful to them.
Patients might want to know about waiting times and the length of time before the next available appointment. Those things can be measured and can be quite useful. Some patients may prefer to choose doctors by what a friend tells them or according to which hospital is closest to them. They don’t need profiling measures for those decisions.
I can only speak to measures that we as health professionals would find useful in our efforts to improve health care quality. Another example from the population of 22,000 diabetics that we study at the VA: Some measures would be based on clear evidence and would be designed to focus providers’ attention on those who are likely to gain the most.
For instance, you can argue that blood pressure control is as important, if not more important, than tight glycemic control in diabetics. If a patient has high blood pressure and is on only one medication, that suggests a relatively easy intervention. For the person who’s already on three or four medications, it’s not completely clear how to get better control.
Q. Any further comments on how to assess the validity of guidelines and outcome measures?
A. In cases where you have a body of experimental literature that establishes the links between intervention and outcome, and everybody accepts the relationship, it’s relatively easy. All you need to show is that you can modify the process of care, and that’s probably enough.
In cases where that link is murkier, you really have to build an argument. Does the evidence suggest that there are causal links and that your intervention is likely to have a big effect on the outcome? That’s where people often get sloppy.
You have to build an epidemiological argument. The argument will stand on theory, biological plausibilities, the strength of the cause-effect relationship, the accumulated studies supporting the relationship, and whatever experimental evidence you can find.
Suggested reading
1. Hofer TP, Hayward RA, Greenfield S, et al. The unreliability of individual physician “report cards” for assessing the costs and quality of care of a chronic disease. JAMA 1999; 281:2098-2105.
2. Vijan S, Hofer TP, Hayward RA. Cost-utility analysis of screening intervals for diabetic retinopathy in patients with type 2 diabetes mellitus. JAMA 2000; 283(7):889-896.
3. Thomas JW, Hofer TP. Accuracy of risk-adjusted mortality rate as a measure of hospital quality of care. Medical Care 1999; 37(1):83-92.
4. Hofer TP, Bernstein SJ, Hayward RA. Validating quality indicators for hospital care. Joint Commission Journal on Quality Improvement 1997; 23(9):455-467.