Quality Adjustment Using Average Utilities: Virtue or Vice?
J. Jaime Caro, McGill University, Montreal, Quebec, Canada and
Krista F. Huybrechts, Caro Research Institute, Concord, MA, USA
Based on the workshop presented at the ISPOR 9th Annual
International Meeting, May 16th-19th, 2004, Arlington, VA, USA.
Problem Statement
A major challenge in economic analyses has been to find a way to
aggregate the disparate health effects into a single measure
that can be used comparatively and can be combined with the
costs into a single criterion of economic efficiency. In order
for such a measure to be of use, the units of value need to be
1) comprehensive (i.e., applicable to all the health
consequences one wants to value); 2) universal (i.e., not
personally defined); and 3) constant (i.e., interval scale so
that equal intervals on the scale have an equivalent
interpretation). Although money is assumed to have all of the
desired properties and has been embraced as the measure to value
consequences in all other areas of economics (environmental
economics is a salient example), health economists have rejected
it, perhaps under misguided “ethical” considerations. Instead,
they have pursued utilities and their derivatives to value the
spectrum of consequences.
Utilities
A (von Neumann-Morgenstern) utility is a measure of the
attractiveness of a consequence of a decision being
contemplated, with the measurement made on a scale where 0
represents the worst and 1 the best possible consequence of that
decision. For example, a consequence with a utility of 0.7 is
considered 30% worse than the best consequence in the sense that
the decision-maker is indifferent between the consequence
assessed and a 70% chance of attaining the best result. Properly
measured utilities are constant (i.e., have interval scale
properties). They were defined, however, in terms of the
possible consequences of a given decision and for a particular
decision-maker – neither all possible consequences for many
decisions, nor group decisions, were contemplated [1].
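The indifference statement in the example above can be written out as an expected-utility identity. This is only a sketch, and the symbols are assumptions not found in the original: x is the consequence assessed, p is the probability at which the decision-maker is indifferent, and the anchors are u(best) = 1 and u(worst) = 0 for the decision at hand.

```latex
u(x) = p \cdot u(\text{best}) + (1 - p) \cdot u(\text{worst})
     = p \cdot 1 + (1 - p) \cdot 0
     = p
```

With p = 0.7 this gives u(x) = 0.7. The value is tied to the two anchors of this particular decision, which is why changing the set of consequences considered changes what 0 and 1 mean.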
Problems arise when trying to use
utilities for group decisions that must consider a variety of
consequences across multiple decisions. To try to retain the
properties of the utility scale, yet attain the desirable
“comprehensive” property as well, the proponents must define
“universally applicable” anchors that cover all possible
lifetime health paths. But this is an impossible task: what do
the worst and best possible lifetime health paths correspond to?
Are they truly immediate death and “optimal” or “perfect”
health? How is optimal health defined: normal good health for a
given age; free of all disease, symptoms and dysfunction; health
as good as you can imagine it? There is plenty of evidence that
death is not the worst consequence. Moreover, “worst” and “best”
are inherently subjective, making it necessary to specify whose
opinion is to define the anchors. Although these theoretical
failings already seem to doom utilities as the measure of value
(a fact recognized decades ago [1]), if our field insists on
continuing to misuse them for this purpose, then, at a minimum,
we must see to it that usable definitions of these anchors are
agreed upon, possibly with input from organizations such as the
WHO.
Even with universal anchors
agreed to by all, the problem of applying these for group
decisions arises. Under a strict utilitarian approach, the
“societal utility” (a concept that is not part of the original
theory) is obtained by aggregating the utilities from each of
its members. A straightforward mean utility, however, does not
necessarily maximize welfare or address fairly the needs of the
society and there is no consensus on how individual utilities
should be combined to form a social utility function. Even if it
were possible to agree upon a universal weighting scheme, once
you use such a weighting scheme the aggregate is no longer a
utility in the sense that it loses the properties that made the
concept attractive in the first place. The only sure way to
resolve this is to resort to dictatorship, with the autocrat
deciding on the utility that will apply for the society.
Although this works for the military and other such groups, it
seems objectionable for health care decisions in society. Is the
common practice of using some experts' opinions to value health
consequences really that different, however?
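The dependence of a "societal utility" on the aggregation rule can be illustrated with a small sketch. Everything in it is hypothetical: the five individual utilities, the two options, and the weighting scheme that triples the weight of the fifth member are invented for illustration and do not come from any study.

```python
from statistics import mean

# Hypothetical utilities (0 = worst, 1 = best) that five members of a
# society assign to two options. The numbers are invented.
option_a = [0.9, 0.9, 0.9, 0.9, 0.1]
option_b = [0.7, 0.7, 0.7, 0.7, 0.7]


def mean_utility(utilities):
    """Straightforward (equal-weight) mean of individual utilities."""
    return mean(utilities)


def weighted_utility(utilities, weights):
    """Weighted mean of individual utilities under an assumed scheme."""
    return sum(u * w for u, w in zip(utilities, weights)) / sum(weights)


# Assumed scheme: the fifth member counts three times as much as the others.
priority_weights = [1, 1, 1, 1, 3]

print("equal weights:   ", mean_utility(option_a), mean_utility(option_b))
print("priority weights:",
      weighted_utility(option_a, priority_weights),
      weighted_utility(option_b, priority_weights))
```

Under equal weights option A has the higher aggregate; under the assumed priority weights option B does. The ranking is therefore a product of the chosen scheme rather than of the individual utilities alone.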
Aside from the practical problems
encountered in measuring preferences for various lifetime paths,
it seems that, on theoretical grounds alone, utilities have to
be rejected as the measure of value - they are appropriate for
what they were designed for: individual decisions with a limited
set of consequences relevant to the alternatives contemplated.
Quality Adjusted Life Years
The most prominent contender for a solution to these problems
has been the construct of weighted average survival, which was
introduced in the 1960s and early 1970s. It conceptualizes the
health consequences as having only two dimensions: duration of
life and quality. The survival time is adjusted according to the
quality of that time under the assumption that time and quality
are completely exchangeable. Using quality weights attached to
each of the health states results in the quality adjusted life
year (QALY). It should be noted that, contrary to many
assertions, the technique used for quality assessment does not
have to be preference-based - it can be any relative scale of
quality. More important, perhaps, even if a preference-based
method is used (i.e., “utility” is used as a proxy for quality),
the QALY itself is not a utility and does not benefit from the
utility properties.
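The weighted-survival arithmetic described above can be sketched as follows. The function name `qalys` and the health paths are hypothetical; the quality weights are arbitrary numbers on a 0 to 1 scale and need not be preference-based.

```python
def qalys(health_path):
    """Sum of (years spent in a state) x (quality weight of that state).

    health_path is a list of (years, weight) pairs; the weights are
    assumed to lie on a 0-1 relative quality scale.
    """
    return sum(years * weight for years, weight in health_path)


# Hypothetical path: 5 years at weight 0.8, then 3 years at weight 0.5.
path = [(5.0, 0.8), (3.0, 0.5)]
print(qalys(path))

# Exchangeability of time and quality: 10 years at weight 0.5 counts
# the same as 5 years at weight 1.0.
long_poor = qalys([(10.0, 0.5)])
short_good = qalys([(5.0, 1.0)])
print(long_poor == short_good)
```

The second comparison makes the exchangeability assumption explicit: the measure cannot distinguish a long life of lower quality from a short life of higher quality when the products are equal.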
QALYs do not cover several
factors known to be determinants of societal value. For example,
there is no additional value given in the QALY to the severity
of the initial condition. According to the QALY model, value is
proportional to the number of people receiving the benefit and
to the increase they obtain in quality or duration of life and
not at all to the degree of need met, to fairness in
distribution of resources, and so on. These choices in
constructing the measure were made 25 years ago, not on the
basis of empirical studies showing the extent to which societies
value different aspects of health care, but rather based on the
judgment that value should be measured in terms of the amount of
health produced in the population [2]. This assumption of
“distributive neutrality” - epitomized in the use of QALY league
tables - is increasingly recognized as untenable [3]. As
expressed by McGregor, a QALY gained through correction of
erectile dysfunction in an otherwise healthy individual would
not be considered by most as equivalent to a QALY gained through
life-prolonging dialysis in an individual about to die from
renal failure [4]. To ignore this and other differences in the
societal value of the QALY could seriously mislead policy
decisions.
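The "distributive neutrality" of the QALY model can be made concrete with a minimal sketch. The two programs and their numbers are invented; the only thing taken from the text is that value is treated as proportional to the number of people and to the gain each obtains.

```python
def total_qalys_gained(people, gain_per_person):
    """QALY-model value: number of people x QALY gain per person."""
    return people * gain_per_person


# Hypothetical programs with very different recipients:
# a modest gain for many people in otherwise good health ...
mild_program = total_qalys_gained(people=20, gain_per_person=0.5)
# ... versus a large gain for fewer people in severe need.
severe_program = total_qalys_gained(people=10, gain_per_person=1.0)

print(mild_program, severe_program)
```

Both programs score the same total, because nothing in the calculation refers to the severity of the initial condition or to how the gains are distributed.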
This flaw in the QALY’s basic
structure is sufficient to argue against its use, yet there are
other major problems with the measure. In the relentless pursuit
of a comprehensive measure, the outcomes of interventions that
only affect ‘quality’ of life (e.g., pain relievers) are
reported in units of ‘length’ of life, namely life years
(quality-adjusted) - a misleading translocation. In addition,
the quality weights are measured by various techniques and it
has been shown that the results can vary widely according to the
method used. Even when the same investigators use the same
methods, the repeatability of quality estimates both within and
between studies can be very poor, casting doubt on the
reliability of the QALY - a major negative for a measure that
seeks to be universal and comprehensive. Indeed, quality
estimates vary greatly according to who is making the estimate
(e.g., patient, family, health professional, general public) and
there is no agreed-upon theoretical basis for selecting the
viewpoint to be used when making societal policy decisions [4].
Like utilities, the QALY lacks
the properties required for the measure of value.
Path Forward?
If we cannot use utilities or QALYs, how then should we value
health outcomes? We believe that we must re-examine the basis
for rejection of money as the unit of measurement. The
challenges that we tend to consider insurmountable - and
therefore sufficient justification for rejection of the approach
- such as putting a monetary value to lost life, have been
surmounted by other economic sub-disciplines where cost-benefit
analysis has been standard practice for decades. Secondly, it
seems to us that the Cost-Value method described and promoted by
Eric Nord offers an interesting alternative [2]. This approach
goes back to the idea of a barter economy, albeit in an
allegorical sense, and advocates direct outcome valuation in
terms of person trade-offs. That is, valuation is based on the
number of people obtaining one kind of outcome that would be
regarded as equivalent to a given number of people obtaining
another kind of outcome. As such, it establishes a person
trade-off on the value side that is comparable to the person
trade-off on the production side, directly incorporating
distributional concerns. In our view, probably the most
important contribution of this approach - which admittedly has
its own problems - lies in the fact that an explicit attempt is
made to offer a better, more theoretically sound alternative to
the utility/QALY approach, rather than just accepting the
current flawed approach as standard practice under the pretext
that despite its weaknesses it’s good enough!
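The person trade-off idea described above can be read, in its simplest form, as a ratio of group sizes judged equivalent. The sketch below is an assumption-laden illustration, not the Cost-Value method itself: the function name `relative_value` and the numbers are hypothetical.

```python
def relative_value(n_reference, n_equivalent):
    """Value of outcome B per person, relative to reference outcome A.

    Assumes respondents judge n_reference people obtaining outcome A
    to be equivalent to n_equivalent people obtaining outcome B.
    """
    if n_reference <= 0 or n_equivalent <= 0:
        raise ValueError("group sizes must be positive")
    return n_reference / n_equivalent


# Hypothetical judgment: helping 10 people obtain outcome A is regarded
# as equivalent to helping 50 people obtain outcome B.
value_b = relative_value(n_reference=10, n_equivalent=50)
print(value_b)
```

Because the judgment is made directly about numbers of people, concerns such as severity or fairness can enter through the stated equivalence rather than being excluded by the structure of the measure.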
As part of its strategic
goal to foster excellent and innovative methodology [5], we
believe ISPOR has a role in actively encouraging research in
both of these areas. In the meantime, it seems we will have to
return to reporting health outcomes in natural units appropriate
for the particular intervention being assessed, and leave it up
to the decision-makers to decide on their preferences for the
use of resources without the aid of a ‘universal’ cost-utility
ratio. Although this may be a more difficult task, the trade-off
they are making will at least be transparent, both to them and
to those who live with the consequences: society as a whole.
REFERENCES
1 Lindley DV. Making Decisions. London: John Wiley & Sons Ltd;
1985.
2 Nord E. Cost-Value Analysis in Health Care. Making Sense out
of QALYs. New York: Cambridge University Press; 1999.
3 Coast J. Is economic evaluation in touch with society’s health
values? BMJ 2004;329:1233-6.
4 McGregor M. Cost-utility analysis: Use QALYs only with great
caution. CMAJ 2003; 168:433-4.
5 Annemans L. Incoming Presidential Address. ISPOR Connections
2004;10:1-3.