Introduction (The past is prologue to the future)
This report has been developed to contribute to ongoing internal discussions at the Agency for Health Care Policy and Research (AHCPR) and for consideration by those involved in Outcomes and Effectiveness Research (OER) in three main areas:
- Developing a framework for understanding and communicating the impact of OER on health care practice and outcomes.
- Identifying specific examples of projects that illustrate the research impact framework.
- Deriving lessons and options from past efforts that may help develop strategies that will increase the measurable impact of future research sponsored by AHCPR.
The paper first provides some background on the development of the original effectiveness initiative in the late 1980s, then describes a framework for organizing the various impacts that OER has on practice and outcomes. Using this framework, and a series of case studies collected by staff from the Center for Outcomes and Effectiveness Research (COER), we list broad accomplishments associated with funded efforts by AHCPR. We then discuss some major lessons that have been learned over the past decade regarding OER, and finally offer a number of specific recommendations that AHCPR should consider in strategic planning. The primary question toward which this analysis is targeted is: "How can the OER program at AHCPR most effectively advance the field of health services research (HSR), contribute to public health, and address the expectations of policymakers and stakeholders?"
The establishment of AHCPR and the OER program in late 1989 stimulated the development of a series of new methods to relate the processes of care to the outcomes that people experience and care about. While relatively simple in concept, OER in fact represented a significant departure from traditional clinical research, in response to payers' and policymakers' concerns that widely demonstrated practice variations represented important sources of "cost without benefit." From the outset, however, the boundaries of OER have not been sharply defined. AHCPR's funding line in the budget labeled "MEDTEP" (Medical Treatment Effectiveness Program) supported both OER and clinical practice guidelines development. In fact, AHCPR's enabling legislation reflects a tension between an expectation that research conducted in typical practice settings would lead to sustained changes in practice, and an opposing belief that a specific intervention in the form of guidelines would be needed to facilitate improvements in practice. Responses to internal efforts to solicit customer input for future priorities (e.g., a Federal Register notice published in 1996) indicate that many people outside of AHCPR still consider clinical practice guidelines to be an intrinsic component of "OER." Indeed, AHCPR's annual reports to Congress on the status of the MEDTEP program focused almost exclusively on guidelines rather than research.
The focus of this report, however, is the research program. Just as OER represents a new approach to evaluating clinical practice, so too this analysis attempts to evaluate the value of research investments over the past decade. Evaluation of research investments has not been a systematic component of the research enterprise in the United States (OTA, 1994), so there are few precedents to inform the approach used here. However, it has always been clear that AHCPR's existence reflected a strong belief on the part of Congress and other stakeholders that "success" should be assessed in terms that move well beyond the traditional outputs of the research enterprise (i.e., publications). While one clear motivation for producing this report was a perception that OER specifically, and the work of AHCPR generally, is not always supported unequivocally by policymakers (particularly members of Congress), it is also our hope that this effort ("the outcomes of outcomes research") will be an important first step toward redefining the goals of OER, as well as an honest appraisal of prior successes and opportunities for improvement.
President John F. Kennedy, quoting the philosopher George Santayana, once remarked that those who could not learn history's
lessons were doomed to repeat them. The most important purpose of this analysis is to build on what has been done well
and to learn from what has been done less well in order to develop strategies to enhance the visible impact of sponsored
research in the future.
Background
History of the science
The expectations that surround outcomes research can best be understood in the context of events that contributed to the
establishment of AHCPR in 1989. One major milestone was the implementation of the prospective payment system (PPS)
for Medicare inpatient care in the mid-1980s. Soon after the system was in place, the public and policymakers became
concerned about Medicare patients being forced out of the hospital because "their DRG had run out." (A DRG, or Diagnosis Related Group, is the basis for a lump-sum payment made by the insurer to the hospital according to the patient's illness, rather than the number of days in the hospital or the type of care provided. The concern was that patients would be discharged after a fixed number of days rather than when they were clinically ready to leave.) At congressional hearings on the quality
of care under Medicare, the phrase "quicker but sicker" captured the central concern about the impact of the new financial
incentives. William Roper, who became HCFA Administrator in 1986, promoted the use of Medicare databases to monitor
the quality of hospital care through measurement of mortality rates, readmission rates, and other adverse outcomes.
In the decade prior to institution of PPS, John Wennberg and others had been developing a conceptual framework and methods for
exploring the impact of health care services on patient outcomes. In 1987, several meetings were convened by the Department of Health and Human Services (HHS) that included Roper, as well as Wennberg, David Eddy, Robert Brook, and others to explore whether Medicare databases
could be useful on a large scale for quality monitoring and improvement. Wennberg's work on geographic variations in medical practice
(McPherson et al., 1982), studies on appropriateness of care led by Brook (Leape et al., 1990; Chassin et al., 1987), and Eddy's analysis of
the poor quality of medical evidence (Eddy and Billings, 1988) set the stage for a major Federal initiative to improve the knowledge
base for medicine. Roper and others announced this effectiveness initiative in a New England Journal of Medicine article
in 1988 (Roper et al., 1988). The major responsibility for carrying this initiative forward found an institutional home when
AHCPR was established in 1989.
At congressional hearings in support of the potential value of OER, John Wennberg was a frequent witness. In one of his
appearances, he described his work comparing patterns of practice and outcomes in Boston and New Haven, showing that
the additional resources consumed in Boston were not associated with better outcomes when compared with the thriftier New
Haven practice patterns. Wrapping up his testimony, he advised members of Congress that "If 10 Bostons could become
New Havens, the savings to Medicare would amount to $500 million." By implication, for this to happen, the OER
community would have to conduct "the necessary scientific studies that allow physicians to define optimum treatments..."
(Wennberg, 1984). This framed the conceptual paradigm for (and expectations of) effectiveness research: Through the
retrospective study of patterns of care, optimal treatments would be defined, and substantial economic savings would be
achieved.
The effectiveness initiative itself represented an important hypothesis: Guidance for optimal medical practice could be
gleaned from analysis of data routinely gathered in the process of delivering and paying for patient care. AHCPR is in part
the institutional embodiment of that hypothesis, and the output of the past decade offers some empirical evidence with
which to assess its validity.
While some influential members of Congress were convinced of the value of establishing a major new program in OER,
much larger forces and stakeholders dominated the politics surrounding the establishment of AHCPR (Gray, 1992). With the proposal to establish the Agency virtually dead because of the requirement of the balanced
budget amendment, the American Medical Association (AMA) pushed strongly to keep the Agency. The AMA was negotiating physician payment reforms,
and probably supported AHCPR in part to rationalize its opposition to expenditure targets with automatic fee reductions.
Support of OER demonstrated the medical profession's commitment to reducing waste through scientific study and
evidence-based guidelines. There was not, however, a widespread and deep-seated belief among policymakers and
stakeholders that Federal support for OER (and guidelines) was essential.
These few historical observations highlight at least three important themes about the policy context of OER that continue
to be relevant. First, the effectiveness initiative was explicitly constructed around substituting analytic efforts for clinical
trials. Database analysis, systematic literature review, decision analysis, and guideline development were the
methodological staples of early OER. The extent to which questions about "what works" in health care could be answered
with these methods is gradually becoming clearer. Second, policymakers were explicitly told that large, measurable
savings would result from better studies of the effectiveness of health care. And third, the political support of organized
medicine was context-dependent, and would not necessarily be dependable over time. There was not a deeply held
commitment to OER in the policy or professional community that could be relied on when the value of this activity was re-examined by Federal policymakers.
Taken together, these themes highlight AHCPR/COER's current and future challenge. The substance of the work
undertaken by this Agency is analytically complex, and often apparently resistant to easy translation for policymakers and
clinicians. Despite this complexity, expectations continue to be very high that research done by the Agency will have
clear, measurable impact on health care quality and costs.
Definition of OER
The terms "outcomes research" and "effectiveness research" have been used to refer to a wide range of studies, and there is
no single definition for either that has gained widespread acceptance. As these fields evolved, it appears that "outcomes
research" emerged from a new emphasis on measuring a greater variety of impacts on patients and patient care (function,
quality of life, satisfaction, readmissions, costs, etc). The term "effectiveness research" was used to emphasize the contrast
with efficacy studies, and highlighted the goal of learning how medical interventions affected real patients in "typical"
practice settings (OTA, 1994). Effectiveness studies sought to understand the impact of health care on patients with
diverse characteristics, rather than highly homogeneous study populations. While the terms may have different initial
roots, there does not appear to be much value in distinguishing these activities, and the field is generally referred to as
OER.
For purposes of this paper, we have adopted the following definition:
OER evaluates the impact of health care (including discrete interventions such as particular drugs, medical devices, and procedures as well as broader programmatic or system interventions) on the health outcomes of
patients and populations. OER may include evaluation of economic impacts linked to health outcomes, such as cost-effectiveness and cost utility. OER emphasizes health problem- (or disease-) oriented evaluations of care delivered in
general, real-world settings; multidisciplinary teams; and a wide range of outcomes, including mortality, morbidity,
functional status, mental well-being, and other aspects of health-related quality of life. OER may entail any of a range of
primary data collection methods and secondary (or "synthetic") methods that combine data from primary studies (Mendelson et al., 1998).
Technically, studies that describe patterns of care without reporting "outcomes" might be more appropriately called health
services research rather than OER. For example, a study that shows rates of cardiovascular procedures vary by race or
gender might be an OER study if it also reported mortality for these demographic groups, but not if it reported
utilization patterns alone. For this report, we consider descriptive studies of patterns of care to be part of the spectrum of
OER studies. This is in part because these studies often provided the initial "map" that made subsequent outcomes studies
possible, and in part because a focus on variations in practice was a critical stimulus for identifying important topics for
further studies that did explore outcomes.
Data sources
A number of resources were used to develop this report, including: review of published articles from grants; review of
original conceptual papers that launched the "outcomes movement," as well as critiques of those papers; an analysis of private
sector involvement in OER conducted by Lewin; a survey of Principal Investigators (PIs); interviews with selected PIs;
interviews with a former Director of the Center for Medical Effectiveness Research, AHCPR; discussions with COER staff; and
recommendations made by investigators and stakeholders at two expert meetings (January and October 1997). A framework for
assessing the impact of funded studies was developed and used in the analysis, along with selected "all star"
case studies of specific projects.
Framework—From Research Findings to Clinical Excellence
Description
One major impetus for developing this report on the status of OER was the need to translate a growing body of research
into relevant insights for policymakers (public policymakers, systems leaders, and clinical policymakers). In addition, the
challenge of identifying evidence of impact on clinical practice, which often occurs long after the grant support has
concluded, stimulated a clear need to determine where and when AHCPR-supported OER has influenced practice. This led
to some careful thinking about the different types of results, or impact, that are prompted by OER. A framework was
developed that outlines an idealized process by which basic findings in OER are linked over time to increasingly concrete
impacts on the health of patients. This framework was then developed into a more detailed conceptual diagram (Figure 1). Examples of the various levels are provided in Table 1.
Level 1 impacts: All effects of research studies that do not represent a direct change in policy or practice. This includes new tools and methods for research, instruments and techniques to assist clinical decisionmaking, and studies that identify areas in which scientific knowledge is needed. For example, some studies have produced analytic tools for use in other research or clinical practice, such as the VF-14 (the Visual Function-14 measure), the benign prostatic hyperplasia (BPH) symptom index, and severity adjustment methods such as the Total Illness Burden Index. Level 1 impacts are also produced when studies describe findings inconsistent with current clinical paradigms, and stimulate rethinking and questioning within a clinical specialty.
Level 2 impacts: A policy or program is created as a direct result of the research (e.g., use of the information by health
plans, professional organizations, legislative bodies, regulators, accrediting organizations, etc.).
Level 3 impacts: A change in what clinicians or patients do, or changes in a pattern of care. Level 3a impacts are those
that are demonstrated in a limited study population as a result of a specific intervention. Level 3b impacts are trends
identified outside a formal research context.
Level 4 impacts: Actual impact on health outcomes (clinical, economic, quality of life, satisfaction). Level 4a impacts are
those demonstrated in a limited study population as a result of a specific intervention. Level 4b impacts are those
identified outside a formal research context.
Table 1. Levels of Impact and Examples
Level 1: Impact on knowledge base, future research.
- Adverse drug events occur in 6.5 percent of admissions and result in additional length of stay of 2.2 days and costs of $3,244 (Bates et al., 1995; Bates et al., 1997).
- In a study of long-term outcomes following lumbar disk surgery, a literature synthesis showed that there was better short-term relief with surgery than conservative care but after 4 years, outcomes were similar. Ten percent of patients underwent additional surgery (Deyo and Patrick, 1995).
Level 2: Impact on policies and change agents.
- Children receiving less expensive antibiotics for otitis media did as well or better than those receiving more expensive antibiotics. Led to development of guidelines by the American Academy of Pediatrics recommending less expensive antibiotics and to a HEDIS quality measure (Berman et al., 1997).
- PTCA mortality is related to volume of procedures performed by the cardiologist and the hospital. Led to recommendations by the American College of Cardiology (ACC) and American Heart Association (AHA) to raise volume requirements for cardiologists (Hannan et al., 1997; Hirshfeld et al., 1998).
Level 3: Impact on clinical practice.
- Dissemination of information about indications for antenatal corticosteroids increased their use from 20 to 70 percent of appropriate cases; the increase was significantly greater in hospitals with active than passive dissemination efforts (Goldenberg, 1998).
- Developed VF-14 measure to assess indications for and outcomes after cataract surgery. Replaced visual acuity as gold standard. Now routinely used by ophthalmologists and required by the National Eye Institute for sponsored research (Steinberg et al., 1994).
Level 4: Impact on patient outcomes.
- Data feedback, training in continuous quality improvement, and visits to other medical centers improved CABG mortality by 24 percent (O'Connor et al., 1996).
- The Pneumonia Severity Index was used to triage patients with community-acquired pneumonia to inpatient or outpatient therapy. Patients triaged to outpatient care were more satisfied with their care and returned to work and usual activities more quickly. Outpatient care was safe and resulted in measurable savings (Fine et al., 1994).
Implications
According to this model, impacts at lower levels may be prerequisites for achieving impact at higher levels. Improvements
in outcomes (level 4) are built on a foundation of studies that have identified problems, created new analytic and
measurement tools to explore those problems, and compared different approaches to managing the problem. A framework
for evaluating the success of OER provides a context for linking progress in basic studies with changes in practice and
improvement in outcomes. It is a conceptual model rather than a literal step-by-step description of the OER process. The
process from scientific knowledge development to its practical application will rarely be as systematic and orderly—or
strategic—as described in this framework. The main purpose of this framework is to emphasize the relationship between
research that does not directly induce or document changes in patterns of care or improved outcomes and subsequent
improvements in population health.
The levels of impact also make clear the challenge facing AHCPR in conveying to policymakers the value of OER.
Judgments about the value of OER will depend heavily on the level of impact expected by sponsors, stakeholders, and
policymakers. That perspective will determine whether level 1 or 2 impacts are understood to be important contributions
or another example of wasted Federal research dollars.
Level 1 impacts clearly make a contribution to the health care knowledge base. For example, extensive literature reviews
were done by the low birthweight Patient Outcomes Research Team (PORT) to document the lack of evidence of benefit for numerous popular interventions in
pregnancy (NEJM, August 1998). This would be classified as a level 1 impact, but it is potentially of considerable importance:
it allows certain practices to be discouraged and an appropriate research agenda for promising but unproven
interventions to be started.
Documentation of level 2 impacts provides suggestive evidence that a change in a health outcome will result, but may still
be viewed as inadequate by policymakers. For example, the inclusion in HEDIS of the rate of post-myocardial infarction beta-blocker use in the
elderly will probably lead to greater use of that therapy. Randomized studies have already demonstrated that mortality will
decrease with use of beta-blockers in these patients. In this case, the level 2 impact (a policy change by National Committee for Quality Assurance [NCQA]) is very
likely to prompt improvement efforts in many organizations. Past experience suggests that the introduction of new quality
measures is associated with successful interventions to alter practice, but this connection may not be self-evident to key
decisionmakers.
Even when level 3 or 4 impacts are observed, it is rarely straightforward to link these changes in health care practice or
outcome to studies that may have contributed to them. Other factors will usually need to be identified to provide an
adequate explanation for why a specific health improvement occurs. The complexity of the process of health care
decisions, and the health care system itself, ensures that any change in that system will be the consequence of multiple
interrelated factors, some of which are controllable, and others of which are not. For example, the 50 percent reduction in
the rate of prostate surgery over the past decade is associated with several important changes in knowledge about and
treatment of prostatism (such as new drug therapy as well as improved understanding of the risks and benefits of
treatment). While many observers credit the PORT investigators with the trend toward less aggressive surgical
management of benign prostate disease, it is challenging to isolate the role of an individual factor when many forces are at
work simultaneously.
The impact framework draws upon and highlights several observations about the process by which OER builds upon itself
and influences health care policy and practice. First, it helps to explain the incremental nature of the process that begins
with discovering which questions to ask and ends with improved health outcomes. A better understanding of this process
by researchers and Agency staff may lead to better explanations of the value of research with lower-level impacts.
Second, the framework portrays a process of change that is more complex and subject to external influences than was
understood when AHCPR was established. The simplified view widely held in 1989 was that clinicians directly
incorporated new research findings into practice, and were therefore the primary audience for OER. Changes in the
external environment of health care have made it clear that clinicians are influenced by multiple factors and forces, and that
information is necessary but not sufficient to influence behavior (Davis et al., 1995). Level 2 of the framework identifies a
number of "change agents" through which research findings may be transmitted to decisionmakers. This reflects the
growing recognition that the policies, organization, financial arrangements, and other features of health care organizations
play an important role in the translation of research into practice. In order to have practical value, research on medical
effectiveness must be designed to capture the heterogeneity of organizations.
The first wave of OER studies sought to understand the relationship between patient characteristics and outcomes. It is now clear that understanding the
relationship between characteristics of health organizations and outcomes is another requirement for producing useful
effectiveness studies. Figure 2 provides a conceptual model that blends the levels of impact with this understanding of
the importance of organizations.
Third, the complexity of the change process ensures that any beneficial changes that do occur will be difficult to trace back
to OER studies that may have contributed to them. The Agency needs to be more purposeful about identifying high level
impacts and tracing them back to related OER projects.
Fourth, the framework begins to provide some guidance for considering strategies to improve impact, and to assess any
improvements that occur. While the process is long and complex, an important challenge for AHCPR is to shorten this
time frame. The framework offers one roadmap for researchers and funders to use when planning new studies and
implementation efforts. Coordinated strategies to achieve level 3 and 4 impacts should be formulated early and updated
often as part of the research process.
Finally, by clarifying different types of "impact" that occur at different levels, the framework defines accountability for the
OER enterprise. We can identify what investigators mean by impact, and what policymakers understand impact to be.
While these definitions may have differed in the past, the more detailed understanding of "impact" should help focus
discussions of what still needs to be achieved.