Peter Angelos remembers being called to perform surgery on a patient whose life was in peril. "If I don't operate, the chance that the patient dies is 100 percent," said Angelos, director of endocrine surgery at Northwestern Memorial Hospital in Chicago. "If I do intervene, the chance of his dying is still 98 percent. If the patient dies on the table, did I cause his death, or was I just unable to save him?"

Unlike other physicians, surgeons are inextricably intertwined with their results. The internist is rarely blamed when the patient with coronary artery disease has a heart attack, even if he failed to place the patient on daily aspirin therapy, a proven way to prevent cardiac events. The incident is attributed to the disease process, not to the internist's oversight.

But when a patient dies in a surgeon's hands, the surgeon is often viewed as an active participant in the death. As medical sociologist Charles Bosk writes in his book "Forgive and Remember" (University of Chicago Press, 1981): "The specific nature of surgical treatment links the action of the physician and the response of the patient more intimately than in other areas of medicine."

This link -- combined with a rising trend to gather and publish outcome data on various medical procedures -- may affect surgeons' decisions about whether to operate on critically ill patients. When the likelihood of survival is slim, it may seem more prudent, at least from the surgeon's perspective, to deem a patient "not a good surgical candidate" or "too sick for surgery" and let the person die outside the operating room rather than on the operating table.

But of course, high-risk procedures sometimes prevent death. How can we ensure that publishing outcome data doesn't discourage surgeons from taking high-risk cases?

Increasingly, outcome data are being made available to the public in the form of "report cards" that rate health care providers. The Maryland Health Care Commission, for instance, publishes a hospital guide on the Internet (http://hospitalguide.mhcc.state.md.us/index.htm). A Web enterprise called Healthgrades.com (www.healthgrades.com) gathers outcome and performance data on hospitals, nursing homes and other care providers. In 1996, the Pennsylvania Health Care Cost Containment Council, a state agency, published a consumer guide to coronary artery bypass graft surgery, listing each surgeon's mortality rate.

A study that appeared in the New England Journal of Medicine in 1996 surveyed a large sample of Pennsylvania's cardiologists and cardiovascular surgeons to determine how the publication of those statistics affected the delivery of medical services.

The most disturbing finding, according to the authors, was that cardiovascular specialists believed that access to care for very ill patients had decreased due to the publication of these report cards. Fifty-nine percent of cardiologists reported that it had become much more difficult to find a surgeon willing to operate on the most severely ill patients. Likewise, 63 percent of cardiovascular surgeons reported that they were less willing to operate on the most severely ill patients.

"There are many institutions that would not touch a sick patient because they want to keep their mortality rates low. This happens, unequivocally, all the time," said physician Shukri Khuri, chairman of the National Surgical Quality Improvement Program of the Department of Veterans Affairs (VA). "At the VA, we see those patients who are denied surgery [elsewhere] because they are high-risk. Programs frequently select low-risk patients in order to keep their mortality rates low," Khuri said.

"This will happen more often if poorly constructed outcome data becomes the norm."

The Raw Facts

Outcome data are the linchpin of "quality assurance," the movement to monitor the results of treatment delivered by individual providers and institutions. Quality assurance is driven partly by insurance companies hoping to deliver health care at lower cost and partly by physicians seeking to employ a more evidence-based approach to medical care. The focus on outcomes is also endorsed by some consumer advocates, who argue that if accurate data about outcomes are available to patients, they will be able to choose providers more wisely, thereby increasing market pressure on providers to improve the quality of their care.

In theory, outcome data supply physicians, health plans and patients with hard facts instead of subjective measures such as reputation or word-of-mouth. But if the outcome data are fundamentally flawed, the consequences could be enormous: Physicians might be deterred from taking hard cases, and patients desperately in need of care might have nowhere to turn.

In the world of medical statistics, there are different ways to calculate survival. One is absolute survival, a figure, usually cited as a percentage, that reflects what proportion of patients receiving a particular procedure survive. The other, called "expected survival" or "risk-adjusted survival," takes into account each patient's characteristics -- underlying illness, age and lung function, for example -- that may affect his or her likelihood of surviving. Many experts believe surgery programs cannot be compared properly until all are viewed according to risk-adjusted criteria.
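To see the difference in miniature, consider a rough sketch of the two calculations in Python. The patients, outcomes and predicted risks below are invented for illustration; in practice the risk estimates would come from a validated statistical model, not a guess.

    def absolute_survival(survived):
        # Fraction of patients who survived, regardless of how sick they were.
        return sum(survived) / len(survived)

    def expected_survival(predicted_risk_of_death):
        # Average survival probability predicted by a risk model.
        return sum(1 - r for r in predicted_risk_of_death) / len(predicted_risk_of_death)

    # Ten hypothetical patients: 1 = survived, 0 = died
    survived = [1, 1, 1, 0, 1, 1, 0, 1, 1, 1]
    # Hypothetical model-predicted probability of death for the same ten patients
    predicted_risk = [0.05, 0.10, 0.40, 0.60, 0.08, 0.30, 0.50, 0.12, 0.20, 0.15]

    print(absolute_survival(survived))        # 0.80
    print(expected_survival(predicted_risk))  # 0.75

A program whose observed survival meets or exceeds its expected survival is doing at least as well as its patients' risk profiles predict, however grim its raw numbers look.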

Robert Kotloff, director of the Program for Advanced Lung Disease and Lung Transplantation at the University of Pennsylvania, says lung transplant programs must be cautious when they select patients because Medicare and private insurance companies use raw survival data to evaluate programs.

During its Medicare review, the University of Pennsylvania had to satisfy two markers of quality assurance. One was volume: transplanting at least 10 patients a year for two consecutive years. The other was survival: one-year and two-year absolute survival rates of at least 69 percent and 62 percent, respectively.

Absolute survival, according to Kotloff, is an inappropriate way to evaluate medical centers. "There is a difference between a center that transplants a large number of patients with pulmonary hypertension and one that primarily transplants patients with emphysema," he explained. Patients with pulmonary hypertension have poorer survival rates because the high pressure in their pulmonary vessels weakens the heart. Once a new, more elastic lung is transplanted, the heart usually bounces back, but sometimes the damage that has occurred is irreversible. Emphysematous lungs, on the other hand, remain flexible and do not place an extra burden on the heart; these patients have the highest survival rates of all patients with terminal lung disease. A lung transplant center could therefore improve its absolute survival numbers simply by transplanting more patients with emphysema and fewer with pulmonary hypertension.
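A back-of-the-envelope calculation makes the point. The survival figures below are hypothetical, but they show how two centers that treat each diagnosis equally well can post very different raw numbers simply because of whom they accept.

    # Hypothetical one-year survival by diagnosis, identical at both centers
    one_year_survival = {"emphysema": 0.85, "pulmonary_hypertension": 0.65}

    def overall_survival(case_mix):
        # Raw survival for a case mix given as {diagnosis: number of patients}.
        total = sum(case_mix.values())
        survivors = sum(n * one_year_survival[dx] for dx, n in case_mix.items())
        return survivors / total

    center_a = {"emphysema": 80, "pulmonary_hypertension": 20}  # mostly low-risk
    center_b = {"emphysema": 40, "pulmonary_hypertension": 60}  # mostly high-risk

    print(overall_survival(center_a))  # 0.81
    print(overall_survival(center_b))  # 0.73

Neither center treats any single diagnosis better than the other, yet by absolute survival center A looks clearly superior.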

"If you are just going to look at absolute statistics," Kotloff said, "you are going to create a system that would favor transplanting more technically straightforward, low-risk patients."

And in fact, a program's statistics do influence treatment decisions, according to Kotloff. "If you see that your survival rate is suboptimal because you are being aggressive [in selecting patients], you tend to be more conservative when you list your next 20 patients."

When the University of Pennsylvania started its lung transplant program, Kotloff said, it accepted riskier patients than it does now. A new program often has to fill its waiting list with patients who have been declined at other centers. "Our survival statistics early on were not as good as they are now," Kotloff said. "We became keenly aware of our survival statistics to ensure Medicare approval. You are judged by your survival statistics, so it colors your selection of patients, to some extent."

Kotloff says he never alters the seniority listing of his patients to skip over high-risk patients in favor of low-risk ones. Patients receive transplants in order of time accrued on the waiting list. But when a program feels vulnerable, the texture of a waiting list will change: It will contain fewer high-risk patients.

Absolute Survival

To funders of medical care, the reliance on absolute survival statistics seemed to make sense. According to one Medicare official, when developing standards in the early 1990s, agency staff felt that scarce organs should be given to patients who had the best chance of surviving surgery, and a center's absolute survival statistics would be an indicator of its quality. Medicare also did not want to compensate hospitals for undertaking high-risk surgeries. It looked skeptically upon risk adjustment, viewing it as a statistical manipulation to justify a program's decision to give an organ to someone who might die early.

Patients voiced concern to federal officials that surgical programs would refuse to treat severely ill patients to protect their statistics. And the premier transplant programs complained that as comparative outcome data became more available on the Internet, absolute survival data would lead patients to invalid conclusions about a program's quality.

Over time, Medicare's approach has changed. It is now considering risk-adjusting outcomes for lung transplantation and is in the process of developing new criteria for doing so. This dramatic change of direction will affect how lung transplant programs are evaluated by private insurance companies, which typically follow Medicare's lead.

Loyola University Medical Center, the largest lung transplant center in Chicago, did take risks by transplanting patients at a more advanced stage of their illness, according to Edward Garrity, medical director of the program. After Loyola's survival rate decreased, Blue Cross Blue Shield of Illinois dropped the program from its network of approved providers in 1998, leaving its 4.7 million subscribers with no in-state lung transplant option.

Blue Cross Blue Shield had drawn a line and Loyola fell below it. Studies by the United Network for Organ Sharing show that Loyola's one-year survival rate was 64 percent for transplants performed from 1995 to 1997, a time when the survival rate across the country was 73 percent. Since its inception in 1988, the Loyola program had always had survival rates at or above the national average.

When patients were listed at Loyola in the early 1990s, it was the only lung transplant program in Chicago, performing on average 30 lung transplants a year. But in the mid-1990s, five other transplant programs opened, increasing the competition for lungs at a time when the number of available lungs was decreasing across the country. Patients' time on Loyola's waiting list went from six months in 1994 to two years by the end of 1996.

Garrity said that because patients spent more time waiting for a lung, they were considerably sicker by the time surgery was possible, and therefore their absolute survival rate dropped. "We made our clinical decisions based on potential patient benefit, not based on managing our statistics."

Garrity said that Loyola always reassesses its waiting list, removing patients when it becomes apparent they likely would not survive surgery. But this is a gray area: It is not always easy to gauge when a patient is too sick to benefit.

Lung transplantation involves balancing risks and benefits. If doctors predict that a patient with severe lung disease has a 75 percent likelihood of being alive after two years without surgery, the patient should not be transplanted, Garrity said, because the risk of dying from transplantation is higher than the risk of dying from the underlying disease. A patient can become a transplant candidate when the risk of dying from the disease is greater than the risk of transplantation. These two lines cross, but doctors cannot always tell where.
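In rough numerical terms, the listing decision Garrity describes looks something like the sketch below. The 40 percent transplant mortality figure is hypothetical, and real decisions weigh far more than two probabilities.

    def favors_transplant(risk_of_dying_from_disease, risk_of_dying_from_transplant):
        # Listing makes sense only once the disease is the bigger threat.
        return risk_of_dying_from_disease > risk_of_dying_from_transplant

    # Garrity's example: a 75 percent chance of surviving two years without
    # surgery means a 25 percent risk of dying from the disease. Assume,
    # hypothetically, a 40 percent risk of dying within two years of a transplant.
    print(favors_transplant(0.25, 0.40))  # False -- too early to list
    print(favors_transplant(0.70, 0.40))  # True -- the disease is now the bigger risk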

"How much risk are you willing to take? How many pieces of risk make this patient undoable?" said Garrity. "That is a very hard equation to work out every day, but that's exactly what we are charged with. If you add other illnesses to the mix, like coronary artery disease or high blood pressure, the risks start adding up. But there is no cut-off number. It's not as precise as that."

In retrospect, Loyola's increase in mortality may have been a temporary aberration, not a true reflection of a decline in the program's quality. After 1998, Loyola's survival rates returned to their previous levels.

A decision is only as good as the data upon which it is based. In Loyola's case, its transplantation program was penalized because the data upon which it was judged did not reflect how sick its patients were.

Costs and Benefits

In 1986, the Health Care Financing Administration (HCFA) -- now known as the Centers for Medicare and Medicaid Services -- publicly released raw mortality data for coronary artery bypass grafting, commonly known as bypass surgery, causing an uproar in the cardiothoracic community. The Cardiac Surgery Committee, established to monitor surgical care within the VA, quickly realized that the raw statistics did not reflect the quality of care patients received. The committee feared this type of data would deter surgeons from accepting the most challenging patients.

"If surgeons are being judged by their outcomes, and outcomes do not reflect the risks patients bring to surgery, surgeons may feel conflicted. This system creates a negative incentive to operate on high-risk patients whose only chance for survival may be an operation," said Frederick Grover, a cardiovascular surgeon who has chaired the VA committee since its inception in 1985. "There is no doubt that every surgeon should be, and usually is, committed to doing the very best for the patient independent of how their data is reported. But, in the real world, consciously or subconsciously, mortality data sometimes influences decisions."

Most cardiovascular surgeons were convinced that HCFA's treatment of outcome data was an injustice to both patients and physicians. Surgeons would be penalized for taking high-risk patients to the operating room and high-risk patients would consequently be denied surgery.

Carmela Coyle, senior vice president of the American Hospital Association, says that HCFA's data held the potential to misinform consumers. "If you have a rare condition, you may need to be at that preeminent academic facility. That academic center may have a higher mortality rate, but it's because it treats people who are sicker and have unique conditions. HCFA's data wasn't very informative. It did not tell patients whether the higher mortality rate actually meant the hospital was worse."

Some hospitals that fared well on HCFA's list used the data in self-serving advertisements, so in the late 1980s, the VA and the Society of Thoracic Surgeons (STS) developed national cardiac databases to adjust outcome data for risk. More than 2.2 million cardiac patients have been entered into these databases, the largest undertaking of its kind in medicine. Approximately 450 of an estimated 700 cardiovascular practices nationwide participate voluntarily in the STS database.

The first step in this complicated process is to construct a model that predicts outcome. The most commonly used statistical method, logistic regression, produces a formula to calculate the probability of an outcome as influenced by "predictor" variables. Today, the VA database includes 10 predictor variables, while the surgeons' group database uses 25. A surgeon's adjusted death rate is calculated from these models and compared with his absolute death rate.
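A stripped-down version of that arithmetic, with invented coefficients, predictor variables and patients, shows how an observed-to-expected ratio is produced. The real VA and STS models are fit to hundreds of thousands of cases and are far more elaborate.

    import math

    INTERCEPT = -3.0
    # Hypothetical coefficients for three predictors:
    # [age over 75, diabetes, poor heart contraction]
    COEFFICIENTS = [1.2, 0.8, 1.5]

    def predicted_mortality(risk_factors):
        # Logistic model: probability of death = 1 / (1 + e^-(b0 + sum of b_i * x_i))
        z = INTERCEPT + sum(b * x for b, x in zip(COEFFICIENTS, risk_factors))
        return 1.0 / (1.0 + math.exp(-z))

    # Each entry: ([age over 75, diabetes, poor contraction], 1 if the patient died)
    patients = [
        ([1, 1, 1], 1), ([1, 1, 1], 0), ([1, 0, 1], 0), ([0, 1, 1], 1),
        ([0, 1, 0], 0), ([1, 0, 0], 0), ([0, 0, 1], 0), ([0, 0, 0], 0),
    ]

    expected_deaths = sum(predicted_mortality(rf) for rf, _ in patients)
    observed_deaths = sum(died for _, died in patients)
    print(round(observed_deaths / expected_deaths, 2))  # about 0.81

A ratio below 1.0 means fewer patients died than their risk factors predicted -- which is how a surgeon who takes the hardest cases can still show good results.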

Michael Frank, a cardiovascular surgeon at Evanston Northwestern Healthcare in Illinois, says the STS computation has allowed him to take on the tough cases.

Early in his career, Frank said, he made an ethical decision to use his surgical skill to improve the lives of the sickest patients. Studies show that patients who have three narrowed coronary arteries (vs. one or two), poor contraction of the heart and diabetes have the most to gain from bypass surgery, but also the most to lose because of possible cardiac events during and after surgery. These patients make up more than half of Frank's practice, a larger share than most heart surgeons'. Most young surgeons hesitate to operate on these very ill patients because too many deaths too early could impede their careers. A high mortality rate could make it hard for a surgeon to join reputable programs and it could increase malpractice insurance rates. Likewise, taking on tough cases could expose a surgeon to a greater risk of litigation.

Frank's two-year absolute mortality rate was on par with the national average. However, his risk-adjusted mortality rate indicated he was providing superior surgical care: His ratio of observed to expected mortality was 0.39. In other words, of the patients who, because of their risk factors, would have been expected to die, only about 4 in 10 actually did.

"A good report in the risk-adjusted category empowers me to continue taking on high-risk patients. It is positive feedback," Frank said.

Building the Database

Risk-adjusted data yield essential information that helps surgeons tailor the care they provide and take the risks they are capable of taking.

Grover, the VA committee's chairman, said surgeons can also use databases to identify areas for improvement. For instance, St. Peter's Hospital in Albany, N.Y., had a higher than expected surgical mortality for coronary bypass from 1990 to 1992.

A detailed analysis by the hospital revealed that the excess mortality came from patients undergoing emergency bypass surgery. Surgeons discovered that they were taking unstable patients to the operating room too quickly. Patients who were less than six hours from a heart attack or in shock did not fare well. The staff responded by stabilizing these patients for at least 24 hours prior to surgery, resulting in no deaths the following year for emergency bypass.

The VA and STS have a tradition of using their databases for research. The databases offer surgeons a wealth of information that they can analyze to prove or disprove observations they gather in the trenches.

Yet risk-adjusted outcome -- the holy grail of quality assurance -- does have its limitations. When the New York State Department of Health pioneered risk-adjusted report cards to evaluate the quality of cardiovascular care, officials noticed that doctors suddenly started reporting more risk factors. Indeed, at one hospital the reported prevalence of chronic obstructive pulmonary disease (COPD) increased from 1.8 percent of the patient population to 52.9 percent, while at another hospital the diagnosis of unstable angina went from 1.9 percent to 20.8 percent. While it is possible that surgeons simply began reporting actual risk factors more thoroughly in light of the new system, they were widely suspected of inflating risk factors, a tactic known as "gaming," to make their results seem stronger.

Many parties use billing codes and Medicare forms to collect outcome data, even though such sources of information have frequently been shown to be inaccurate and incomplete.

Databases also cannot capture every possible risk factor. They do not assess how a patient's quality of life influences mortality, even though surgeons know it has an impact. Some coronary bypass patients have rare diseases, like leukemia or clotting disorders, that are too uncommon to be accounted for statistically. Likewise, in the very high-risk group (patients with a predicted operative mortality of 75 percent or more), there are not enough cases to make accurate predictions.

Another drawback is that the STS database, which is not accessible to consumers, is a costly endeavor. A surgical program or physician group may pay as much as $100,000 a year to run it.

Even in light of these challenges, Coyle says the health care community has an obligation to provide meaningful, comparable data that truly reflect the performance of providers.

"Risk adjustment" she says, "makes certain we have ruled out anything else that may move an outcome in one direction or another."

Behold the List

Americans are fascinated by lists, an obsession that the media feeds by ranking every aspect of our lives, from restaurants and movies to universities and vacation retreats. It's not surprising that the media ranks the "who's who" of health care providers.

In its annual "Best Doctors" issue, New York magazine teams up with the medical research firm Castle Connolly and asks New York physicians to "choose their worthiest peers." Castle Connolly sends questionnaires to 8,000 physicians asking, "To whom would you send a member of your family?" Washingtonian magazine uses a similar methodology. Nationally, reputation also earns a physician a mention in books like "America's Top Doctors."

Outcome data that do not account for underlying illness are probably no more accurate an indicator of treatment quality than are these subjective surveys of peers. In fact, in their potential to rank good doctors as bad and bad doctors as good, raw outcome data may be a good deal worse.

While it would probably be considered preposterous for a respectable health insurer to remove a hospital as a preferred provider because it did not make U.S. News and World Report's "America's Best Hospitals" list, insurance companies do not hesitate to do so based on raw outcome data.

A recent study of 44 hospitals demonstrated how centers that ranked lowest by raw mortality moved to the top of the list once risk-adjusted methods were applied.

"The travesty is that you are not passing the right judgment," said the VA's Khuri. "There are many organizations grading hospitals based on unadjusted or poorly adjusted outcomes. You then disenfranchise quality institutions because of incorrect data. In essence, you deny patients care at superior institutions that appear suboptimal because of incomplete data."

Lisa Iezzoni, professor of medicine at Harvard and board member of the National Quality Forum, a public-private organization that focuses on improving the quality of health care, believes outcome data should be used constructively, not punitively.

"Outcome data is invaluable when it motivates physicians to be introspective and evaluate processes of care. But I am concerned when third-party payers use outcome data to punish health care providers, because this type of data does have inherent limitations."

All of the participants in the health care system -- HMOs and third-party insurers, physicians and patients -- require meaningful information about quality of care. If the data being collected and analyzed are incomplete, the conclusions drawn from them are not meaningful and can produce disastrous consequences. Entire patient populations can be left without options. Good doctors can make bad decisions in order to manipulate data that are not truly connected to quality.

A movement designed to protect patients and improve the quality of care can wind up having precisely the opposite effects.

*

Jennifer Obel is a resident at a Chicago hospital. She last wrote for the Health section about performing CPR on terminal patients.