Nobody familiar with American medical care in the 21st century should be surprised that a 73-year-old woman can be minutes away from getting a painful collapsed vertebra filled with liquid plastic and it’s impossible to say whether the procedure works, or how.

It may be that Marcia Henry could get as much relief from injections of local anesthetic, from physical therapy or just from more time to heal as she will from the $3,137 “vertebroplasty” she’s about to undergo at the University of Virginia Medical Center in Charlottesville.

“The studies have been contradictory. Which one trumps which one? We don’t know,” says interventional radiologist Mary E. Jensen as she sits in a dimly lit X-ray viewing room and watches a colleague lay out a tray of instruments in a procedure suite next door. “It leaves the treating physician in a dilemma.”

American medical care is rife with such treatments, whose usefulness is uncertain not just to the doctors who deliver them but also to the patients who receive them.

These days, however, many people are pinning their hopes on “comparative effectiveness research” as a way to solve the dilemma of how best to treat this and hundreds of other common problems in day-to-day medicine.

“What’s remarkable is how much we do with so little evidence to support what we do, especially when it comes to the patient right in front of us,” said Harlan Krumholz, a 53-year-old cardiologist and researcher at Yale University.

Comparative effectiveness research goes beyond the basic question — “Is this safe and effective?” — that must be answered before a new drug or device goes on the market. Instead, this emerging field tries to determine where a drug, a procedure, a test or a therapeutic strategy fits into the world of what’s already available and being used.

Are pregnant substance abusers more likely to get sober if they’re treated as inpatients or outpatients? Which is better for staving off dementia in the elderly, regular exercise or brain-teaser games? What are the best strategies for treating high blood pressure in African Americans? Which is the better way of diagnosing kidney stones in the emergency room, ultrasound or CT scan?

These are among hundreds of questions being addressed by comparative effectiveness research studies now underway and funded by $1.1 billion in the Obama administration’s 2009 economic stimulus package. The purpose isn’t to declare hands-down winners (although that occasionally happens). It’s to provide practical guidance when there’s more than one reasonable option.

“For us, ‘true north’ is really what clinicians and patients need to know to make the best possible decision,” said Carolyn M. Clancy, director of the federal Agency for Healthcare Research and Quality, which this year is spending about $21 million on comparative effectiveness studies.

This has never been a high priority for the country or its scientists.

Only 1.5 percent of money spent on medical research goes to “outcomes research,” of which comparative effectiveness is a sub-category. About 13,000 new clinical studies start up each year; about 112,000 are running now. A meticulous search in 2008 revealed only 689 studies that fit the general description of “comparative effectiveness.” Many experts believe that’s not enough.

The Obama administration created a permanent stream of funding for comparative effectiveness research by establishing, as part of the Patient Protection and Affordable Care Act, an independent entity called the Patient-Centered Outcomes Research Institute, or PCORI.

The institute’s duties are to establish national priorities for this type of research, with input from patients, doctors, scientists, public-health officials and representatives of the health-care industry. It will eventually have about $550 million a year, provided by the federal government, to pay for studies and disseminate results. PCORI opened its office in Washington last month.

But not everyone is happy with this activity. Some critics see comparative effectiveness research as a Trojan horse that will eventually bring government control and rationing to every hospital and clinic in the land. Even if that doesn’t happen, they’re worried that “cost effectiveness” — a different concept, one that puts a price tag on patient outcomes such as illnesses averted or years of life saved — will make its way into medical decision-making.

There’s little doubt that an awareness of cost — a desire not to waste money — runs implicitly under much of comparative effectiveness research. President Obama acknowledged as much when he asked a rhetorical question while promoting his health-care overhaul bill:

“If there’s a blue pill and a red pill, and the blue pill is half the price of the red pill and works just as well, why not pay half price for the thing that’s going to make you well?”

‘At my wits’ end’

Marcia Henry, the woman waiting for the vertebroplasty, knew she had osteoporosis and took a drug to combat it for seven years. Nevertheless, two years ago she began having back pain that didn’t go away. As it got worse she dropped activities one by one — skiing, walks on the beach, gardening, even some household chores.

In January she got a spinal fusion in her lower back. It made her feel better. But as often happens, the operation shifted mechanical stress to other levels of her spine. Sometime in the spring she suffered a compression fracture of a vertebra. As the pain got worse, one of her physicians suggested vertebroplasty. She agreed.

“By today I was at my wits’ end,” she says as she lies on a bed in the hospital waiting for the procedure to begin.

Under a fluoroscope that provides real-time X-ray video, her spine looks like a tall ship in the fog, top-heavy with ghostly sails and spars, heeling over slightly.

Over the course of an hour, interventional radiologist Avery J. Evans and Derek Kreitel, a doctor training under him, anesthetized the fractured vertebra and gently hammered a large-bore needle into it. They then injected polymethylmethacrylate, an acrid-smelling plastic used to make the face shields of motorcycle helmets and dozens of other products.

The plastic heats up as it hardens; one theory is that it damages nerves, blocking the pain. Another theory is that the hardened plastic restores the vertebra’s weight-bearing ability. But studies show that neither the amount of plastic injected nor its location predicts whether a patient will get relief.

Vertebroplasty was invented in France. The first one in the United States was done by Jensen, 53, in 1993. Evans, 50, designed much of the equipment that was spread out on a sterile sheet on a table next to the patient.

The cost of the procedure ranges from about $3,000 to more than $14,500, depending on who pays and whether it’s done inside or outside a hospital. Medicare, the federal health insurance program for the elderly, pays for most of them.

About 79,000 vertebroplasties and “kyphoplasties” — a related procedure in which a balloon is inflated in the fractured bone to make more room for the plastic — were done in the United States in 2008, the most recent year for which data have been published. The total cost was $907 million, according to a recent article in the New England Journal of Medicine.

How well does it work? That depends on how you try to answer the question.

A Dutch study randomly assigned patients who had had back pain for about a month to get either vertebroplasty or standard treatment (mostly painkillers). Those who got the procedure had less pain a month later and a year later. But the researchers noticed something curious: More than half the patients they initially screened for the study didn’t have enough pain a couple of weeks later, when it came time to randomly assign them to one treatment or another. That suggested that if someone guts out a vertebral fracture for a month or so, the pain may either disappear or become tolerable.

Two other studies compared real vertebroplasty to fake vertebroplasty. (The latter procedure involved injecting local anesthetic around the damaged vertebra but never putting plastic in the bone.) The fake procedure worked as well as the real one. The patients in those two studies were a little different from the Dutch patients; they had had their pain longer, more than three months on average. Another study assigned patients to kyphoplasty or usual treatment; kyphoplasty was better.

The placebo effect, especially for procedures, is very strong. Evans thinks it plays a big part in many of the dramatic recoveries he has seen.

“The local anesthetic allows patients to get up and become active, at which point the placebo effect kicks in. The placebo effect heals people,” he said.

Jensen also thinks the placebo effect plays a part but doesn’t explain everything. A study showed that vertebroplasty worked well in demented patients, who wouldn’t be expected to anticipate benefit.

“So there is something going on that has not yet been elucidated,” she said.

There are so many things waiting to be elucidated that some researchers and organizations are trying to get the ball rolling.

One is the nonprofit Center for Medical Technology Policy. In early April, 30 people — researchers, practitioners, people from the medical-device companies, representatives of the Food and Drug Administration and Medicare — gathered at its offices overlooking Baltimore’s Inner Harbor. The purpose was both diplomacy and strategy.

Could all of the parties agree that the field of vertebral augmentation was muddled? Would it be possible to design a comparative effectiveness study or two that might bring clarity to this billion-dollar corner of American medicine?

“We need to go beyond arguments about the ‘quality of the evidence’ and start deciding how we are going to do better in the future,” Sean R. Tunis, the physician who heads the center, gently chastised the attendees. “I think we’ve got to own this problem.”

Several people commented that what was remarkable was the rarity of such a meeting. In American health care, nobody really owns the problem of figuring out what works best.

Red, blue and purple

But do people really want to know the answer? That’s a good question, too.

Nobody is publicly in favor of not finding out what works best. And almost everyone agrees that uncertainty can waste money. At the same time, uncertainty can be very profitable.

That’s the lesson of the heartburn drug Nexium, which Obama could have cited in his red-pill/blue-pill parable.

Nexium suppresses stomach acid and is used to treat ulcers and acid reflux. The drug company AstraZeneca brought it out in 2001 when Prilosec, its other acid-suppressing drug, was about to lose its patent protection and go generic. The active ingredient in the two drugs is the same. Prilosec is the chemical omeprazole in two mirror-image molecular forms. Nexium, whose generic name is esomeprazole, contains only the more “active” of the two forms.

Although Nexium (“The Purple Pill”) was marketed as the newest and best treatment for heartburn, many experts believe it offers little or nothing over generic omeprazole. There’s never been a clinical trial of the two drugs at equivalent doses that showed Nexium was better at relieving symptoms.

“It’s uncertainty that has allowed the premium pricing,” said Tunis of the Center for Medical Technology Policy. “Through marketing you could create the impression there were important differences.”

The strategy worked. Nexium captured a huge piece of the heartburn market that might have gone to its generic twin. In 2010, the value of “Purple Pill” sales ($6.2 billion) was seven times as high as the value of generic omeprazole prescriptions ($858 million), according to IMS Health, a health-care information company.

But comparative effectiveness isn’t just a threat to “me-too” drugs such as Nexium. Some people see it as a threat to something much bigger and more important — “personalized medicine.”

Personalized medicine envisions a time when treatments are customized to patients’ age, sex, genes, race and ethnicity as well as the molecular profile of their cancers, their risk for other ailments and the drugs they’re taking. It’s medicine tailored to one patient. Comparative effectiveness, on the other hand, involves drawing conclusions from the experience of groups of people. It requires making generalizations.

Clancy, of the Agency for Healthcare Research and Quality, doesn’t see a conflict.

“Being able to be more precise about which subgroup of patients will do better on which intervention — I see that as the first steps on the path to personalized medicine,” she said.

And in the end . . .

Did vertebroplasty work for Marcia Henry? She’s not entirely certain.

“It took a couple of days for me to realize that probably it did work,” she said several weeks after the procedure. “I still had a pain, but it probably was not from the fracture. It was a different pain.”

Would she recommend it to someone else? “I’d say go ahead and do it.”

Is she able to be more active? “Not much more,” she said.

Is there a better way than this?

That’s the question with no answer.