Last summer, Maryland won a $250 million federal grant with a promise to build a model to evaluate teachers and principals that would be “transparent and fair” and tie their success for the first time to student test scores and learning.

Now, the state that prides itself on cutting-edge practices and top-in-the-nation schools is struggling — along with every state or school system that has ever tried — to come up with a reliable formula for improving the teacher workforce and rooting out the lowest performers.

Bogged down by political infighting, large gaps in technical know-how and regulatory hurdles, Maryland recently applied for a year’s extension to fully execute the evaluation system it has yet to develop.

“We knew this was going to be very difficult,” said state Superintendent of Schools Nancy S. Grasmick, who is requesting that the evaluations not carry consequences for teachers and principals until 2013-14, so schools will have more time to train and experiment. “If it rolls out too soon, it won’t be done well, and there will be reactions from teachers that this is a half-baked idea.”

Eleven states and the District of Columbia mapped out dramatic plans for school improvement last year to win shares of the $4 billion Race to the Top fund that has propelled the Obama administration’s reform agenda and inspired a national wave of changes to teacher tenure laws and evaluations.

Now the states that volunteered to be models are feeling the strain of delivering on their promises within ambitious timelines. And the architects of these high-profile experiments, which were forged amid intense political debate, are bracing themselves for heat.

A Maryland council composed largely of school administrators, union members and politicians was convened in August and charged with developing a model evaluation system.

But the group quickly encountered the kind of questions that are vexing school systems nationwide: What is an effective teacher? Can standardized tests for students be fair measures for teachers? What can be used in place of tests in classes like kindergarten and music that don’t usually have them? And how do you isolate the impact of one teacher when students work with specialists or outside tutors?

The council’s initial December deadline was pushed back until June. A final plan is expected to be unveiled at the end of the month.

Although specifics are still unclear, an outline has emerged: Fifty percent of an evaluation will be based on student growth, including test scores and other measures, and the other 50 percent will take into account professional skills, such as how well teachers understand their subjects and how they interact with students and families.

In addition to standardized tests, districts will be able to choose from a list of state-approved options — including, potentially, portfolio-style tests and classroom observations. They also can develop some measures on their own.

Maryland’s request for more time is one of dozens of proposed Race to the Top amendments pending with the U.S. Department of Education. The agency has approved more than 100 changes, mostly minor adjustments to budgets or timelines.

Reformers have sought for years to strengthen teacher evaluations and raise standards for the profession. But some are concerned that false steps could set back the movement, just as high-profile efforts a generation ago to promote merit pay and common academic standards faltered.

“States are effectively having to make a 180-degree turn in their teacher policies . . . and the tools we all need to do this fairly are at best in the 1.0 stage,” said Kate Walsh, president of the National Council on Teacher Quality and also a member of the Maryland State Board of Education. “We could really mess this up.”

Those with the most at stake are teachers, whose employment and, in many cases, salaries will be tied to the new plans. Unions in Maryland and across the country are skeptical about the imperfect science of linking test scores to their work.

“I would hate for anyone’s job to be lost because we are building a plane and flying it at the same time,” said Cheryl Bost, president of the Teachers Association of Baltimore County and a member of the council that is developing the plan.

As states develop new teacher evaluation systems, they also are upgrading academic standards, installing sophisticated data systems and awaiting next-generation assessments. Eventually, all will shape what teachers do and how their performance is measured.

Leadership changes in Maryland and other states that won Race to the Top grants are adding uncertainty and pressure. Grasmick is retiring at the end of this month. The Florida schools commissioner is resigning and Tennessee has a new schools chief.

Amid the flux, crushing timelines have been written into grants and, in some cases, state law. School districts in Florida are scrambling to design new evaluations by next fall that will link up to 50 percent of teacher evaluations (and by 2014, their salaries) to test scores. And in New York City, school officials plan to rapidly develop a slew of new tests, so there will be more tools to measure how much students learn.

Select schools in seven Maryland systems, including Prince George’s and St. Mary’s counties, are also on deck to pilot the new state evaluation system this fall.

But even after a formula is devised, any new evaluation system still has a legal hurdle to clear. Last fall, a Maryland legislative committee ruled that the federally approved plan to link 50 percent of a teacher evaluation to student growth does not comply with a state law that says student growth can be a “significant part” of the evaluation but should not exceed 35 percent.

The effort also faces skepticism or outright opposition from some important players. School officials in Montgomery and Frederick counties last year refused to sign on to the state’s Race to the Top plan.

Montgomery Superintendent Jerry D. Weast, who is retiring this month, said the county is not interested in overhauling its current evaluation system, which is already considered a national model. “Why would we scrap something that is working for something that has not been built?” he said.

Efforts to craft evaluation formulas in various cities and states face unique political challenges. Florida plans to rely more on test scores, while Colorado is including more subjective measures. Ohio’s evaluation plan is tangled in a heated legislative budget debate over merit pay. And plans in New York will be subject to collective bargaining.

The District has had a teacher evaluation model in place for two years that relies on student test scores, where available, as well as structured classroom observations by administrators or mentors.

The D.C. model, which has drawn controversy, shows why it is important to win teacher support and thoroughly vet a system before making high-stakes decisions about salaries or termination, said Elena Silva, a senior policy analyst at the Washington-based think tank Education Sector.

Good evaluations, she said, “are not developed overnight.”