My wife sometimes shows me paint samples. I tell her they all look fine to me. That was the way most school districts assessed teachers until recently, when many critics, including me, said that wasn’t good enough.

School districts, particularly in the Washington area, are now spending much time and money building complicated systems to identify the worst and best teachers, and some gradations in between. They are finding this hard to do. I am beginning to wonder if it’s worth so much effort.

My colleague Michael Alison Chandler recently reported that Maryland is having so much trouble designing its new evaluation system that it has asked for a year’s extension on its $250 million federal Race to the Top grant to finish the job. The D.C. schools are two years into their new system, called IMPACT, but good teachers and principals I know have found it unnecessarily aggravating. Its effect on learning remains unclear.

When I checked around the Washington area in 2009, I found these percentages of teachers rated satisfactory: Alexandria: 99 percent, Calvert County: 99.8 percent, Charles County: 98.4 percent, Fairfax County: 99.1 percent, Falls Church: 99.55 percent, Loudoun County: 99 percent, Montgomery County: 95 percent, Prince George’s County: 95.56 percent and Prince William County: 98.3 percent. (D.C. did not have such data.)

The standard evaluation method, infrequent classroom visits by principals, seemed to make no distinctions. Policymakers decided identifying clear differences would allow schools to persuade weak teachers to improve or leave and to give strong teachers more money and opportunity.

Having read new research on one of the model evaluation systems, however, I am no longer sure those changes are going to do much for student achievement.

A study by Thomas J. Kane, Eric S. Taylor, John H. Tyler and Amy L. Wooten published in the quarterly Education Next says a decade-old system in Cincinnati seems to have found a smarter way to use classroom observations. Like the new D.C. system, Cincinnati’s has both independent evaluators and school administrators watch teachers in action and grade them on a four-point scale. The researchers used student test results (which factor into D.C.’s IMPACT system but NOT the Cincinnati system) to see how teachers with the best and worst classroom assessments affected academic progress.

The differences were less than I expected. The study said a student who began the year at the 50th percentile and got a teacher rated in the top quartile would score on average only three percentile points higher in reading and two points higher in math than a similar student assigned a teacher in the bottom quartile.

Relatively speaking, the researchers say, these are big jumps, but I have seen greater gains in schools that do not have expensive, time-consuming teacher evaluation systems. They rely instead on unusually rigorous recruiting, selection and training of principals. The school leaders who survive the process are given the power to hire and fire staff but lose their own jobs if achievement does not rise significantly. Those schools often have longer school days that allow team-minded teachers to confer with each other and provide creative, consistent ways of teaching and disciplining kids.

The complex new teacher evaluations are hailed for facilitating incentive pay for the best teachers. But such bonuses have a downside: fostering jealousy and resentment that can cripple teamwork. The educators who have influenced me say they like identifying weak teachers so they can be steered to other careers, but they feel a good principal can do that without a ton of evaluation paperwork, particularly if the school is small.

The most promising feature of the Cincinnati system is that its reviews are infrequent. Teachers are reviewed in only their first and fourth years and every five years after that. School districts determined to distinguish bad and good teachers might be better off starting slowly like that and focusing instead on finding and empowering principals who can do the job without so many forms and calculations.