Angela Lee Duckworth, a 2013 MacArthur fellow, helped popularize the idea that “gritty” children are more likely to succeed. (Zave Smith Photography/Courtesy of the John D. & Catherine T. MacArthur Foundation.)

This story has been updated.

A growing number of education policymakers are interested in judging schools and teachers in part on whether they contribute to the growth of children’s character, pointing to research that shows a solid link between success and the strength of traits such as self-control and resilience.

But now two of the researchers who helped establish that link are sounding a note of caution, arguing in an essay published Wednesday that existing measures of children’s character traits aren’t sophisticated enough to be used for decisions that would affect the lives of teachers and students.

“We have a simple scientific recommendation regarding the use of currently available personal quality measures for most forms of accountability: not yet,” the researchers wrote in the journal Educational Researcher.

One of the authors is Angela Lee Duckworth, a University of Pennsylvania researcher who won a 2013 MacArthur “Genius” award after establishing the notion that a student’s success in and after school is correlated with that student’s level of self-control and “grittiness,” or ability to keep working toward long-term goals.

“School districts and state legislators are in some cases so enthusiastic about grit and mind-set that they want to tell people to measure it, to make salaries dependent on it,” Duckworth said in an interview.

Duckworth outlined her view with fellow researcher David Scott Yeager of the University of Texas. Yeager studies how a child’s mind-set — his or her belief in whether abilities are fixed or can be improved — affects outcomes.

Character traits, such as tenacity or emotional intelligence, can be measured accurately enough for research, the professors argue, but not accurately enough to hold schools accountable or to evaluate teachers and principals.

For example, student questionnaires are a common way to measure a student’s “non-cognitive skills,” such as persistence and hard work. But there’s a paradox: The lower a school’s standard for any one quality, the more likely a student is to rate himself or herself highly. As school standards rise, students are more likely to give themselves low ratings.

“It’s basically a statistical law,” said Yeager, pointing out that on international exams, students in countries with low performance tend to rate themselves highly on measures of conscientiousness. “Anyone who becomes an expert at something becomes better at assessing their flaws in that thing.”

Duckworth, a former middle school teacher, is known for helping to popularize the notion that a student’s success is correlated with that student’s level of self-control and “grittiness,” or ability to keep working toward goals.

Her research has shown that grittier students are more likely to graduate from high school, score higher on SAT and ACT exams and be more physically fit. Grittier students also are less likely to get divorced, and they typically experience fewer career changes.

KIPP charter schools have become known for their approach in this area, called social-emotional learning or character education. But it has become more common around the nation as schools seek ways to improve outcomes, especially for struggling students.

The researchers argue that if schools are judged according to the results of student self-ratings, schools that are worst at teaching social-emotional learning are going to appear to be the best and are going to be asked to share their tactics with other schools that are actually doing better.

Teacher questionnaires are another measurement tool that shouldn’t be used for judging teachers or students, the researchers say. Teachers’ definitions of “self-control” at one school might be different from the accepted standards at another.

Some researchers assess a child’s traits via a task such as the “marshmallow test,” which measures how long a young child can withstand the temptation of a marshmallow when alone in a room. Such tasks have promise, the researchers said, but they are still too weak to serve as the foundation of decisions about a teacher’s employment status.

One key problem: Teachers whose jobs are at stake might feel pressure to fudge their evaluations of students. Another problem: When teachers repeat tasks in order to measure growth over time, students catch on — the second time they sit in a room with the marshmallow is a different experience than the first.

The flaws in existing measurements mean they should only be used cautiously, if at all, for diagnosing a problem with a student’s character or for judging whether a particular program is working, the researchers argue.

Some states and districts have begun incorporating measures of character in their systems for judging schools’ success, according to Yeager. Recent federal grant competitions have asked applicants to explain the character measures they will use to gauge whether their approach to change is actually working.

At stake are millions of dollars, Yeager said, “and the measures that people are likely to use are not suitable for evaluating program effectiveness.”