The controversy at Yale began after two students made it easier to access the numerical evaluations of courses. These data were already available through Yale’s official course-selection site. But the students’ tool made it much simpler to compare the ratings for each course — and each professor.
That’s what led administrators to shut down the site last week. The tool would encourage students to select courses based solely on raw scores, officials said, rather than on the more nuanced student comments that are also collected in the evaluation process. And helping students choose courses wasn’t the real goal of that process, anyhow. Yale collects student evaluations “as a way of helping faculty members improve their teaching,” one administrator wrote, “not as a course selection tool.”
That would be fine if student evaluations illustrated teaching effectiveness and the areas where professors needed to get better. But they don’t. In fact, some evidence suggests that professors who receive high evaluations are actually worse teachers than their peers.
In a 2010 study at the Air Force Academy, where a standardized curriculum allows for convenient natural experiments, professors who got high marks from students tended to give out higher grades — and their students did worse in subsequent classes. For the more demanding professors, it was the opposite: They generally got lower student evaluations, but their students did better later on.
In other words, professors with lower expectations got rewarded, even though their students didn’t learn as much. The professors who assigned more work taught their students more, but they were punished for doing so.
So should anyone be surprised that students’ grades keep rising? According to a 2011 study, about 43 percent of college letter grades in 2011 were A’s, up from 31 percent in 1988 and 15 percent in 1960. Over roughly the same span, the average amount of studying by people enrolled in college declined almost 50 percent, from 25 hours per week to 13 hours, a 2011 study found.
To be fair, student evaluations aren’t the only reason for grade inflation and decreased academic workloads. But they’re certainly part of the problem. Faculty know that the best route to a positive evaluation is giving an easy A. And students know that they’re more likely to succeed in the highly rated classes.
If we teachers really cared about what students were learning, we’d devise systems to measure that, and our own instruction, more carefully. Some institutions have started to do this. In 2010, 71 college and university presidents signed an agreement pledging to expand their student assessments. At some schools, students submit examples from their coursework for evaluation; at others, they take tests designed to gauge how much they have learned.
But lots of schools are resisting. At New York University, where I work, there is still no sustained or comprehensive effort to determine what students learn at our home campus or a dozen overseas sites. I’ve been fortunate to teach at several of them, and I like to think that I’ve done it well. But I really don’t know, and neither does anybody else.
Back at Yale, the students who devised the new Web site have become a cause célèbre in the blogosphere. Ditto for a third Yale student, who wrote a program that posted the university’s numerical course evaluations in a way that the school couldn’t block.
The administration eventually backed down, admitting that it “erred in trying to compel students to [use] the complete course evaluations” instead of just the numbers. “In the end,” one official wrote, “students can and will decide for themselves how much effort to invest in selecting their courses.”
That’s true. But in the end, the administration and faculty — not the students — will decide how much effort to invest in discovering how well we teach and how much they learn. Yale officials were wrong to shut down the students’ Web site, but they were right to question student evaluations as a measure of course quality. The big question is whether we can come up with a better measure and whether we care enough to do so.