I sat down recently with Jason Kamras, chief of human capital for DCPS and the principal architect of the IMPACT teacher evaluation system that is now in its second year. We discussed possible changes in IMPACT and the use of “value-added” methodology--holding teachers accountable for students reaching or exceeding predicted growth on standardized test scores--to measure effectiveness. This is the first of two parts on our conversation, edited for length and clarity.
BT: The number of teachers who actually had value-added as part of their evaluation last year was 476. It’s small.
JK: It is small. You know the grades we test [on the DC CAS, 3 through 8 and 10]. You have to knock off third grade because we don’t test second grade, so you have no benchmark. You have to knock out 10th grade because we don’t test in ninth grade, although that’s changing. You always have some [teachers] drop out because they didn’t have enough kids.
BT: How do you expect that group to grow this year?
JK: It won’t be this coming year, not significantly. We’re piloting the ninth grade assessment. We’re looking at expanding some testing in the lower elementary grades and expanding end-of-course exams in the middle and high school grades. We don’t have everything pinned down yet, but this is part of the conversation, to figure out the three-to-five-year plan to fold in all those pieces. We’d like to get north of 75 percent coverage. It’s going to take us five years to get there.
BT: What’s your take on the fact that only 60 percent of the teachers rated “highly-effective” under IMPACT accepted bonuses for which they were eligible? [Teachers who accept IMPACT performance bonuses give up certain job protections.]
JK: It was about two-thirds. We never expected everybody to take the bonus. The guidelines around the bonus were set out in the contract. It was very clear. It was a contract that was ratified overwhelmingly. Teachers are professionals and professionals get to make choices. I’m delighted that two-thirds took it. The one-third, they made a calculus for themselves and I totally respect that. I think probably some folks didn’t actually think we were going to pay the money. We don’t have a great track record on that. And we did. And so I think more folks this year may think twice about turning it down. The base increases will begin to kick in this year as well, and I think that’s going to be significant for people and they’ll recognize that’s something they’ll want to be a part of. But if people want to keep the extra security, then so be it.
BT: What other student-generated work besides test scores could be used to evaluate teacher effectiveness? [Under IMPACT, 10 percent of the evaluation for teachers not eligible for value-added is based on other forms of student work.] Portfolios of student work are one thing people mention.
JK: We looked at portfolios and lots of other things. And then you’ve got to push a little bit. How do you do portfolios? Everybody has a different idea of what a portfolio is, number one. Actually it’s really hard to demonstrate growth clearly and quantitatively. Who is assessing the portfolios and under what standards? When we looked at it in depth, what we came to was that the operational burden to do this well was simply probably beyond the capacity of the school system at this point.
BT: Operational burden?
JK: If you’re going to really dig into looking at all these pieces of student work, and let’s say we’re going to ask principals to now do this, when exactly is all that going to happen? Then how is it going to be standardized, so that the way they’re looking at your portfolio isn’t different from the way they’re looking at another teacher’s portfolio. So this isn’t to say that this can’t ultimately be worked out. But for where we are right now we felt we wanted to go with the thing we felt strongest about [such as value-added].
BT: Here’s the problem. How do you explain value-added to a lay audience, or even beyond a lay audience, when you’re dealing with Ph.D.-level statistics and math?
JK: Basically think of it this way. A teacher has a set of students. We can, through this formula, figure out what the typical ending score is for kids like this. By kids like this I mean kids who had a similar performance history and some of the similar demographic characteristics, like free-and-reduced-price lunch, special education, ELL and so forth. And through this regression formula [a statistical tool for studying relationships between multiple variables] we can figure out that on average, kids like this tend to end up here. Then we can calculate how did your kids actually end up, and then we compare the two. And that’s it. That is essentially what it comes down to. Now that first piece is a bit of a black box, surely. And there’s a lot of math in there. But the math is in there to make it as fair as possible, so that we’re taking into consideration where the kids started, what they’ve done previously and the other things we know are outside the teacher’s control. By doing all that we’re isolating the impact of the teacher.
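[For readers who want to see the idea in concrete form, here is a minimal sketch of the comparison Kamras describes: predict a typical ending score for "kids like this," then compare it with how each teacher's students actually scored. It is an illustration only, not the DCPS formula. The data are synthetic, and the function names, the single lunch-status indicator and the simple linear regression are all assumptions made for this example.]

```python
import numpy as np

def fit_expected_scores(prior, flags, actual):
    """Predict each student's ending score from prior score and
    demographic indicators using ordinary least squares.
    (Hypothetical simplification; the real model is not public here.)"""
    X = np.column_stack([np.ones(len(prior)), prior, flags])
    coef, *_ = np.linalg.lstsq(X, actual, rcond=None)
    return X @ coef

def value_added(actual, predicted, teacher_ids):
    """Average, per teacher, of (actual score - predicted score)."""
    residual = actual - predicted
    return {str(t): float(residual[teacher_ids == t].mean())
            for t in np.unique(teacher_ids)}

# Synthetic data: 200 students split evenly between two made-up teachers.
rng = np.random.default_rng(0)
n = 200
prior = rng.normal(50, 10, n)            # last year's test score
lunch = rng.integers(0, 2, n)            # 1 = free or reduced-price lunch
teacher = np.repeat(np.array(["A", "B"]), n // 2)
true_effect = np.where(teacher == "A", 4.0, -4.0)   # assumed for the demo
actual = 5 + 0.9 * prior - 3 * lunch + true_effect + rng.normal(0, 2, n)

predicted = fit_expected_scores(prior, lunch, actual)
scores = value_added(actual, predicted, teacher)
print(scores)
```

[In this toy setup, teacher A's students finish above the scores predicted for similar students and teacher B's finish below, so A receives a positive value-added figure and B a negative one. A production model would add many more controls and account for measurement error.]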
BT: And you’re comfortable with this, even with what some experts say is the uncertainty and potential for error?
JK: There is uncertainty of course. But you’ve seen the Brookings Report on value-added. A lot of value-added models actually do a better job of predicting future performance than, for example, the SATs do of predicting future performance in college. And yet we use those to make decisions about who gets into what schools. And there’s evidence that they also do a better job of actually predicting performance than principals’ evaluations themselves. And so what I would try to encourage people to think about is to look in the context of all the measures that are out there. It is imperfect as all measures are, but it’s been shown to be pretty highly predictive. On top of which, it’s not the only thing we look at. Yes, it is 50 percent, but it’s just 50 percent and nobody is going to get fired because they have low value-added alone.
BT: That’s not possible?
JK: No. You have to have low observations [low scores from principals and master educators who sit in on classes] and low commitment to school community [another key IMPACT criterion]. So that’s just not going to happen. There’s actually, as you’ve seen from the documents, a decent correlation between our observations and the value-added. I’ve been to these classrooms [with high value-added teachers]. These are good teachers. These are places where you want to put your children. So if it was totally crazy, if you walked into the high value-added classrooms and you saw terrible teaching, it would give you more pause. But to me there’s something certainly solid there.
BT: As I recall it was not a huge correlation, but a modest correlation.
JK: Yes, a modest correlation, but as these things go, relative to similar studies, it’s right in the ballpark of what one would expect. It’s also right in the ballpark of similar things in the social sciences.
BT: So you don’t envision a day where that 50 percent might roll back to 30 percent because you’re using some other measure?
JK: I can’t predict the future. We’re always open to thinking about what the right balance is. We are looking at student measures. The Gates study has done some interesting work on that. They found a strong correlation or at least a decent correlation between student responses and value-added.
BT: Student responses?
JK: Student perceptions of the teacher. It’s not, ‘Do you like your teacher?’ It’s things like, ‘Do you feel your teacher pushes you? Do you feel safe to make mistakes in his classroom? If you get something wrong does he help you explain?’ That sort of thing.
BT: So you could see that being a part of IMPACT?
JK: I think eventually. There’s a lot of work that would need to be figured out. You can’t do it with all students. You can’t do it with the little ones. But it’s something certainly to explore.