Exit polls get a bad rap. I don't have a dog in this fight, but it's true. Exit polls provide remarkably quick election data, with a margin of error that shrinks as election night passes, telling us who came out to vote, for which candidate and why. They're essential, but like so many other polls, we only notice them when the data is wrong.
When polls closed in New York on Tuesday, exit poll estimates showed Bernie Sanders trailing Hillary Clinton by only four points. That estimate was eventually adjusted, but it was the latest example of a poll showing a number that ended up far from the final result.
To figure out why, I reached out to Joe Lenski, executive vice president of Edison Media Research. Since 2003, Edison has been conducting polling for the National Election Pool (NEP), a group of six media organizations that includes Fox, CNN, ABC, CBS, NBC and the Associated Press. Once upon a time, each of those organizations would have run their own exit polling; now, it's centralized through Edison.
To explain what happened in New York, it's best to start at the beginning: how exit polls work, from start to finish. The interview below has been condensed and lightly edited, but you knew that, because you've seen these things before.
THE FIX: Let's start by walking through the process.
LENSKI: The six news organizations control the editorial and financial decisions, in terms of how to allocate resources and what questions to ask. So anything about sample size, which races to cover, what questions are asked on the questionnaire, those are determined by the six news organizations themselves. They have committees that write the questionnaires. They have committees that decide which primaries or caucuses are covered, how many precincts are covered.
Once they've told us, we want to cover Iowa with 40 entrance poll precincts, or we want to cover the New Hampshire primary with 45 exit poll precincts, we will then pick sample precincts. We will do the research about those precincts: where the polling places are, what time they open, what the rules are in the state about where we're allowed to stand while polling. We'll do the hiring and training of local interviewers. We'll do training and rehearsals in the weeks before the primaries. We have a computer system set up to take in the data and to process the data in real time. That operational framework is what Edison is in charge of.
FIX: So for Iowa, for example -- how many folks did you have on the ground doing the surveys in Iowa?
LENSKI: Primary [elections] are tricky because of the distinction between primaries and caucuses.
For primaries, it's the same procedure that we use for a general election. In New Hampshire, we had 45 precincts. You have one interviewer at each of those precincts during the day, from when the polls open to shortly before the polls close. They will be assigned an interviewing rate; we don't interview every person. It depends on the size of the polling place. We tell them to talk to every third voter or every fifth voter or whatever. Our goal is to conduct between 100 and 150 interviews at every polling location on election day.
Because news organizations want to see results before the polls close, we have our interviewers stop three times during the day, take the questionnaires and call in the results to our phone rooms, where they're processed. They'll tally the results in terms of how many votes were cast for each candidate, but then they'll also read the full questionnaire, which could be between 15 and 20 questions: basic demographics, when people decided to vote, their views on certain candidates. Those will be read in three times during the day, usually once in the late morning, once in the mid-afternoon and once shortly before poll closing.
FIX: This is a paper process?
LENSKI: For the NEP, these are all done on paper.
Voters are approached as they exit the polling place and asked if they'll fill out a questionnaire. Typically we'll get a 40 to 50 percent response rate, which, when you compare to telephone surveys -- single-digit response rates -- or online surveys, means we're actually very happy with a 40 percent response rate. For every voter who declines to fill out a questionnaire, our interviewer will visually record their age, race and gender. So we do know the response rate by [demographic]. We can and do adjust the results by non-response rates. Typically, younger voters are more likely to fill out an exit poll than older voters.
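The non-response adjustment Lenski describes can be sketched in a few lines of Python. This is a simplified illustration with made-up numbers, not Edison's actual weighting procedure: because interviewers tally the age, race and gender of refusals, the true demographic mix of voters is known, and each respondent can be weighted by the inverse of their group's response rate.

```python
# Sketch of a non-response adjustment (illustrative numbers only,
# not Edison's actual procedure). Interviewers record the age group
# of every voter they approach, including refusals, so the response
# rate for each group is known.

# Voters approached at one precinct, by age group: (responded, refused)
approached = {
    "18-29": (60, 40),   # younger voters respond more often
    "30-59": (50, 50),
    "60+":   (30, 70),   # older voters respond less often
}

# Weight each respondent by the inverse of their group's response rate,
# so under-responding groups count for more in the tabulations.
weights = {}
for group, (responded, refused) in approached.items():
    response_rate = responded / (responded + refused)
    weights[group] = 1.0 / response_rate

# A respondent in the 60+ group now carries more weight than one 18-29.
for group, w in sorted(weights.items()):
    print(f"{group}: weight {w:.2f}")
```

With these numbers, an older respondent counts roughly twice as much as a younger one, compensating for the gap in willingness to participate.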
The questionnaire itself is filled out by the voter. It's private, it's confidential, it's anonymous. They fill out the 15 to 20 questions, they fold it up and put it in a little ballot box with the logos of the six news organizations. There are no identifying marks on the questionnaire, so it's a completely anonymous and self-administered process.
A question like party identification -- do you consider yourself a Democrat, Republican or independent -- that is a self-identified party ID. That gets confusing because in a case like New York, it was a closed primary but 18 or 19 percent of the voters identified themselves as independent. How can that be? We didn't ask how you're registered; we asked how you identify yourself.
When those results are called in, the tallies are processed, the refusals are processed, the questionnaires are processed in real time and we're reporting that data to all those organizations in real time.
FIX: How are caucuses different?
LENSKI: The only caucuses historically that we've covered are the Iowa and Nevada caucuses. In a caucus you can only get people as they're lining up to got into the caucuses, because once the caucus is over, the results are known.
So what we do in those cases is we conduct an entrance poll. It's the same type of technique. We make it a shorter questionnaire; an entrance poll is usually just one side of one page so they can fill it out faster as they're going in. In those cases, we have two interviewers assigned. They'll be interviewing, and then every so often one interviewer will take the results and call them in, while the other is still interviewing people.
The other difference comes when the two parties' contests are held on different days. You have a different type of sample. The samples we're picking in South Carolina or Nevada [where the parties voted on different days] for the Republican side or the Democratic side are picked based on past voting history by party.
FIX: So you've got the data in hand. How do you proceed?
LENSKI: As we get data in before 5 o'clock, we're processing the data. We're adjusting for the response rate by demographics. We're also adjusting based on anticipated size by region in the state. That's mostly by past voter history, but we're also using our sample precincts. Every time our interviewer makes a call, before they call, they go up to the election official and ask how many people have voted so far in the precinct. We use that data to make an estimate on the size of the expected vote at each location.
Up until 5 o'clock, the data is only seen by people who are in a quarantine room. Up to three representatives of each of the news organizations are in a room. They give up their cellphones, they give up Internet access, they give up basically the ability to communicate with the outside world, and all they get to see is the exit poll data. After 5 o'clock, the quarantine is released and those representatives go back to their organizations and brief the other reporters and analysts on what the exit polls show.
To that point, they're usually only seeing the first two-thirds of the data -- the morning interviews and the afternoon interviews. We know there are different voting patterns by time of day. What they're looking for at that point is a general idea of what the story is going to be that evening, what to prepare for and some of the demographic or issue trends. You'll see reported between 5 o'clock and poll closing what issues were most important that day, or which candidate qualities mattered -- the types of questions that wouldn't necessarily characterize the outcome of the race.
FIX: And there is a prohibition against saying that, 'exit polls suggest that so-and-so will win' before the polls close, correct?
LENSKI: Right. The news organizations have made a pledge to Congress that they will not use the exit poll data to characterize the outcome of a race until the polls in that state are scheduled to close.
That brings us to the third wave at poll closing time. In addition to the exit poll data, we also have sample precinct data. The samples are picked similarly. In a state like New York, we had 35 exit poll precincts and 50 sample precincts in the state. The sample precincts include the exit poll precincts. We have a reporter in each polling location in the sample, and as soon as the results are posted at that polling place, they will call those results in.
Shortly after poll closing, we can quickly compare what the exit poll of that precinct said the votes were and what the actual votes were. So that's when you'll see a fairly quick adjustment to the exit poll estimates after poll closing.
Like in New York, we were showing a four-point margin in the exit poll at 9 o'clock, but by 9:45 we were showing a 12-point margin. That's because we can quickly compare precinct-by-precinct what the exit poll results were and what the full results for that precinct were. So if we're seeing precinct-by-precinct that Hillary Clinton was doing four points better in the actual results than she did in the exit poll in that precinct, we will adjust the results [of the exit poll] accordingly.
There are two important uses of the exit poll. One is to project a winner. But the main use of the exit poll that night and historically is to have the most accurate representation of the demographics of voters. How each demographic voted, what the issues were, when people decided how to vote. To make those demographic results as accurate as possible, we want to match to the actual results by precinct, by region of the state, etc.
When you see those adjustments made shortly after poll closing, that's because we've gotten a whole lot more real information to tell us what the turnout is going to be by region, and what the overstatement or understatement for each candidate was in each precinct, based on the actual results. We're making those adjustments as rapidly as we can.
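The post-close correction Lenski describes can be illustrated with a toy example (made-up numbers; the real procedure also reweights by region and turnout, not just a simple average): compare the exit poll's candidate shares with the actual count in each reporting sample precinct, then shift the statewide estimate by the observed error.

```python
# Toy sketch of adjusting an exit poll estimate to actual precinct
# returns (made-up numbers, simplified relative to Edison's procedure).

# Per reporting precinct: (exit poll Clinton share, actual Clinton share)
reporting_precincts = [
    (0.52, 0.56),
    (0.48, 0.53),
    (0.55, 0.58),
]

# Average overstatement/understatement of Clinton in the exit poll
errors = [actual - poll for poll, actual in reporting_precincts]
avg_error = sum(errors) / len(errors)

# Shift the statewide exit poll estimate by that average error
statewide_exit_poll = 0.52          # exit poll had Clinton at 52%
adjusted = statewide_exit_poll + avg_error
print(f"adjusted Clinton share: {adjusted:.3f}")
```

In this sketch, the exit poll understated Clinton by about four points on average across reporting precincts, so the statewide estimate moves up accordingly -- the same direction of movement seen in New York between 9 o'clock and 9:45.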
FIX: So the New York issue, I can't help but notice that you said that younger people are more likely to fill out the surveys and then also that Sanders was over-represented in the initial estimate. Do you think there's a link there?
LENSKI: Oh, yes, there's definitely a link there.
We're adjusting for that throughout the day. As I said, we know the response rate in our 35 precincts. We know that younger voters are more likely to choose to fill out the questionnaire than older voters, and that's typically the case so we're already making those adjustments.
Obviously in this case, that was even more than normal. As soon as we started getting sample precinct returns, we made that adjustment even more so that we'd match the actual results.
There are other issues here. We're trying to make our best estimates on turnout by region of the state. In New York City, our exit poll reporting of how many people had voted in our precincts showed New York City making up 45 percent of the vote in the Democratic primary. It ended up making up 51 or 52 percent. So that's another adjustment we're making after poll closing, and doing the best we can with incomplete data before the polls close.
In Michigan, we actually had exit polling all day showing Bernie Sanders up by two points though every pre-election poll had Hillary Clinton up by 10 points or more -- so we're sitting out there on a limb the other way. In that case, the exit poll was right and the pre-election polls were wrong. It happens both ways.
While everyone is talking about the Democratic side, we went out at 9 o'clock saying that Donald Trump was going to get 58 percent of the vote. He got just about 60 percent of the vote. Everything we did on the Republican side hit the mark. I understand when the data moves as much as the Democratic data moved between 9 o'clock and 9:45, that causes a lot of consternation out there. But there are plenty of other states where we've been right on.
It's a survey like any other survey. There are sampling issues, there are non-response issues, etc. We're making the adjustments we can to make this data as accurate as we can with the incomplete information that we have.
And as time goes on, we have more and more complete information.