Fourth in an occasional series

The Democratic Party already knows a few things about the Victorian-style duplex on West Hubbard Street. It wants to know a lot more.

And so Mark Rutkus and Patrick Harris, armed with sturdy clipboards and cheerful smiles, are going on a data hunt. "Hi, we're here from the Franklin County Democratic Party," Rutkus begins as Linda Houston draws back her front door, tentatively at first. "Can we ask you a few questions?"

A longtime Democrat, Houston greets them enthusiastically. For the next few minutes, the local party operatives ask and she answers: Are you registered to vote? Do you reside at this address? Can you confirm the names of the other people living here? What do you know about the people living next door?

It may seem like basic shoe-leather canvassing, unchanged from the days when precinct captains kept their political machines oiled with up-to-date information gleaned from doorstep and barstool encounters. But in this year's election, there is a hidden high-tech twist. Rutkus and Harris are out to "map" the political demography of this neighborhood, trolling in the service of a quasi-science called "database targeting."

Houston's answers will bounce from Rutkus's clipboard to a computer in the state Democratic Party's offices here, and then 400 miles away to computers housed in the Democratic National Committee's headquarters in Washington.

Like rivulets flowing to rivers and rivers to the sea, this information will join an enormous data torrent streaming toward Washington from all around the country. Houston's "profile" is just one of 166 million -- or one for every registered voter -- that the DNC is constantly updating in a huge digital cache known as DataMart. The Republican National Committee tends a similar information trove, dubbed Voter Vault.

The fight for Ohio, and maybe the election itself, could come down to a battle of these databases.

'Very Powerful Tool'

The 2004 election will be the first presidential election in which both national parties use their database and number-crunching skills to shape their organizing and get-out-the-vote strategies.

Marketers have used databases to target customers for years -- they know enough about your credit history to offer you that low-interest credit card -- but the political world is just becoming acquainted. For several years, largely out of public view, the two major parties have been assembling their infobanks, each with the same daunting goal. By tracking the electorate, and employing ever more sophisticated statistical models through the field called "data mining," the parties and their candidates hope to zero in on who will vote, how they might vote, and how to persuade them to vote for Republicans or Democrats.

"You could ask me about any city block in America, and I could tell you how many on that block are likely to be health care voters, or who's most concerned about education or job creation," said DNC Chairman Terence R. McAuliffe. "And I could press a button and six seconds later you'd have a name, an address and a phone number for each of them. We can then begin a conversation with these people that is much more sophisticated and personal than we ever could before."

It is not quite that simple. Models and databases offer better-educated guesses, not certainty, about what a voter thinks and how he or she is likely to behave. But with enough computing power, enough personal details and the right search features, political database pros say they are improving the efficiency of an array of campaign decisions, including fundraising, advertising and get-out-the-vote operations.

Using little more than an off-the-shelf program and the desktop computer in his Washington office, Democratic consultant Hal Malchow shows how to predict turnout and target pools of supporters. Using the 2002 gubernatorial race in Arizona as his example, Malchow is able to match the poll responses of 5,778 likely voters against their database profiles. The program then slices and dices the data to uncover the characteristics -- in this case, middle-aged Hispanic men living in two metropolitan areas -- that defined the biggest groups of people likely to support Malchow's client but still uncertain about voting. A quick search of a voter database would return the names of those who fit this profile, making them the likely recipients of phone calls or a knock on the door by a candidate's field staff.
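The matching step Malchow describes can be sketched in a few lines. This is a minimal illustration, not his program: it assumes poll responses and voter-file profiles share a voter ID, and every field name, ID and record below is invented. It counts supporters who are unsure about voting within each demographic segment, then pulls everyone in the top segment from the voter file.

```python
from collections import Counter

# Hypothetical segment definition; real voter files carry different fields.
SEGMENT_KEYS = ("age_band", "ethnicity", "gender", "metro")

# Invented voter-file profiles keyed by a made-up voter ID.
profiles = {
    1: {"age_band": "40-59", "ethnicity": "Hispanic", "gender": "M", "metro": "Phoenix"},
    2: {"age_band": "40-59", "ethnicity": "Hispanic", "gender": "M", "metro": "Phoenix"},
    3: {"age_band": "18-39", "ethnicity": "White", "gender": "F", "metro": "Tucson"},
    4: {"age_band": "60+", "ethnicity": "White", "gender": "M", "metro": "Phoenix"},
    5: {"age_band": "40-59", "ethnicity": "Hispanic", "gender": "M", "metro": "Phoenix"},
}

# Invented poll responses; voter 99 has no profile and is skipped.
poll = [
    {"voter_id": 1, "supports_client": True, "certain_to_vote": False},
    {"voter_id": 2, "supports_client": True, "certain_to_vote": False},
    {"voter_id": 3, "supports_client": True, "certain_to_vote": True},
    {"voter_id": 4, "supports_client": False, "certain_to_vote": False},
    {"voter_id": 99, "supports_client": True, "certain_to_vote": False},
]


def segment_of(profile, keys=SEGMENT_KEYS):
    """Reduce a profile to the tuple of traits that defines its segment."""
    return tuple(profile[key] for key in keys)


def rank_target_segments(profiles, poll, keys=SEGMENT_KEYS):
    """Count supporters who are unsure about voting, per segment, largest first."""
    counts = Counter()
    for response in poll:
        profile = profiles.get(response["voter_id"])
        if profile is None:
            continue  # poll respondent could not be matched to the voter file
        if response["supports_client"] and not response["certain_to_vote"]:
            counts[segment_of(profile, keys)] += 1
    return counts.most_common()


def voters_in_segment(profiles, segment, keys=SEGMENT_KEYS):
    """Return the IDs of every voter on file who fits the segment, polled or not."""
    return sorted(vid for vid, p in profiles.items() if segment_of(p, keys) == segment)


ranked = rank_target_segments(profiles, poll)
top_segment = ranked[0][0]
contact_list = voters_in_segment(profiles, top_segment)
```

In this toy data the contact list includes voter 5, who was never polled but fits the profile. That is the point of the exercise: a sample of a few thousand respondents is used to pick out similar people across the whole file.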

"This doesn't improve [a candidate's] message one bit," said Malchow, a direct-mail expert who has been a pioneer in such targeting techniques. "It doesn't change the way a candidate looks or his personality or where he started in the polls. . . . But it can be a very, very powerful tool. In the end, it's about having knowledge that allows you to use your resources in the smartest and most efficient way."

This fall, thousands of people such as Linda Houston -- who live in a county that Democrat Al Gore won by 4,156 votes in the 2000 presidential election -- will receive "customized" appeals from the parties, courtesy of the databases.

Those whom computer models have identified as, say, education voters may get a knock on their door from a teacher, who will talk up Sen. John F. Kerry's ideas about education. Senior citizens concerned about Medicare or Social Security might get a phone call on that topic from President Bush's volunteers, and perhaps campaign literature highlighting the president's views. Groups such as the Democratic-leaning Emily's List and the conservative National Rifle Association have their own database-driven efforts, piggybacking on the two major parties' electronic files in an effort to find and motivate voters.

Because of programs that sift and cross-reference reams of data, database jockeys are starting to discover some surprising behavioral nuggets in their info-mountains. By analyzing lists of Democratic donors and consumer data last year, Malchow found that people who live in households without a call-waiting feature on their phones are more likely than average to respond to a political fundraising pitch.

Malchow is not sure why, but he theorizes that people who do not have call waiting are older and have a more leisurely lifestyle -- the kind of people who might take a few moments to think about and contribute to a political cause.

Room for Error

The DNC's database team has used modeling programs to project the top issues for groups of voters based on common personal characteristics. For example, the DNC estimates that health care is the top priority of 940,000 people in Ohio. It has also projected where these people live among the state's 88 counties, providing a valuable road map for campaign advertising.

Laura Quinn, the DNC's technology guru, cautions that the identity of each "health care voter" is based on statistical probability -- that is, it is a likely identification, not an absolute one (for competitive reasons, the DNC will not detail the cluster of attributes that marks someone as a health care voter).

Predicting voters' thinking this way is hardly foolproof. As a rule, more information about a person helps improve the accuracy of assessing which beliefs they hold. "You could take all the obvious things about a person and still not screen out the important variable," said Doug Kelly, one of the architects of DataMart and the DNC's donor database, known as "Demzilla."

"I might live across the street from a guy. We're the same age, we have the same approximate house value, same family size, same education, maybe even the same minivan. But he's a Republican and I'm not," Kelly said.

Even so, Quinn said, this kind of "statistical oddsmaking" is more reliable than the broad assumptions made about voters before. "We're no longer just speaking about 'women voters' or 'minority voters,' " she said. "The closer we can get to the real circumstances of people's lives, the better. What is more telling about a person is not how they feel about President Bush, but how they live and what they say about themselves."

In other words, Houston's address is not just a household on a precinct map; in the database, Houston may show up as many things: a Caucasian, a woman, a mother, a grandmother, a homeowner, a Democrat, a resident of a mixed urban neighborhood of renters and homeowners that turns out for liberal candidates by a 2 to 1 margin.

A Complicated Task

Consumer marketers have been profiting from this kind of information for many years. Supermarket chains review data on purchase patterns, collected through "shopping clubs," for clues about how to "micro-target" new products to shoppers. Internet companies such as Amazon.com keep records on what customers bought to offer them deals on products they have shown interest in before.

But predicting which book or brand of breakfast cereal someone might buy is easier than figuring out how millions of people will vote -- or even whether they will vote -- several months before Election Day.

The task is complicated by the fact that accurate data on many voters are not readily available. Information in state voter files -- the foundation of the national databases -- varies by state. And every state protects the most basic and compelling political fact -- whom someone voted for. The only way to find out is to ask people directly, and that is an expensive and time-consuming job.

The parties are not even sure who is a Democrat or a Republican. Since only about a third of Ohio's voters are registered with a party, the vast majority of the state's electorate has no clearly marked partisan "trail."

So they ask. This is where field workers such as Rutkus and Harris come in. By marching door to door, they are able to assess whether a person marked as an "I" for independent on the voter rolls might be amenable to overtures from the Democratic Party (the two operatives ignore people listed on voter registration forms as Republicans).

Even without finding anyone home, the door-knocking exercise can offer some telling factoids. When Rutkus spots a placard for a labor union in the window of one townhouse, he notes this on his clipboard.

The two men also have another key assignment: verifying information. Individual data are notoriously volatile, as people move, age, marry, divorce or have children, and so much effort is geared toward weeding out old information -- "deadwood." During the 2000 election, before the creation of DataMart, the Florida Democratic Party's files on voters were in such a primitive state that Democrats could not contact 1.48 million registered supporters on behalf of Gore and statewide candidates.

"Almost a million and a half people never got a letter or a phone call from us because we had the wrong address or phone number," said McAuliffe, who upon becoming DNC chairman in early 2001 ordered changes. "And these were Democratic people! It was reprehensible."

Exactly what the parties have in their databases on Ohio's 4.7 million voters is closely guarded (DNC officials spoke in general about their files; the RNC declined to comment). But for starters, according to several sources, each file duplicates what is already available through state voter-registration rolls: name, gender, date of birth, address, county, state and federal congressional district, date of registration, party of registration (if any), and number of elections voted in.

This information has been supplemented by block-level census data and lists sold by commercial brokers that give the parties a general fix on marital status, ethnicity, educational level, the number of people living in each house, estimated home value, the length of residency, and whether a person rents or owns the residence.

Through their record-keeping, plus list swaps with other organizations, the parties know who has made political contributions or charitable donations. Lists of club and organization memberships, plus self-identifying groups organized by the campaign, fill out the picture. The Bush campaign has about 30 "affinity" groups it communicates with periodically, including African Americans for Bush, stock-car racing fans and snowmobile enthusiasts.

What the parties do not keep, officials in both parties said, is information on consumer behavior, such as credit reports, automobile ownership or magazine subscriptions. "There's a lot of information that's useless in a political context," one operative said. "We don't really care who bought shoes at the Gap."

Republicans and Democrats alike agree that the RNC, which began assembling its database several years before the DNC did, has been more effective in using its information. The party's 72-Hour Task Force -- a voter-registration and get-out-the-vote program -- relied on block-by-block data during the 2002 congressional elections and in successful gubernatorial races in 2003 in Kentucky and Mississippi. Spooked by those efforts, and formerly dependent on state party lists alone, the DNC has hustled to catch up and is "almost at parity now," said Michael Cornfield of George Washington University, an expert in online politics.

Larger Role May Be on the Horizon

As candidates get better at tailoring multiple messages for disparate groups of people, some worry that a kind of Tower of Babel effect could take over: Voters who are members of one targeted group will not know what is being said to another, and vice versa.

"It doesn't bode particularly well for democracy if everyone isn't hearing the same message," said Beth Givens, director of the Privacy Rights Clearinghouse, a nonprofit consumer advocacy organization based in San Diego. For example, she said, it would be deceptive if a candidate sent a highly inflammatory message to people identified as strongly anti-immigrant while appealing to the mainstream with more moderate rhetoric.

Another potential shortcoming: As databases enable candidates to refine their get-out-the-vote programs to people identified as their most likely supporters, indifferent or less committed voters may be bypassed. Why bother trying to persuade someone whom a computer has tagged as a lousy prospect?

But Givens said this could work the other way, too. "Some of these strategies could bring more people to the polls if [a candidate] reaches out to people who weren't being addressed before, with messages they hadn't heard before."

As the field matures, most involved in it believe database-marketing techniques will assume an ever-larger role in campaigns. "These databases are going to be at the heart of what political parties are in the 21st century," Cornfield said. "They'll be able to say to their candidates and their [allied organizations], 'We'll give you the data you need; we have the most up-to-date stuff.' "

For decades, he said, the national parties' most important political commodities were manpower and money. Increasingly, he said, "it's information."

Patrick Harris, left, Mark Rutkus, and Debbie and Tim Ambro prepare to canvass in Columbus, Ohio.

Mark Rutkus of the Franklin County Democratic Party meets Linda Houston and her granddaughter at Houston's house.