data

Building a Voter File: Improving the Data

Of course, once you have a file for a given state, the work is not over. Even keeping the file at its current state, let alone improving it, will be an ongoing process.

First of all, from the day it goes up, the data gets staler and staler. Ideally, every state in the union will undergo the entire process frequently--especially states with hot races in election years.

Aside from regular rejuvenation, however, a file can be improved upon through contact with reality. No matter how diligent the Secretary of State, certain problems on a file will slip through the cracks. People will move, die, or get convicted; phone numbers will change, or go bad; party registrations will be altered. All of these changes can be captured by a well-tuned field organization, and appended to the file, so that as election day gets closer the file can asymptotically approach perfection (this is a somewhat idealized picture; bear with me).

When volunteers go out and canvass neighborhoods or phone bank, they can verify whether an address is attached to the right name and whether a phone number is good; they can also gather information that is simply unavailable from other sources, like a person's top issue priorities. All of this information is gathered, centralized, and scanned in, so that the state voter file is as up-to-the-minute as possible. In the past, this was done by hand (when it was done at all); now, the use of new technologies like computers, palm pilots and bar coding of responses has greatly increased the efficiency of doorknocking and other field techniques.
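To make the append step concrete, here is a minimal sketch of folding canvass results back into a file. Everything in it is an assumption for illustration: the field names (`phone_bad`, `moved`, `top_issue`), the voter IDs, and the dictionary layout are hypothetical, and a real voter file system will have its own schema.

```python
from datetime import date

def apply_canvass_results(voter_file, results, contact_date):
    """Fold field observations into the file, keyed by voter ID.

    Returns the IDs that appeared in the results but not on the file,
    so someone can follow up on them by hand.
    """
    unmatched = []
    for result in results:
        record = voter_file.get(result["voter_id"])
        if record is None:
            unmatched.append(result["voter_id"])
            continue
        if result.get("phone_bad"):
            record["phone"] = None              # number confirmed dead
        if result.get("moved"):
            record["address_current"] = False   # flag for re-verification
        if "top_issue" in result:
            record["top_issue"] = result["top_issue"]
        record["last_contact"] = contact_date
    return unmatched

# Hypothetical two-voter file and one evening's worth of results.
voter_file = {
    "A1": {"name": "Pat Doe", "phone": "555-0100", "address_current": True},
    "A2": {"name": "Sam Roe", "phone": "555-0101", "address_current": True},
}
results = [
    {"voter_id": "A1", "phone_bad": True, "top_issue": "health care"},
    {"voter_id": "A2", "moved": True},
    {"voter_id": "Z9", "top_issue": "taxes"},   # not on the file
]
unmatched = apply_canvass_results(voter_file, results, date(2008, 10, 15))
```

Returning the unmatched IDs rather than silently dropping them is a design choice: a result that matches nobody is often a sign of a data-entry error or a new voter worth adding.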

Building a Voter File Part 3: Using the Data (An Overview)

Once you've gone through this process, you should have a list with millions of entries, each containing personal and consumer information--ideally for every registered voter, and all non-registered adults.  So what can you do with it? Plenty.

Once it's compiled, the data has to be accessed.  Various people can be granted different levels of access--making the whole file available to any volunteer would raise serious privacy concerns, not to mention possibly giving access to rival campaigns or, god forbid, the other party.  For low-level volunteers, this access can be extremely limited, while higher-level operatives can be granted more generous permissions.  Broader access can be granted through a web interface like the DNC's Votebuilder, RNC's Voter Vault, or Catalist's Q-tool.  Using some relatively simple Boolean logic, you can create lists of all the people in a state, district or precinct who share certain characteristics--for example, you might want to find all registered black voters under the age of 40.  With a certain (ever-diminishing) amount of inaccuracy, this is a trivial list to pull.
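The "relatively simple Boolean logic" might look something like the sketch below. This is not how Votebuilder, Voter Vault, or Q-tool work internally; the records, field names, and the `pull_list` helper are all made up to show the idea of combining conditions with AND.

```python
# Hypothetical records; a real file has millions of rows and many more fields.
voters = [
    {"id": 1, "precinct": "12A", "registered": True,  "race": "black", "age": 34},
    {"id": 2, "precinct": "12A", "registered": True,  "race": "white", "age": 29},
    {"id": 3, "precinct": "07C", "registered": True,  "race": "black", "age": 52},
    {"id": 4, "precinct": "07C", "registered": False, "race": "black", "age": 25},
    {"id": 5, "precinct": "07C", "registered": True,  "race": "black", "age": 39},
]

def pull_list(records, predicate):
    """Return every record for which the Boolean predicate is true."""
    return [r for r in records if predicate(r)]

# "All registered black voters under the age of 40."
under_40 = pull_list(
    voters,
    lambda r: r["registered"] and r["race"] == "black" and r["age"] < 40,
)
ids = [r["id"] for r in under_40]
print(ids)  # [1, 5]
```

Narrowing to a district or precinct is just one more condition in the predicate, for example `and r["precinct"] == "07C"`.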

As you can imagine, this is extremely useful.  You can use these tools to do everything from creating walk lists for your volunteers to pulling samples for polls or blanketing a state with direct mail.  Which is why these files are considered so valuable, and why making them is big business--with big consequences.

Building a Voter File Part 2: Appending Overview

Cross-posted from Overdetermined.net. Find the latest entries in the series there!

Once the data is (yes, is, prescriptivists--I went there) in a standardized format, we move from the realm of "interesting" into "faintly creepy".  The information from Secretaries of State or state parties is generally pretty innocuous--name, address, maybe phone number or age.  The appended consumer data, on the other hand, is more unsettling.  There's nothing on there that would do real damage if anyone knew it--no credit card numbers, nothing that people could use to steal your identity--but it can be kind of strange to think about who knows that you own two dogs and a cat.

Most of this consumer data is gathered by for-profit companies, who then retail it to both the state parties and the for-profit companies that are creating these files (if you take a look at our resources page, InfoUSA is one such vendor).  They get their information anywhere they can--state licensing agencies (think it might be worthwhile for the McCain campaign to know who has a gun license?), magazine subscription lists, grocery store value card memberships...basically, if you have to fill out a form for it, somebody wants it, and will get it unless prevented by law. 

Moreover, based on this consumer information, it's possible to predict other characteristics (within limits, which I'll go into in a later post).  For example, the RNC might conduct a truly massive poll that measured all kinds of behavior--TV habits, income, type of location, and lots of other things besides.  Based on that poll, they might determine that there's a high correlation between a given cluster of characteristics and certain behaviors.  For instance: only a survey can tell you how much radio someone listens to.  But for everyone on the file, it's possible to know where they live, their age, and whether or not they own a boat.  If nearly all the men aged 54-65 with boat licenses in the poll say they listen to Rush Limbaugh, then age, sex, and a boat license together are a good predictor that someone who was never polled listens too.
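Here is a rough sketch of that poll-to-file step, under heavy assumptions: the survey rows are invented, the cluster definition (sex, one age band, boat license) is just the example from the paragraph above, and real modeling shops use far more sophisticated methods than a per-cluster rate. The point is only the shape of the idea: measure a behavior in the poll, then project it onto people who share the same known characteristics.

```python
from collections import defaultdict

def cluster_key(person):
    """Reduce a person to the characteristics we know for everyone on the file."""
    age_band = "54-65" if 54 <= person["age"] <= 65 else "other"
    return (person["sex"], age_band, person["boat_license"])

def fit_rates(survey):
    """Share of respondents in each cluster who report listening."""
    counts = defaultdict(lambda: [0, 0])  # cluster -> [listeners, respondents]
    for p in survey:
        tally = counts[cluster_key(p)]
        tally[0] += 1 if p["listens"] else 0
        tally[1] += 1
    return {key: yes / n for key, (yes, n) in counts.items()}

def predict(person, rates, default=None):
    """Look up the surveyed rate for this person's cluster, if any."""
    return rates.get(cluster_key(person), default)

# Invented survey responses (only a poll can supply the "listens" column).
survey = [
    {"sex": "M", "age": 60, "boat_license": True,  "listens": True},
    {"sex": "M", "age": 58, "boat_license": True,  "listens": True},
    {"sex": "M", "age": 64, "boat_license": True,  "listens": True},
    {"sex": "M", "age": 55, "boat_license": True,  "listens": False},
    {"sex": "M", "age": 61, "boat_license": False, "listens": True},
    {"sex": "M", "age": 56, "boat_license": False, "listens": False},
    {"sex": "F", "age": 30, "boat_license": False, "listens": False},
]
rates = fit_rates(survey)

# Someone on the file who was never polled.
unpolled = {"sex": "M", "age": 57, "boat_license": True}
estimate = predict(unpolled, rates)
print(estimate)  # 0.75
```

A cluster nobody in the poll belongs to returns the default rather than a guess, which is one small instance of the "within limits" caveat.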

This use of consumer data is at least a partial definition of the oft-abused term "microtargeting" (this WaPo article, although overwrought, is a good introduction).  Rest assured I'll have more to say on the topic in the future; but this is the overview.  Stick around; tomorrow, I'll go into how this data gets used.

Introduction to Regression Analysis

The campaigns of 2008 are now over and many organizations are left with a nice amount of data from their outreach programs. This post will give a brief introduction to the use of regression analysis to look at your data in new ways.

Simple regression allows you to isolate the effect of a single variable, which in most cases will probably be a contact action.

When you have a lot of data you can group it into sets where all of the variables match except for your chosen dependent and independent variables.

For example, you would create a set of data of 20-year-old female voters from the same precinct. You could then look at that data to see how effective your contacts were with that specific demographic. Once you have created a few data sets you can identify some new relationships. It's possible that a specific contact action was more effective with women than with men, or with students than with young professionals.

Regression analysis allows you to restrict your attention to a single explanatory variable, which may allow you to see something you might have missed otherwise.
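As a small worked sketch, assume one of those matched sets (say, 20-year-old female voters from a single precinct) with two columns: whether each voter was contacted, and whether she voted. The numbers below are invented, and `simple_regression` is a hand-rolled least-squares fit written for this post, not a call into any statistics package.

```python
def simple_regression(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

# Invented data for one matched set: 1 = yes, 0 = no.
contacted = [1, 1, 1, 1, 0, 0, 0, 0]   # independent variable
voted     = [1, 1, 1, 0, 1, 0, 0, 0]   # dependent variable

intercept, slope = simple_regression(contacted, voted)
print(intercept, slope)  # 0.25 0.5
```

With a yes/no contact variable, the intercept is the turnout rate among voters who were not contacted (1 in 4 here) and the slope is the difference in turnout between contacted and uncontacted voters (3 in 4 versus 1 in 4). Running the same fit on a second matched set, such as men of the same age and precinct, and comparing the slopes is one way to spot the kind of relationship described above.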

Fighting for a User-Generated Government

Earlier today I wrote a blog post discussing User-Generated Government. Below is a video I recorded talking about the need to work towards that goal.

