## Friday, November 05, 2004

### The 'Purple Haze', revisited.

[Update: Nov 8]: More color schemes (one based on colorbrewer.org) and a toolkit to make your own maps!

By now, Robert Vanderbei's purple map of the voting counts in Election 2004 has crisscrossed the internet several times. On his web page, he answers some questions about these maps:
"Can you warp the counties so that each county's area is proportional to its total vote count? Such warped maps are called cartograms. There are already several of these at the state-by-state level on the web. I haven't seen any at the county-by-county level. A few years ago I collaborated briefly with David Dobkin and Stephen North on algorithms for producing cartograms. I can say that making a cartogram with so many individual elements (counties) would be very difficult."
Cartograms are indeed hard to compute: this is an interesting geometric problem, as Jeff Erickson also points out.

As it turns out, the place I work at does cartograms, and quite well at that! Stephen North (the one mentioned above) and colleagues have developed methods for computing cartograms, and I used their approach to create a cartogram of the election results. I used the data from a paper by Daniel Keim, Stephen North, and Christian Panse that uses the medial axis to construct a cartogram.

[UPDATE: the original maps had a problem in labelling Nevada because of a mismatch between county names in the two data sets: the new maps shown here reflect the correction. Thanks to the commenter who pointed this out.]

I used a variety of color schemes: for more information, and larger pics, see my new cartograms page. (note: for formatting purposes I used HTML to scale the images; if you download the image, it will be bigger)

1. A Vanderbei-like color scheme, where the relative proportion of votes for Bush or Kerry turns the county red or blue respectively: purple regions are roughly even.

2. The winner-take-all color scheme: a county that Bush won is marked in red, and one that Kerry won is marked in blue.

3. Red-blue (suggested by a poster on the blog). Start with a baseline of white for equal vote-share. As the vote percentage for the winner increases, increase the strength of red or blue appropriately.

4. Grayscale (suggested by Kathryn Myronuk). Start with a baseline of white for equal vote-share. As the vote percentage for the winner increases, make the color grayer.

5. ROYGBIV (suggested by Kathryn Myronuk). Again, white is neutral, but go towards the R side or the V side of the color spectrum depending on who wins and by how much (thresholds at 0.7, 0.6, and 0.53).
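For anyone who wants to play with these schemes, here is a rough Python sketch of how the first four could map a county's vote share to an RGB color. This is not the code behind the maps: the function names, the linear ramps, the choice of black as the far end of the grayscale, and the treatment of exact ties are all my assumptions for illustration. The `bucket` helper only shows how the thresholds from scheme 5 could split counties into bands; which color goes with which band is left open.

```python
def two_party_share(bush_votes, kerry_votes):
    """Bush's share of the two-party vote (0.5 if no votes were cast)."""
    total = bush_votes + kerry_votes
    if total == 0:
        return 0.5
    return bush_votes / total


def purple(bush_share):
    """Scheme 1: red tracks Bush's share, blue tracks Kerry's share."""
    return (round(255 * bush_share), 0, round(255 * (1 - bush_share)))


def winner_take_all(bush_share):
    """Scheme 2: pure red or pure blue (assumption: an exact tie goes blue)."""
    return (255, 0, 0) if bush_share > 0.5 else (0, 0, 255)


def red_blue(bush_share):
    """Scheme 3: white at an even split, fading linearly to pure red or blue."""
    strength = abs(bush_share - 0.5) * 2  # 0 when even, 1 when unanimous
    fade = round(255 * (1 - strength))
    if bush_share >= 0.5:
        return (255, fade, fade)
    return (fade, fade, 255)


def grayscale(bush_share):
    """Scheme 4: white at an even split, darkening as the winner's share grows.

    Assumption: a unanimous county goes all the way to black.
    """
    strength = abs(bush_share - 0.5) * 2
    level = round(255 * (1 - strength))
    return (level, level, level)


def bucket(winner_share, thresholds=(0.53, 0.6, 0.7)):
    """Scheme 5 helper: how many thresholds the winner's share reaches (0 to 3)."""
    return sum(winner_share >= t for t in thresholds)


# Example: a county where Bush gets 60 of 100 two-party votes.
share = two_party_share(60, 40)
print(purple(share), winner_take_all(share), bucket(share))
```

A linear ramp is the simplest choice here; a stepped or nonlinear ramp would make closely contested counties easier to tell apart.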

It takes a while to get used to a cartogram, and having to ensure that counties still touch after being inflated can make the problem quite difficult to solve while still retaining an overall representative shape. However, it is interesting to see how large California and the Northeast really are in context, and how the entire Midwest of the country shrinks.

As far as I know, the complexity of computing a cartogram (where the typical formulation would be to minimize some error metric on the discrepancy between the area of a region and the weight associated with it) is unknown, and is probably NP-hard (I once thought I had a proof for NP-hardness for rectangular cartograms, but it foundered). However, much of the challenge in cartogram design comes from trying to balance accuracy and aesthetics; the map when distorted should still look like the original map!
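To make the "error metric" idea concrete, here is a minimal Python sketch of one possible area-discrepancy measure. It is only an illustration: the mean relative error used here, the function names, and the representation of regions as simple polygons are my assumptions, not the metric from the paper mentioned above. Each region's share of the total map area is compared against its share of the total weight (for example, votes cast).

```python
def polygon_area(points):
    """Area of a simple polygon via the shoelace formula.

    `points` is a list of (x, y) vertices in order around the boundary.
    """
    n = len(points)
    twice_area = 0.0
    for i in range(n):
        x1, y1 = points[i]
        x2, y2 = points[(i + 1) % n]
        twice_area += x1 * y2 - x2 * y1
    return abs(twice_area) / 2


def area_error(regions, weights):
    """Mean relative gap between each region's area share and weight share.

    `regions` maps a name to its polygon; `weights` maps the same names to
    positive weights. A perfect cartogram under this measure scores 0.
    """
    areas = {name: polygon_area(poly) for name, poly in regions.items()}
    total_area = sum(areas.values())
    total_weight = sum(weights.values())
    errors = []
    for name, area in areas.items():
        target = weights[name] / total_weight  # share the region should have
        actual = area / total_area             # share it has on the map
        errors.append(abs(actual - target) / target)
    return sum(errors) / len(errors)


# Two toy "counties" with equal weight but unequal area.
regions = {
    "small": [(0, 0), (1, 0), (1, 1), (0, 1)],  # area 1
    "large": [(1, 0), (4, 0), (4, 1), (1, 1)],  # area 3
}
weights = {"small": 1, "large": 1}
print(area_error(regions, weights))
```

Note that a measure like this captures only the accuracy side of the trade-off; it says nothing about whether the distorted map still resembles the original.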

For more on this, check out the AT&T Info Viz page on spatial data transformation.