Wednesday, September 08, 2004

Arrow's Theorem, Voting, and Ranking Schemes.

In light of Lance's post about voting schemes, it seems a good time to mention Arrow's theorem, the "impossibility theorem" for voting schemes. Arrow's theorem (and the related area of social choice theory) has become a useful tool in computer science as well over the past few years.

The setting for Arrow's theorem is a set of rankings. Individuals (voters) order a set of options (candidates) via preferences, and the goal of a ranking scheme (voting system) is to come up with a preference list that respects certain properties:
  • universality: for any set of input rankings, the output should be a total order
  • non-imposition: every possible order should be achievable
  • non-dictatorship: the global preference should not merely follow one of the input rankings
  • monotonicity: raising the ranking of a choice cannot hurt it in the overall rankings
  • independence of irrelevant alternatives: the "spoiler" condition - rankings of options outside a given subset should not affect the ordering within the subset.
Arrow showed that these properties cannot all be satisfied simultaneously (with at least two voters and three options). Incidentally, Arrow proved this in his Ph.D. thesis, and it is one of the contributions mentioned in his Nobel Prize citation from 1972. Much of his best-known work is in the realm of pricing, and the Arrow-Debreu theorem on market pricing is a fundamental result that is familiar to people working in the area of mechanism design and auctions.

But social choice theory itself has been exploited in a CS context. To understand this, we need another notion, the Condorcet criterion.

One of the criticisms of Arrow's theorem is that the "spoiler" condition is too stringent, and other (weaker) conditions have been proposed that allow for a general voting procedure to exist. The most notable scheme is one that was developed nearly 800 years ago, and was rediscovered by Condorcet in the 18th century (which leads to a topic for another post). The idea is to create a ranking by looking at a series of face-offs. The relative ranking of options A and B is determined by seeing how many voters rank A above B, and taking the majority opinion. This is done for all pairs.

The Condorcet winner is then the option that beats every other option in a head-to-head matchup. Note that a Condorcet winner need not exist, because majority preferences can form a cycle (A beats B, B beats C, and C beats A), and this is the problem with using the Condorcet method in its basic form for elections.
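
As a rough illustration of the pairwise face-offs, here is a minimal Python sketch. The function names (pairwise_wins, condorcet_winner) and the sample ballots are made up for this example, and it assumes every ballot is a complete ranking of the same options with no ties.

```python
from itertools import combinations

def pairwise_wins(ballots):
    """Return wins[(a, b)] = number of ballots that rank a above b."""
    options = ballots[0]
    wins = {(a, b): 0 for a in options for b in options if a != b}
    for ballot in ballots:
        position = {opt: i for i, opt in enumerate(ballot)}
        for a, b in combinations(options, 2):
            if position[a] < position[b]:
                wins[(a, b)] += 1
            else:
                wins[(b, a)] += 1
    return wins

def condorcet_winner(ballots):
    """Return the option that beats every other head-to-head, or None."""
    options = ballots[0]
    wins = pairwise_wins(ballots)
    for a in options:
        if all(wins[(a, b)] > wins[(b, a)] for b in options if b != a):
            return a
    return None

# Hypothetical ballots: each list is one voter's ranking, best first.
agree = [["A", "B", "C"], ["A", "C", "B"], ["B", "A", "C"]]
cycle = [["A", "B", "C"], ["B", "C", "A"], ["C", "A", "B"]]

print(condorcet_winner(agree))  # A
print(condorcet_winner(cycle))  # None
```

In the second set of ballots A beats B, B beats C, and C beats A, each by two votes to one, so no option wins all of its matchups.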

The Condorcet criterion is a condition on a ranking scheme.
If any element beats all other elements in head-to-head matchups, it should be ranked first.
A generalization of this, the extended Condorcet criterion, states that:
If there is a partition (C, D) of the set S of options such that for each x in C and each y in D, x beats y head-to-head, then every x in C should be ranked above every y in D.
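
To make the extended criterion concrete, the following Python sketch checks a proposed ranking against it by brute force. It tries every partition, so it is only meant for a handful of options; the names (beats, violates_extended_condorcet) and the ballots are invented for illustration, and ballots are assumed to be complete rankings without ties.

```python
from itertools import combinations

def beats(x, y, ballots):
    """True if a strict majority of ballots rank x above y."""
    above = sum(b.index(x) < b.index(y) for b in ballots)
    return above > len(ballots) - above

def violates_extended_condorcet(ranking, ballots):
    """Check every partition (C, D) of the options; exponential, toy sizes only."""
    options = list(ranking)
    for size in range(1, len(options)):
        for C in combinations(options, size):
            D = [o for o in options if o not in C]
            # Does every x in C beat every y in D head-to-head?
            if all(beats(x, y, ballots) for x in C for y in D):
                # Then no y in D may be ranked above any x in C.
                if any(ranking.index(y) < ranking.index(x) for x in C for y in D):
                    return True
    return False

# Hypothetical ballots, best option first.
ballots = [["A", "B", "C", "D"], ["B", "A", "D", "C"], ["A", "B", "D", "C"]]

print(violates_extended_condorcet(["A", "B", "D", "C"], ballots))  # False
print(violates_extended_condorcet(["A", "C", "B", "D"], ballots))  # True
```

Here A and B each beat both C and D, so a ranking that places C above B breaks the criterion.
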
So what does all of this have to do with computer science? The application comes from the problem of merging ranked lists. The most common example of this arises when doing meta searches across a number of search engines. If Google ranks pages a certain way and Teoma does it a different way, how should the metasearch engine present results?

The Condorcet criteria provide a condition that any reasonable ranking scheme should satisfy. The algorithms are generated by defining a metric on rankings, after which the consensus ranking is a good "center point" in the induced metric space. Specifically, if we define the "distance" between two rankings as the "bubble-sort" distance between them (also known as the Kendall tau distance), or the number of pairs on which they disagree, then a Kemeny optimal ranking is a ranking that minimizes the average distance to all the input rankings. A nice property of Kemeny optimal aggregation is that it satisfies the extended Condorcet criterion while having other desirable properties as well.
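
For a feel of the definitions, here is a small Python sketch of the bubble-sort distance and a brute-force search for a Kemeny optimal ranking. This is not the algorithm from any particular paper: it simply tries every permutation, which is only feasible for a few items (finding a Kemeny optimal ranking is NP-hard in general). The function names and the three "engine" lists are hypothetical, and all lists are assumed to rank the same items.

```python
from itertools import combinations, permutations

def kendall_tau(r1, r2):
    """Number of pairs of items that r1 and r2 order differently."""
    pos1 = {x: i for i, x in enumerate(r1)}
    pos2 = {x: i for i, x in enumerate(r2)}
    return sum(
        (pos1[a] < pos1[b]) != (pos2[a] < pos2[b])
        for a, b in combinations(r1, 2)
    )

def kemeny_optimal(rankings):
    """Return one ranking minimizing total distance to the inputs (brute force)."""
    items = rankings[0]
    # Minimizing the total distance is the same as minimizing the average.
    return min(
        permutations(items),
        key=lambda cand: sum(kendall_tau(cand, r) for r in rankings),
    )

# Hypothetical result lists from three search engines, best result first.
engine_a = ["p1", "p2", "p3", "p4"]
engine_b = ["p2", "p1", "p3", "p4"]
engine_c = ["p1", "p3", "p2", "p4"]

print(kendall_tau(engine_a, engine_b))                 # 1
print(kemeny_optimal([engine_a, engine_b, engine_c]))  # ('p1', 'p2', 'p3', 'p4')
```

In this example the consensus agrees with the majority on every pair, at a total distance of 2 from the three inputs.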

All of this is explained in some detail in the pioneering paper by Dwork, Kumar, Naor and Sivakumar. When I was at VLDB I saw more examples of these methods being used to merge the results of multiple rankings. I should add that the problem of merging different rankings comes up in many different settings, so it is worth knowing the background in social choice theory.
