Interview with Professor Alessandra Casella
Voting systems have interested both economists and political scientists for centuries. Scholars have been motivated to answer the question, “To what extent can a voting system truly represent voters’ preferences?” This question becomes particularly interesting when the voters of a large democracy are presented with a wide range of candidates to choose from.
Among the most common systems used by democracies is majority voting, which, as its name indicates, determines the victorious candidate based on the majority of votes. However, economic theory points to some serious issues with systems that attempt to accommodate voter preferences when voters are provided with at least three alternatives. More specifically, the well-known Arrow’s Theorem states that no voting system can satisfy a certain set of reasonable conditions all at once, so there is no ideal voting system in a democratic scenario. For example, one of the conditions in Arrow’s Theorem is independence of irrelevant alternatives (IIA), under which the collective ranking of any two candidates should depend on how these two candidates are ranked by all voters and on nothing else. That is, the group’s preference between two specific candidates should not depend on how either candidate fares relative to the other alternative candidates. Conditions like IIA, although seemingly technical and abstract, are fundamental to democratic elections. Because these conditions cannot all hold together, there exist scenarios in which voting systems will not function optimally.
To the layman, this can be a disconcerting statement. Democratic choice theory has deep implications for our understanding of modern political institutions. Professor Alessandra Casella, Professor of Economics at Columbia University and a fellow of the National Bureau of Economic Research and the Center for Economic Policy Research, helps us to understand these implications here.
Shambhavi Tiwari, CPR: Why are the mathematics and related theory behind voting procedures so important? In other words, why should the layman care about these technicalities?
Professor Alessandra Casella, Department of Economics, Columbia University: Voting procedures are so important, and these technicalities are just a way of examining the problem in a more transparent manner. Arrow’s Impossibility Theorem tells us that any aggregation procedure of individual preferences (as long as we have at least three alternatives) will run into fundamental contradictions. That’s an extremely important result, and one that we need to keep in mind when we judge how different voting rules and different systems of making collective decisions function.
Our systems are misguided. But also, one can’t simply say ‘our system is a disaster’ or ‘we get these paradoxes’, or ‘clearly it’s pathological, if only we had this other system that this other country had.’ It would be too easy! There isn’t an ideal system. So the results of Arrow’s Theorem can be really important and fundamental, because when the properties it lists as desirable are not satisfied, apparently small changes in voting procedures can change the entire outcome. You could insert a spoiler candidate in the elections and determine who wins, even though this candidate himself or herself would not win. You could modify the order in which people vote over the same alternatives and that can affect the outcome. Or you could design procedural rules that can end up determining the outcome of the election. The math makes this fundamental problem simpler, more transparent, and also more believable. It basically tells you that there are simple, desirable, eminently plausible properties of voting, which are completely self-contradictory.
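To make the spoiler effect concrete, here is a minimal Python sketch under plurality voting (each voter backs a single favorite). The voter counts, the candidate names A, B and C, and the helper `plurality_winner` are all invented for illustration and do not come from the interview.

```python
from collections import Counter

# Hypothetical electorate: (number of voters, ranking from most to least preferred).
PROFILE = [
    (45, ("A", "B", "C")),
    (35, ("B", "C", "A")),
    (20, ("C", "B", "A")),
]

def plurality_winner(profile, running):
    """Each voter backs the highest-ranked candidate who is actually running."""
    tally = Counter()
    for count, ranking in profile:
        top_choice = next(c for c in ranking if c in running)
        tally[top_choice] += count
    # Ties are not handled specially; max() returns the first top scorer it meets.
    return max(tally, key=tally.get)

print(plurality_winner(PROFILE, {"A", "B"}))       # two-candidate race
print(plurality_winner(PROFILE, {"A", "B", "C"}))  # same voters, C also runs
```

In this made-up electorate B beats A 55 to 45 in a two-way race, but once C enters, A wins with 45 votes against 35 and 20. C finishes last and still changes the winner, even though no voter changed their mind about A versus B.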
ST: We just spoke of Arrow’s Theorem, which requires a certain number of conditions. Why do we need all these conditions, anyways?
AC: If these conditions did not apply, a lot of really pathological outcomes would be possible. Considered individually, these conditions all seem unobjectionable. Firstly, a non-dictatorship seems to be a desirable requirement for a democracy. Another condition is universal domain: your voting procedure should be able to handle any type of preferences that individuals might have, as long as these preferences exhibit some minimal coherence. You want your decision to be transitive because if your decision rule is not transitive, then you can distort your decision by having procedural rules that then end up being extremely powerful: when do you stop voting, what is the order of the votes, who decides the agenda, all start to determine the outcome. You want your decision to be complete because you want to be able to always reach a collective voice. You also need unanimity, which seems very natural. If you don’t have independence of irrelevant alternatives, then the voting rule or the social preference rule becomes manipulable, so people will be inclined not to reveal the truth and instead will act strategically. The type of candidate preference ranking you will get will depend on the specific voting rule you are using. Again, procedures can be manipulated, or a spoiler candidate inserted.
So it’s very hard to find fundamental objections to these properties, but your question is interesting because an impossibility theorem is simply the recognition that certain requirements are logically contradictory. I can set up any number of logical requirements that are completely contradictory, but that’s not interesting. What makes this interesting in Arrow’s case is that Arrow’s requirements seem to us so intuitively right. It’s hard to say, for example, let’s drop non-dictatorship, a condition that is crucial to a democracy. If you drop any of the other conditions, there is a cost, too. For example, if we limit universal domain, it implies a minimal extent of agreement within the group. And you have to ask yourself: how satisfied are we with being able to reach consistent decisions only when the group shares preferences that are sufficiently similar?
ST: The economist Thomas Piketty writes that “information pertinent to individual decisions never exists in concentrated or integrated form… and the objective of political institutions is to allow for an efficient use of these dispersed bits of information.” Is this why a fully representative political system is important?
AC: Piketty is talking about another aspect of voting, which is voting as an aggregation of information. Say we all agree that we want to elect the most competent candidate, but we disagree about who the most competent candidate is. Now, each of us has some relevant information, which is independent to some extent, about who we think the better candidate is. Voting is the way in which this information will be aggregated.
There is a beautiful result in voting theory, which comes from Condorcet, who imagined a democracy with a large number of people making a binary decision. (Condorcet’s insight is usually described with the example of a jury deciding if a defendant is guilty or not.) Everybody receives an independent signal as to what the correct decision is, and wants to make the right decision. Just imagine that it is slightly more likely for each individual to have the right information rather than the wrong information. As long as the information is not fully random, and you have majority voting, a large population will choose the right decision with a probability that approaches one as the size of the population becomes very large. So you have a group that is barely informed, and yet it makes the right decision. This is much better than having 10 experts who know the correct outcome most of the time. This is an efficiency-based argument for democracy.
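Condorcet's argument can be checked numerically. The sketch below assumes the simplest textbook setting: an odd number of voters, each independently correct with the same probability. The function name `majority_correct_probability` and the value 0.51 are illustrative choices, not figures from the interview.

```python
from math import comb

def majority_correct_probability(n_voters, p_correct):
    """Chance that a simple majority is right when each of n_voters
    is independently right with probability p_correct."""
    if n_voters % 2 == 0:
        raise ValueError("use an odd number of voters so that ties cannot occur")
    needed = n_voters // 2 + 1  # smallest number of correct votes forming a majority
    return sum(
        comb(n_voters, k) * p_correct**k * (1 - p_correct) ** (n_voters - k)
        for k in range(needed, n_voters + 1)
    )

# Each voter is only barely better than a coin flip (assumed p = 0.51).
for n in (1, 11, 101, 1001):
    print(n, majority_correct_probability(n, 0.51))
```

With the individual accuracy held fixed just above one half, the printed probability rises as the group grows, which is the sense in which a barely informed crowd can decide well. If voters' signals were correlated rather than independent, this calculation would no longer apply.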
ST: Do you think that the imperfections in modern political systems are a reflection of the impossibilities presented in Arrow’s Theorem?
AC: I think that voting systems reflect other imperatives too, for example, the need to protect minority preferences. There are environments in which a candidate exists who is preferred to any other by a majority of voters. Even then, it’s not always obvious that you want to absolutely and always match the preferences of the majority.
I think aggregating collective preferences is extremely complex. Arrow’s Theorem reminds us of that, but there are other goals that you might want to impose on a voting system: for example, giving weight to intensities, linking outcomes to the protection of minorities. Think of religious or ethnic minorities, for example the history of African-Americans in this country. If the minority has preferences that are systematically different from the preferences of the majority group, how are you going to guarantee the minority its voice?
There has been the design of majority-minority districts, where the racial minority is a majority within the district. And there are voting systems that try to achieve protections through the design of the voting rule itself. For example, there is an interesting voting system called cumulative voting. With cumulative voting, imagine that you have to elect a committee, for example, or a city council, and you have five open positions. Each voter has five votes, and can cast as many votes as desired for an individual representative. So instead of giving one vote to each of the five people that you prefer, you could potentially decide to give all five votes to a particular candidate: the votes can be divided any way you would like.
Cumulative voting has been used for the protection of minorities. It has been mandated by courts in areas in which African-Americans are so dispersed that they can never exercise their joint will if it contradicts the majority. There are specific examples, such as the case of Chilton County, Alabama, where an agency responsible for paving roads had never had any African-American member on its board. The roads to African-American neighborhoods were not paved in Chilton County, which led the courts to intervene and impose cumulative voting for the election of board members in the agency. News of the new system led to widespread unhappiness and confusion, but the African-American community saw immediately that it had to concentrate its votes, and it elected a representative to the agency’s board. Five years later, the African-American neighborhoods’ roads were paved. This is an example in which it’s clear that voting systems can have other concerns, in addition to the conditions of Arrow’s Theorem.
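A small Python sketch of the five-seat example shows why concentrating votes matters. Everything specific here is an assumption for illustration: the 70/30 split of voters, the candidate names, the helper `cumulative_voting_winners`, and the alphabetical tie-break.

```python
from collections import Counter

def cumulative_voting_winners(ballots, seats):
    """ballots: list of dicts mapping candidate -> votes given by one voter.
    Each voter may spread at most `seats` votes in any way they like."""
    totals = Counter()
    for ballot in ballots:
        if any(v < 0 for v in ballot.values()) or sum(ballot.values()) > seats:
            raise ValueError("each voter has at most `seats` non-negative votes")
        totals.update(ballot)  # adds this voter's votes to the running totals
    # Highest totals first; ties broken alphabetically (an arbitrary assumption).
    ranked = sorted(totals, key=lambda name: (-totals[name], name))
    return ranked[:seats]

# Hypothetical town: 70 majority voters, 30 minority voters, 5 seats.
majority_ballot = {"A": 1, "B": 1, "C": 1, "D": 1, "E": 1}
spread = [majority_ballot] * 70 + [{"M": 1, "N": 1, "O": 1, "P": 1, "Q": 1}] * 30
concentrated = [majority_ballot] * 70 + [{"M": 5}] * 30

print(cumulative_voting_winners(spread, 5))        # minority spreads its votes
print(cumulative_voting_winners(concentrated, 5))  # minority pools votes on M
```

When the minority spreads its votes, each of its candidates gets 30 votes against 70 for each majority candidate and none is elected. When it puts all five votes on M, M receives 150 votes and takes a seat.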
Voting rules are not perfect, and there will not be a perfect system, even when one’s objectives are quite simple. However, voting rules are quite flexible and so you can try to design them to achieve specific goals that you have in specific environments.
ST: What are your thoughts on the presidential election results? Many Americans are expressing shock and unhappiness with the outcome, and especially with the Electoral College.
AC: It’s true that the Electoral College can contradict the popular vote, as in this election. But the difference in the popular vote is not so large as to indicate a fundamental violation of popular will, and there's a reason for the Electoral College: the federal nature of the American constitution. The Constitution is designed so as to give states representation that is not only linked to their size. Again, this comes back to constraining completely unchecked majority rule.
If you look at the recent economic numbers, it is difficult to see voters’ anger. The economy has been growing, unemployment is down, and inequality is falling, so there’s nothing at the macro level that explains why there would be such an explosive political result. The income of white males has been stagnating, but it has been stagnating since the 1960s—why is there such a revolt now? It is hard to understand the voters’ unhappiness with the status quo from the numbers alone.
At the same time, the statements made by Trump in this campaign were unacceptable on a human basis, and so it’s hard to understand how people may have minimized the importance of such pronouncements.
Some of my colleagues have proposed that the election results are a delayed repercussion of the 2008–09 financial crisis, not so much because of the economic effects, which were very severe at the time but are now less so, but because the financial crisis revealed the deep incompetence of the elite. Nobody was punished, and there was no change.
Cases in which a candidate won the popular vote without winning the Electoral College are few, but notably the only two cases after the 19th century are those of Al Gore and Hillary Clinton (both Democratic candidates in very recent elections). One possible explanation lies in the Electoral College itself. There is more concentration of people in the cities, and dense cities tend to vote Democratic. If cities are predominantly in states whose weight in the Electoral College is increasingly small relative to their population, this may explain the two recent Democratic losses in the College despite popular victories. It would be interesting to look at the number of Electoral College votes per capita historically, by state.
We started off our conversation with me defending the Electoral College. It actually depends on your definition of “fair.” If you just think the popular vote should determine the outcome, then there should be no Electoral College and the cities should count more, which may be right since there are more people in the cities. If you think that there is serious meaning to the federal nature of the United States and that it should be protected, then rural states should count even if their populations are relatively sparse. It comes back, also, to what we were saying, that there are goals for the voting system which are not only to find the majority’s preference over two candidates.
ST: Can we realistically compromise a certain number of Arrow’s conditions and obtain a semi-perfect voting procedure? Have scholars proposed any?
AC: Not really, because I think that the design of the voting systems really depends on what you want to achieve. Let’s take a voting system by Michel Balinski and Rida Laraki, called majority judgment. In this procedure, voters are given a number of alternative candidates and must assign them to one of six categories: say, very good, good, fair, etc. Then, you aggregate these votes for categories as if they were numbers and you choose the median. Voters are free to assign any number of alternatives to the categories: for example, voters could hypothetically assign all the candidates to the ‘unacceptable’ category. Balinski and Laraki’s claim is that this system is more representative and more robust than other voting systems.
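The median step can be sketched in a few lines of Python. This is a simplified reading of the description above, not Balinski and Laraki's full procedure: the six grade labels are partly guessed, the lower median is used when the number of voters is even, and candidates sharing the top median grade are simply reported together rather than separated by a tie-breaking rule.

```python
from statistics import median_low

# Six illustrative grades, worst to best (labels are assumptions).
GRADES = ["unacceptable", "poor", "fair", "good", "very good", "excellent"]

def majority_grade(grades):
    """Median grade of one candidate, treating grades as positions on the scale."""
    positions = [GRADES.index(g) for g in grades]
    return GRADES[median_low(positions)]

def majority_judgment_leaders(ballots):
    """ballots: dict mapping candidate -> list of grades, one per voter.
    Returns every candidate whose median grade is the highest."""
    medians = {c: GRADES.index(majority_grade(g)) for c, g in ballots.items()}
    best = max(medians.values())
    return sorted(c for c, m in medians.items() if m == best)

# Hypothetical grades from three voters.
example = {
    "A": ["excellent", "poor", "good"],
    "B": ["fair", "fair", "very good"],
}
print(majority_grade(example["A"]), majority_grade(example["B"]))
print(majority_judgment_leaders(example))
```

Here A's median grade is "good" and B's is "fair", so A leads. Because only the middle grade counts, a single voter who exaggerates a grade upward or downward often leaves the median unchanged.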
Another voting system with strong advocates is approval voting, in which people cast a vote for all the candidates they “approve of”, and only for them. The winner is the candidate with the largest number of approval votes.
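Approval voting is simple enough to state directly in code. The sketch below assumes each ballot is just the set of candidates a voter approves of. The helper `approval_winners` and the sample ballots are hypothetical, and tied candidates are all returned rather than separated.

```python
from collections import Counter

def approval_winners(ballots):
    """ballots: iterable of collections, each holding the candidates one voter approves of."""
    tally = Counter()
    for approved in ballots:
        tally.update(set(approved))  # set() guards against double-counting a name
    if not tally:
        return []
    top = max(tally.values())
    return sorted(c for c, n in tally.items() if n == top)

# Four hypothetical voters.
print(approval_winners([{"A", "B"}, {"B"}, {"B", "C"}, {"A"}]))
```

In this example B is approved by three of the four voters, A by two and C by one, so B wins.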
These types of voting procedures have superior properties specific to given environments. However, Arrow’s theorem tells us that with at least three alternative candidates the ideal voting procedure does not exist in all possible environments.
Majority voting does really well in the case of two alternative candidates. The systems we have talked about all come down to majority voting, and they can satisfy all of Arrow’s conditions in the case of two candidates. However, majority voting will not respect, for example, the federal nature of the United States, so you put in the Electoral College system.
ST: Dasgupta and Maskin also write in their paper on fair voting that the “way most countries pick their Presidents is faulty.” The authors propose a system known as ‘true majority’ as the least imperfect voting system out there. Do you think it is pragmatic to overhaul age-old democratic institutions in favor of new ones?
AC: True majority has a problem: there may not exist a winner, if the transitivity of choices is violated. I don’t believe that Dasgupta and Maskin give us a proper solution in that case. When they wrote their paper, they were very surprised by the extent of readers’ responses, some of which were very aggressive. People have very strong attachments to their voting system, which is interesting because voting procedures are so technical and mathematical.
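The missing-winner problem can be reproduced with three voters. The sketch assumes that "true majority" means picking the candidate who beats every rival in head-to-head majority contests. The helper `pairwise_majority_winner` and both sets of rankings are illustrative.

```python
def pairwise_majority_winner(rankings):
    """rankings: list of tuples, each one voter's order from best to worst.
    Returns the candidate who beats every rival head-to-head, or None."""
    candidates = rankings[0]
    for c in candidates:
        beats_everyone = all(
            # c beats rival o if a strict majority of voters rank c above o
            2 * sum(r.index(c) < r.index(o) for r in rankings) > len(rankings)
            for o in candidates
            if o != c
        )
        if beats_everyone:
            return c
    return None  # collective preferences cycle, so nobody beats all rivals

cycle = [("A", "B", "C"), ("B", "C", "A"), ("C", "A", "B")]
no_cycle = [("A", "B", "C"), ("A", "C", "B"), ("B", "A", "C")]
print(pairwise_majority_winner(cycle))     # None
print(pairwise_majority_winner(no_cycle))  # A
```

In the first set of rankings A beats B, B beats C and C beats A, each by two votes to one, so the collective preference is not transitive and no candidate qualifies.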
So, changing a voting system is a very difficult thing to do. Voting is rooted in tradition, and people are very worried about not understanding the full implications of a change in their voting procedure. It’s a very fair concern.
When I first started working on Storable Votes, I was advised that it is important to understand the worst things that can happen with new voting systems, or how badly they can do relative to existing institutions. I thought that was a wise comment, because we look at our models “in equilibrium”, supposing that every agent is acting optimally for herself, but it is also important to analyze the worst-case scenarios before doing any major reform.
ST: You worked with Columbia University student elections in your paper “Protecting Minorities in Binary Elections: A test of storable votes using field data.” The paper shows that voters often have difficulty understanding even simple instructions, and this can result in inefficiencies or irrational individual decisions. Won’t the large-scale overhaul of inefficient democratic systems in favor of new ones lead to similar inefficiencies?
AC: Systems can indeed be too sophisticated and require too much strategic thought and calculations. But for simple instructions, like indicating rankings, a lot of it is just habit and training. If you change the voting system, you’d have to train people. That said, since I said that I don’t believe I know any ideal voting system, I’m not advocating a reform in voting systems for electoral purposes, at least.
ST: In the traditional majority voting system, the votes of the majority determine the winner, thus overriding minority voices even if they might feel more strongly than the majority does. In the same paper as above, you argue that voting procedures should allow the intensity of voter preferences to factor into decisions, and you propose an alternative system of storable votes. What is a storable vote, and how would the system work to protect minority rights?
AC: The real issue is when one minority with specific preferences is consistently in opposition to the majority preferences. If it’s always the same minority group being silenced, then majority voting can be both very unfair and dangerous, because it disenfranchises the minority, and you have a group that is never represented. You don’t want that for both ethical and pragmatic reasons. Although you might have satisfied all of Arrow’s conditions, you haven’t really solved all of the problems.
Storable votes is a voting system that applies to multiple binary choices. Imagine that you are voting on an agenda with five proposals, each of which may pass or fail. Instead of being required to cast one vote on each proposal, you are allowed to distribute your five votes any way you would like. So, for example, if you are indifferent to the first proposal, you can abstain, and can decide to give all five of your votes to the second proposal. The decision is taken by the majority of votes cast, which means that a minority that cares more and casts more votes for a particular issue can win. But the minority wins only when it cares particularly about an issue and, at the same time, the majority does not: basically, the minority gets its preferred outcome when it should, from an efficiency point of view.
Storable votes is similar to cumulative voting in the idea of allowing accumulation, but mathematically it’s actually very different. Cumulative voting applies to situations where candidates compete against each other, but with storable votes there is no competition across issues: each individual issue is a binary issue that is only competing with its defeat.
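The five-proposal example can be written as a short Python sketch. The encoding is an assumption made for illustration: each ballot lists one signed number per proposal (positive for, negative against, zero to abstain), a voter may use at most five votes in total, and a proposal with equally many votes on both sides fails. The three voters are hypothetical.

```python
def storable_votes_outcome(ballots, budget):
    """ballots: list of lists with one signed vote count per proposal.
    Returns True (passes) or False (fails) for each proposal."""
    n_proposals = len(ballots[0])
    for ballot in ballots:
        if len(ballot) != n_proposals or sum(abs(v) for v in ballot) > budget:
            raise ValueError("a ballot may use at most `budget` votes in total")
    totals = [sum(ballot[i] for ballot in ballots) for i in range(n_proposals)]
    return [total > 0 for total in totals]  # ties count as failure (assumption)

# Two majority voters mildly oppose everything: one vote against each proposal.
majority = [[-1, -1, -1, -1, -1], [-1, -1, -1, -1, -1]]
# One minority voter cares intensely about the second proposal only.
intense_minority = [[0, 5, 0, 0, 0]]
# The same voter spreading one vote in favor of each proposal instead.
diffuse_minority = [[1, 1, 1, 1, 1]]

print(storable_votes_outcome(majority + intense_minority, 5))
print(storable_votes_outcome(majority + diffuse_minority, 5))
```

With concentrated votes the second proposal passes three votes to two while the other four fail. With the votes spread evenly the minority loses on every proposal, as it would under ordinary majority voting.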
ST: Finally, it seems that scholarly work on majority voting still revolves around the works of Arrow and Condorcet, both written before the start of the twenty-first century. Do you see any future possibility of more optimal voting procedures or have we reached a standstill?
AC: It’s not true that there is no recent work in this field. There is Storable Votes, for example! Another voting system proposed recently, if you want another example, is quadratic voting. Quadratic voting is another way of giving weight to intensity in a single binary decision. In this system, voters buy votes, but the number of votes they buy is the square root of what they pay. If a voter wants ten votes, he or she pays 100 dollars, and so on. All the revenues from the money paid are distributed equally among everyone. When the number of voters is very large, quadratic voting has a number of desirable properties.
However, this system has a couple of problems. First, it uses money, and thus works to the extent that no one is budget-constrained. But this is not the way the world works, and I worry about putting money and voting together. Second, this system really invites collusion, as someone wanting ten votes would be better off recruiting ten individuals to each buy a single vote rather than buying all ten votes alone (because that costs 100 dollars, as opposed to 10). Personally, I don’t see this system as a solution to all the ills of the world, but it is another example of the work that continues to be done on voting systems. There is a lot more to be done. I don’t believe it will lead us to the optimal voting system, which we know doesn’t exist. But for specific questions, we might be able to design something with desirable properties.
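The arithmetic behind the collusion worry is easy to verify. The sketch below uses the pricing rule described in the interview, where the cost in dollars is the square of the number of votes bought. The function names are illustrative only.

```python
import math

def quadratic_cost(votes):
    """Dollars one voter pays to cast `votes` votes."""
    return votes ** 2

def votes_for_payment(payment):
    """Votes obtained for a given payment: the square root of the amount paid."""
    return math.sqrt(payment)

# One person buying ten votes alone.
solo_cost = quadratic_cost(10)

# Ten recruited people buying one vote each.
colluding_cost = sum(quadratic_cost(1) for _ in range(10))

print(solo_cost, colluding_cost)  # 100 versus 10
```

The same ten votes cost ten times less when split across ten buyers, which is why the scheme rewards anyone who can coordinate a group.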