Social Choice and Computer Science

Fred Roberts, Rutgers University

Social Choice • How do societies (or groups) reach good decisions? • The theory of social choice deals with this question. • They argue. • They find a dictator. • They vote.

What do Mathematics and Computer Science have to do with Voting?

Have you used Google lately? Did you know that Google has something to do with voting?

Have you tried buying a book online lately? • Did you get a message saying: “If you are interested in this book, you might want to look at the following books as well”? • Did you know that this has something to do with voting?

Have you ever heard of v-sis? • It’s a cancer-causing gene. • Computer scientists helped discover how it works. • How did they do it? • The answer also has something to do with voting. (Image: cancer cell.)

Computer Science and the Social Sciences • Many recent applications in CS involve issues/problems of long interest to social scientists: • preference, utility • conflict and cooperation • allocation • incentives • measurement • social choice or consensus • Methods developed in the social sciences (SS) are beginning to be used in CS

CS and SS • CS applications place great strain on SS methods • Sheer size of problems addressed • Computational power of agents an issue • Limitations on information possessed by players • Sequential nature of repeated applications • Thus: Need for new generation of SS methods • Also: These new methods will provide powerful tools to social scientists

Social Choice and CS: Outline • Consensus Rankings • Meta-search and Collaborative Filtering • Large Databases and Inference • Computational Intractability of Consensus Functions • Electronic Voting • Software and Hardware Measurement • Power of a Voter

How do Elections Work? • Typically, everyone votes for their first choice candidate. • The votes are counted. • The person with the most votes wins. • Or, sometimes, if no one has more than half the votes, there is a runoff.

But do we necessarily get the best candidate that way? Let’s look back at the 2008 Democratic primaries.

Sometimes Having More Information about Voters’ Preferences is Very Helpful • Sometimes it is helpful to have voters rank order all the candidates • From their top choice to their bottom choice.

Rankings • Candidates (pictured): Dennis Kucinich, Bill Richardson, John Edwards, Hillary Clinton, Barack Obama • Ties are allowed

Rankings • What if we have four voters and they give us the following rankings (each column runs from top choice to bottom choice)? Who should win?
Voter 1      Voter 2      Voter 3      Voter 4
Clinton      Clinton      Obama        Obama
Richardson   Kucinich     Edwards      Richardson
Edwards      Edwards      Richardson   Kucinich
Kucinich     Richardson   Kucinich     Edwards
Obama        Obama        Clinton      Clinton

Rankings • What if we have four voters and they give us the following rankings? • There is one added candidate. • Who should win?
Voter 1      Voter 2      Voter 3      Voter 4
Clinton      Clinton      Obama        Obama
Gore         Gore         Gore         Gore
Richardson   Kucinich     Edwards      Richardson
Edwards      Edwards      Richardson   Kucinich
Kucinich     Richardson   Kucinich     Edwards
Obama        Obama        Clinton      Clinton

Rankings
Voter 1      Voter 2      Voter 3      Voter 4
Clinton      Clinton      Obama        Obama
Gore         Gore         Gore         Gore
Richardson   Kucinich     Edwards      Richardson
Edwards      Edwards      Richardson   Kucinich
Kucinich     Richardson   Kucinich     Edwards
Obama        Obama        Clinton      Clinton
• Maybe someone who is everyone’s second choice is the best choice for winner. • Point: We can learn something from ranking candidates.
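To make the point concrete, the sketch below compares two ways of scoring the four ballots above. Plurality counts only first choices. The Borda count is an assumption brought in for illustration (the slides do not name a rule): each ballot gives a candidate one point for every candidate ranked below it. The function names are hypothetical.

```python
from collections import Counter

# The four ballots from the slide, each listed from top choice to bottom choice.
ballots = [
    ["Clinton", "Gore", "Richardson", "Edwards", "Kucinich", "Obama"],
    ["Clinton", "Gore", "Kucinich", "Edwards", "Richardson", "Obama"],
    ["Obama", "Gore", "Edwards", "Richardson", "Kucinich", "Clinton"],
    ["Obama", "Gore", "Richardson", "Kucinich", "Edwards", "Clinton"],
]

def plurality_scores(ballots):
    """Count first-place votes only."""
    return Counter(ballot[0] for ballot in ballots)

def borda_scores(ballots):
    """Borda count (assumed rule, not from the slides): with m candidates,
    a ballot gives m-1 points to its top choice, m-2 to the next, ..., 0 to the last."""
    scores = Counter()
    for ballot in ballots:
        m = len(ballot)
        for place, candidate in enumerate(ballot):
            scores[candidate] += m - 1 - place
    return scores

plurality = plurality_scores(ballots)
borda = borda_scores(ballots)
print("Plurality:", plurality.most_common())
print("Borda:    ", borda.most_common())
```

Under plurality, Clinton and Obama tie with two first-place votes each and Gore gets none. Under the Borda count, Gore scores 16 points against 10 each for Clinton and Obama, so the extra ranking information changes who looks like the best choice.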

Consensus Rankings • How should we reach a decision in an election if every voter ranks the candidates? • What decision do we want? • A winner • A ranking of all the candidates that is in some sense a consensus ranking • This would be useful in some applications • Job candidates are ranked by each interviewer • Consensus ranking of candidates • Make offers in order of ranking • How do we find a consensus ranking?

Consensus Rankings • Background: Arrow’s Impossibility Theorem: There is no “consensus method” that satisfies certain reasonable axioms about how societies should reach decisions. • Input to Arrow’s Theorem: rankings of alternatives (ties allowed). • Output: consensus ranking. (Pictured: Kenneth Arrow, Nobel Prize winner.)

Consensus Rankings • There are widely studied and widely used consensus methods that violate one or more of Arrow’s conditions. • One well-known consensus method: “Kemeny-Snell medians”: Given a set of rankings, find a ranking minimizing the sum of distances to the given rankings. • Kemeny-Snell medians are finding surprising new applications in CS. (Pictured: John Kemeny, pioneer in time sharing in CS.)

Consensus Rankings • These two rankings are very close (each column is one ranking, top choice first):
Clinton      Obama
Obama        Clinton
Edwards      Edwards
Kucinich     Kucinich
Richardson   Richardson

Consensus Rankings • These two rankings are very far apart (each column is one ranking, top choice first):
Clinton      Obama
Richardson   Kucinich
Edwards      Edwards
Kucinich     Richardson
Obama        Clinton

Consensus Rankings • This suggests we may be able to make precise how far apart two rankings are. • How do we measure the distance between two rankings?

Consensus Rankings • Kemeny-Snell distance between rankings: twice the number of pairs of candidates i and j for which i is ranked above j in one ranking and below j in the other, plus the number of pairs that are strictly ranked in one ranking and tied in the other. • Example (y-z means y and z are tied):
a      b
x      y-z
y      x
z
• On {x,y}: +2 • On {x,z}: +2 • On {y,z}: +1 • d(a,b) = 5.
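The definition above can be sketched in a few lines of Python. The representation is an assumption made for this example: a ranking is a list of tiers from top to bottom, and candidates in the same tier are tied. The names `tier_index` and `ks_distance` are hypothetical.

```python
from itertools import combinations

def tier_index(ranking):
    """Map each candidate to the index of its tier (0 = top); tied candidates share a tier."""
    return {c: i for i, tier in enumerate(ranking) for c in tier}

def ks_distance(a, b):
    """Kemeny-Snell distance: +2 per pair ordered oppositely in a and b,
    +1 per pair strictly ordered in one ranking and tied in the other."""
    pos_a, pos_b = tier_index(a), tier_index(b)
    if pos_a.keys() != pos_b.keys():
        raise ValueError("both rankings must cover the same candidates")
    total = 0
    for i, j in combinations(sorted(pos_a), 2):
        # Sign is -1 if i is above j, +1 if i is below j, 0 if they are tied.
        sign_a = (pos_a[i] > pos_a[j]) - (pos_a[i] < pos_a[j])
        sign_b = (pos_b[i] > pos_b[j]) - (pos_b[i] < pos_b[j])
        total += abs(sign_a - sign_b)
    return total

# The example from the slide: a ranks x, y, z in that order; b ties y and z above x.
a = [["x"], ["y"], ["z"]]
b = [["y", "z"], ["x"]]
print(ks_distance(a, b))
```

Encoding each pair as a sign keeps the three cases in one expression: opposite orders differ by 2, a strict order against a tie differs by 1, and agreeing pairs differ by 0.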

Consensus Rankings • Kemeny-Snell median: Given rankings a1, a2, …, ap, find a ranking x so that d(a1,x) + d(a2,x) + … + d(ap,x) is minimized. • x can be a ranking other than a1, a2, …, ap. • Sometimes just called Kemeny median.

Consensus Rankings
a1        a2        a3
Fish      Fish      Chicken
Chicken   Chicken   Fish
Beef      Beef      Beef
• Median = a1. • If x = a1: d(a1,x) + d(a2,x) + d(a3,x) = 0 + 0 + 2 = 2, which is the minimum. • If x = a3, the sum is 4. • For any other x, the sum is at least 1 + 1 + 1 = 3.

Consensus Rankings
a1        a2        a3
Fish      Chicken   Beef
Chicken   Beef      Fish
Beef      Fish      Chicken
• Three medians: a1, a2, a3. • This is the “voter’s paradox” situation.

Consensus Rankings
a1        a2        a3
Fish      Chicken   Beef
Chicken   Beef      Fish
Beef      Fish      Chicken
• Note that sometimes we wish to minimize d(a1,x)^2 + d(a2,x)^2 + … + d(ap,x)^2. • A ranking x that minimizes this is called a Kemeny-Snell mean. • In this example, there is one mean: the ranking declaring all three alternatives tied.

Consensus Rankings
a1        a2        a3
Fish      Chicken   Beef
Chicken   Beef      Fish
Beef      Fish      Chicken
• If x is the ranking declaring Fish, Chicken and Beef tied, then d(a1,x)^2 + d(a2,x)^2 + d(a3,x)^2 = 3^2 + 3^2 + 3^2 = 27. • Not hard to show this is the minimum.
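For three alternatives there are only 13 rankings with ties, so the median and mean claims in the Fish/Chicken/Beef examples can be checked by exhaustive search. The sketch below is a brute-force check under an assumed representation (a ranking is a list of tiers, top first); all function names are hypothetical, and the approach is only practical for a handful of alternatives.

```python
from itertools import combinations

def tier_index(ranking):
    """Map each candidate to the index of its tier (0 = top)."""
    return {c: i for i, tier in enumerate(ranking) for c in tier}

def ks_distance(a, b):
    """Kemeny-Snell distance between two rankings over the same candidates."""
    pa, pb = tier_index(a), tier_index(b)
    total = 0
    for i, j in combinations(sorted(pa), 2):
        sign_a = (pa[i] > pa[j]) - (pa[i] < pa[j])
        sign_b = (pb[i] > pb[j]) - (pb[i] < pb[j])
        total += abs(sign_a - sign_b)
    return total

def weak_orders(items):
    """Yield every ranking with ties: pick a non-empty top tier, then rank the rest."""
    items = tuple(items)
    if not items:
        yield []
        return
    for k in range(1, len(items) + 1):
        for top in combinations(items, k):
            rest = [c for c in items if c not in top]
            for tail in weak_orders(rest):
                yield [list(top)] + tail

def best_rankings(profile, power=1):
    """Return (minimum cost, rankings attaining it), where cost(x) is the sum of
    d(a, x)**power over the profile. power=1 gives medians, power=2 gives means."""
    candidates = sorted(c for tier in profile[0] for c in tier)
    best, best_cost = [], None
    for x in weak_orders(candidates):
        cost = sum(ks_distance(a, x) ** power for a in profile)
        if best_cost is None or cost < best_cost:
            best, best_cost = [x], cost
        elif cost == best_cost:
            best.append(x)
    return best_cost, best

fcb = [["Fish"], ["Chicken"], ["Beef"]]
cfb = [["Chicken"], ["Fish"], ["Beef"]]
cbf = [["Chicken"], ["Beef"], ["Fish"]]
bfc = [["Beef"], ["Fish"], ["Chicken"]]

agreeing = [fcb, fcb, cfb]   # the profile with median a1
cyclic = [fcb, cbf, bfc]     # the "voter's paradox" profile

print(best_rankings(agreeing))
print(best_rankings(cyclic))
print(best_rankings(cyclic, power=2))
```

On the first profile the search finds the single median a1 with total distance 2. On the cyclic profile it finds the three medians a1, a2, a3 (total distance 8 each), while the only mean is the all-tied ranking with a sum of squares of 27.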

Consensus Rankings • Theorem (Bartholdi, Tovey, and Trick, 1989; Wakabayashi, 1986): Computing the Kemeny-Snell median of a set of rankings is NP-hard; the decision version (is there a ranking whose total distance is at most a given bound?) is NP-complete.
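To get a feel for why exhaustive search is not an option, the sketch below counts the candidate rankings a brute-force search would have to examine. This is only an illustration of search-space growth, not a proof of hardness: a large search space alone does not make a problem intractable. The function name `count_weak_orders` is hypothetical.

```python
from functools import lru_cache
from math import comb, factorial

@lru_cache(maxsize=None)
def count_weak_orders(n):
    """Number of rankings with ties allowed on n candidates:
    choose the k candidates in the top tier, then rank the remaining n - k."""
    if n == 0:
        return 1
    return sum(comb(n, k) * count_weak_orders(n - k) for k in range(1, n + 1))

for n in (3, 5, 10, 20):
    print(f"n={n:2d}  strict rankings={factorial(n)}  rankings with ties={count_weak_orders(n)}")
```

With 3 candidates there are 6 strict rankings and 13 rankings with ties; both counts grow faster than exponentially as candidates are added.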

Consensus Rankings • Okay, so what does this have to do with practical computer science questions?

Consensus Rankings • I mean really practical computer science questions.

Google Example • Google is a “search engine” • It searches through web pages and rank orders them. • That is, it gives us a ranking of web pages from most relevant to our query to least relevant.

Meta-search • There are other search engines besides Google. • Wouldn’t it be helpful to use several of them and combine the results? • This is meta-search. • It is a voting problem: combine page rankings from several search engines to produce one consensus ranking. • Dwork, Kumar, Naor, Sivakumar (2000): Kemeny-Snell medians are good at resisting spam in meta-search (a page “spams” if it causes the meta-search to rank it too highly). • Approximation methods make this computationally tractable.
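As one illustration of how an approximate approach can stay tractable, here is a minimal local-search sketch. It is an assumed heuristic chosen for this example, not necessarily the method of Dwork, Kumar, Naor and Sivakumar. It assumes every search engine returns a strict ranking of the same set of pages, and the function names are hypothetical. Starting from any ordering, it swaps two adjacent pages whenever a strict majority of the engines prefers the lower one; each swap reduces the total number of pairwise disagreements, so the loop terminates at a local optimum, which need not be a true Kemeny-Snell median.

```python
def adjacent_swap_search(rankings, start):
    """Improve `start` by swapping adjacent pages whenever a strict majority
    of the input rankings places the lower page above the upper one."""
    positions = [{page: i for i, page in enumerate(r)} for r in rankings]
    order = list(start)
    improved = True
    while improved:
        improved = False
        for i in range(len(order) - 1):
            upper, lower = order[i], order[i + 1]
            votes_for_swap = sum(pos[lower] < pos[upper] for pos in positions)
            if 2 * votes_for_swap > len(positions):
                order[i], order[i + 1] = lower, upper
                improved = True
    return order

def total_disagreements(rankings, order):
    """Count (ranking, pair) combinations on which `order` and the ranking disagree.
    For strict rankings the Kemeny-Snell sum is twice this number."""
    place = {page: i for i, page in enumerate(order)}
    total = 0
    for r in rankings:
        for i in range(len(r)):
            for j in range(i + 1, len(r)):
                if place[r[i]] > place[r[j]]:
                    total += 1
    return total

# Three hypothetical engines ranking four hypothetical pages.
engines = [
    ["a", "b", "c", "d"],
    ["a", "b", "c", "d"],
    ["b", "a", "c", "d"],
]
start = ["d", "c", "b", "a"]
result = adjacent_swap_search(engines, start)
print(result, total_disagreements(engines, start), total_disagreements(engines, result))
```

Each pass over the ordering costs time proportional to the number of pages times the number of engines, which is what keeps the heuristic usable when the number of pages is large.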

Collaborative Filtering • Recommending books or movies • Combine book or movie ratings by various people • This too is voting • Produce a consensus ordered list of books or movies to recommend • Freund, Iyer, Schapire, Singer (2003): “Boosting” algorithm for combining rankings. • Related topic: Recommender Systems

Meta-search and Collaborative Filtering • A major difference from the election situation • In elections, the number of voters is large, number of candidates is small. • In CS applications, number of voters (search engines) is small, number of candidates (pages) is large. • This makes for major new complications and research challenges.

Have you ever heard of v-sis? • It’s a cancer-causing gene. • Computer scientists helped discover how it works. • How did they do it? • The answer also has something to do with voting.

Large Databases and Inference • Decision makers consult massive data sets. • The study of large databases and gathering of information from them is a major topic in modern computer science. • We will give an example from the field of Bioinformatics. • This lies at the interface between Computer Science and Molecular Biology

Large Databases and Inference • Real data are often in the form of sequences • Here, concentrate on bioinformatics • GenBank has over 7 million sequences comprising 8.6 billion bases. • The search for similarity or patterns has extended from pairs of sequences to finding patterns that appear in common in a large number of sequences or throughout the database: “consensus sequences”. • Emerging field of “Bioconsensus”: applies SS consensus methods to biological databases.
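The link between consensus sequences and voting can be sketched with a toy example: treat each sequence as a voter and let each position elect its most common base. This is a deliberately simplified illustration under stated assumptions (the sequences are already aligned, have equal length, and contain no gaps); practical consensus methods are more involved. The function name and the use of "N" for a tied position are choices made for this example.

```python
from collections import Counter

def consensus_sequence(aligned):
    """Column-by-column plurality vote over equal-length, pre-aligned sequences.
    A position with no unique winner is reported as 'N' (assumed convention)."""
    if len({len(s) for s in aligned}) != 1:
        raise ValueError("need at least one sequence, all of the same length")
    consensus = []
    for column in zip(*aligned):
        counts = Counter(column)
        top = max(counts.values())
        winners = [base for base, count in counts.items() if count == top]
        consensus.append(winners[0] if len(winners) == 1 else "N")
    return "".join(consensus)

# Made-up sequences, used only to exercise the function.
print(consensus_sequence(["ACGT", "ACGA", "TCGT"]))
print(consensus_sequence(["AC", "TC"]))
```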

Large Databases and Inference • Why look for such patterns? • Similarities between sequences or parts of sequences lead to the discovery of shared phenomena. • For example, it was discovered that the sequence for platelet-derived growth factor, which causes growth in the body, is 87% identical to the sequence for v-sis, a cancer-causing gene. • This led to the discovery that v-sis works by stimulating growth.
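A figure such as “87% identical” can be read as the share of aligned positions at which two sequences carry the same symbol. The sketch below computes that share under the assumption that the two sequences are already aligned and of equal length; the strings are made up for illustration and are not the real sequences, and `percent_identity` is a hypothetical name.

```python
def percent_identity(seq_a, seq_b):
    """Percentage of positions at which two equal-length, pre-aligned sequences agree."""
    if len(seq_a) != len(seq_b) or not seq_a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(seq_a, seq_b))
    return 100.0 * matches / len(seq_a)

# Made-up strings: 7 of the 8 positions match.
print(percent_identity("ACGTACGT", "ACGTACGA"))
```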
