Voting Problems and Computer Science Applications

Fred Roberts, Rutgers University

What do Mathematics and Computer Science have to do with Voting?

Have you used Google lately? Did you know that Google has something to do with voting?

Have you tried buying a book online lately? • Did you get a message saying: "If you are interested in this book, you might want to look at the following books as well"? • Did you know that this has something to do with voting?

Have you ever heard of v-sis? • It’s a cancer-causing gene. • Computer scientists helped discover how it works. • How did they do it? • The answer also has something to do with voting.

Some connections between Computer Science and Voting are clearly visible. • Some people are working on plans to allow us to vote from home – over the Internet.

Electronic Voting • Security Risks in Electronic Voting • Could someone mount a “denial of service attack”? • That is, could someone flood your computer and those of other likely voters with so much spam that you couldn’t succeed in voting?

Electronic Voting • Security Risks in Electronic Voting • How can we prevent random loss of connectivity that would prevent you from voting? • How can your vote be kept private? • How can you be sure your vote is counted? • What will prevent you from selling your vote to someone else?

Electronic Voting • Security Risks in Electronic Voting • These are all issues in modern computer science research. • However, they are not what I want to talk about. • I want to talk about how ideas about voting systems can solve problems of computer science.

How do Elections Work? • Typically, everyone votes for their first choice candidate. • The votes are counted. • The person with the most votes wins. • Or, sometimes, if no one has more than half the votes, there is a runoff.

But do we necessarily get the best candidate that way?

Sometimes Having More Information about Voters’ Preferences is Very Helpful • Sometimes it is helpful to have voters rank order all the candidates • From their top choice to their bottom choice.

Rankings • Example candidates to be ranked: Dennis Kucinich, Bill Richardson, John Edwards, Hillary Clinton, Barack Obama • Ties are allowed

Rankings • What if we have four voters and they give us the following rankings? Who should win?
Voter 1      Voter 2      Voter 3      Voter 4
Clinton      Clinton      Obama        Obama
Richardson   Kucinich     Edwards      Richardson
Edwards      Edwards      Richardson   Kucinich
Kucinich     Richardson   Kucinich     Edwards
Obama        Obama        Clinton      Clinton

Rankings • What if we have four voters and they give us the following rankings? • There is one added candidate. • Who should win?
Voter 1      Voter 2      Voter 3      Voter 4
Clinton      Clinton      Obama        Obama
Gore         Gore         Gore         Gore
Richardson   Kucinich     Edwards      Richardson
Edwards      Edwards      Richardson   Kucinich
Kucinich     Richardson   Kucinich     Edwards
Obama        Obama        Clinton      Clinton

Rankings
Voter 1      Voter 2      Voter 3      Voter 4
Clinton      Clinton      Obama        Obama
Gore         Gore         Gore         Gore
Richardson   Kucinich     Edwards      Richardson
Edwards      Edwards      Richardson   Kucinich
Kucinich     Richardson   Kucinich     Edwards
Obama        Obama        Clinton      Clinton
• Maybe someone who is everyone’s second choice is the best choice for winner. • Point: We can learn something from ranking candidates.
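One way to see what the full rankings add is to compare a first-choice count with a positional score. The sketch below uses the Borda count, which the talk does not name; it is chosen here only as a simple illustration, and the function names `plurality` and `borda` are made up for this example. With six candidates, a first place earns 5 points, a second place 4, and so on down to 0.

```python
from collections import Counter

# The four ballots from the slide, each listed from top choice to bottom choice.
ballots = [
    ["Clinton", "Gore", "Richardson", "Edwards", "Kucinich", "Obama"],
    ["Clinton", "Gore", "Kucinich", "Edwards", "Richardson", "Obama"],
    ["Obama", "Gore", "Edwards", "Richardson", "Kucinich", "Clinton"],
    ["Obama", "Gore", "Richardson", "Kucinich", "Edwards", "Clinton"],
]

def plurality(ballots):
    """Count first choices only (the usual 'most votes wins' tally)."""
    return Counter(ballot[0] for ballot in ballots)

def borda(ballots):
    """Borda count (an assumption of this sketch, not a method from the talk):
    with n candidates, place k (0-based) on a ballot earns n - 1 - k points."""
    scores = Counter()
    for ballot in ballots:
        n = len(ballot)
        for place, candidate in enumerate(ballot):
            scores[candidate] += n - 1 - place
    return scores

print(plurality(ballots))              # Clinton and Obama tie on first choices
print(borda(ballots).most_common(3))   # Gore leads once full rankings count
```

Under these assumptions the first-choice tally is a 2-2 tie between Clinton and Obama, while Gore, who receives no first-place votes, has the highest Borda score. Other scoring rules could give different answers.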

Consensus Rankings • How should we reach a decision in an election if every voter ranks the candidates? • What decision do we want? • A winner • A ranking of all the candidates that is in some sense a consensus ranking • This would be useful in some applications • Job candidates are ranked by each interviewer • Consensus ranking of candidates • Make offers in order of ranking • How do we find a consensus ranking?

Consensus Rankings • These two rankings are very close:
Clinton      Obama
Obama        Clinton
Edwards      Edwards
Kucinich     Kucinich
Richardson   Richardson

Consensus Rankings • These two rankings are very far apart:
Clinton      Obama
Richardson   Kucinich
Edwards      Edwards
Kucinich     Richardson
Obama        Clinton

Consensus Rankings • This suggests we may be able to make precise how far apart two rankings are. • How do we measure the distance between two rankings?

Consensus Rankings • Kemeny-Snell distance between rankings: twice the number of pairs of candidates i and j for which i is ranked above j in one ranking and below j in the other, plus the number of pairs that are strictly ranked in one ranking and tied in the other. • Example: ranking a puts x first, y second, z third; ranking b has y and z tied at the top, with x below them.
a     b
x     y-z
y     x
z
• On {x,y}: +2 • On {x,z}: +2 • On {y,z}: +1 • d(a,b) = 5.
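The distance defined above can be sketched in a few lines of Python. The representation is an assumption of this sketch: a ranking is a list of tiers from top to bottom, and candidates in the same tier are tied. The names `tier_of`, `sign` and `ks_distance` are illustrative only.

```python
from itertools import combinations

def tier_of(ranking):
    """Map each candidate to the index of its tier (0 = top tier)."""
    return {c: i for i, tier in enumerate(ranking) for c in tier}

def sign(v):
    """Return -1, 0 or +1 according to the sign of v."""
    return (v > 0) - (v < 0)

def ks_distance(a, b):
    """Kemeny-Snell distance between two rankings of the same candidates.

    For each pair the comparison is coded as -1 (i above j), 0 (tied) or
    +1 (i below j). The absolute difference of the two codes is 2 when the
    rankings disagree on the order and 1 when exactly one of them has a tie.
    """
    pa, pb = tier_of(a), tier_of(b)
    return sum(
        abs(sign(pa[i] - pa[j]) - sign(pb[i] - pb[j]))
        for i, j in combinations(sorted(pa), 2)
    )

a = [["x"], ["y"], ["z"]]      # x first, y second, z third
b = [["y", "z"], ["x"]]        # y and z tied at the top, x below
print(ks_distance(a, b))       # 5, matching the slide
```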

Consensus Rankings • One well-known consensus method: “Kemeny-Snell medians”: given a set of rankings, find a ranking minimizing the sum of distances to the given rankings. • Kemeny-Snell medians are having surprising new applications in CS. • John Kemeny was a pioneer in time sharing in CS.

Consensus Rankings • Kemeny-Snell median: Given rankings a_1, a_2, …, a_p, find a ranking x so that d(a_1,x) + d(a_2,x) + … + d(a_p,x) is as small as possible. • x can be a ranking other than a_1, a_2, …, a_p. • Sometimes just called Kemeny median.

Consensus Rankings
a_1       a_2       a_3
Fish      Fish      Chicken
Chicken   Chicken   Fish
Beef      Beef      Beef
• Median = a_1. • If x = a_1: d(a_1,x) + d(a_2,x) + d(a_3,x) = 0 + 0 + 2 = 2, which is the minimum. • If x = a_3, the sum is 4. • For any other x, the sum is at least 1 + 1 + 1 = 3.

Consensus Rankings
a_1       a_2       a_3
Fish      Chicken   Beef
Chicken   Beef      Fish
Beef      Fish      Chicken
• Three medians: a_1, a_2, a_3. • This is the “voter’s paradox” situation.

Consensus Rankings
a_1       a_2       a_3
Fish      Chicken   Beef
Chicken   Beef      Fish
Beef      Fish      Chicken
• Note that sometimes we wish to minimize d(a_1,x)^2 + d(a_2,x)^2 + … + d(a_p,x)^2. • A ranking x that minimizes this is called a Kemeny-Snell mean. • In this example, there is one mean: the ranking declaring all three alternatives tied.

Consensus Rankings
a_1       a_2       a_3
Fish      Chicken   Beef
Chicken   Beef      Fish
Beef      Fish      Chicken
• If x is the ranking declaring Fish, Chicken and Beef tied, then d(a_1,x)^2 + d(a_2,x)^2 + d(a_3,x)^2 = 3^2 + 3^2 + 3^2 = 27. • Not hard to show this is minimum.
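For three alternatives there are only 13 rankings with ties, so the median and mean examples above can be checked by exhaustive search. The following is a brute-force sketch under the same assumed representation (a ranking is a list of tiers, top first); `weak_orders` and `consensus` are hypothetical helper names. Exhaustive search is only feasible for a handful of candidates, since the number of rankings grows very quickly.

```python
from itertools import combinations

def tier_of(ranking):
    """Map each candidate to the index of its tier (0 = top tier)."""
    return {c: i for i, tier in enumerate(ranking) for c in tier}

def ks_distance(a, b):
    """Kemeny-Snell distance: 2 per reversed pair, 1 per strict-vs-tied pair."""
    pa, pb = tier_of(a), tier_of(b)
    sign = lambda v: (v > 0) - (v < 0)
    return sum(abs(sign(pa[i] - pa[j]) - sign(pb[i] - pb[j]))
               for i, j in combinations(sorted(pa), 2))

def weak_orders(items):
    """Yield every ranking with ties: pick a nonempty top tier, rank the rest."""
    items = list(items)
    if not items:
        yield []
        return
    for k in range(1, len(items) + 1):
        for top in combinations(items, k):
            rest = [c for c in items if c not in top]
            for tail in weak_orders(rest):
                yield [list(top)] + tail

def consensus(profile, power=1):
    """Return (best score, rankings achieving it).

    power=1 gives Kemeny-Snell medians, power=2 gives Kemeny-Snell means.
    """
    candidates = sorted(tier_of(profile[0]))
    scored = [(sum(ks_distance(a, x) ** power for a in profile), x)
              for x in weak_orders(candidates)]
    best = min(score for score, _ in scored)
    return best, [x for score, x in scored if score == best]

agree = [[["Fish"], ["Chicken"], ["Beef"]],
         [["Fish"], ["Chicken"], ["Beef"]],
         [["Chicken"], ["Fish"], ["Beef"]]]
cyclic = [[["Fish"], ["Chicken"], ["Beef"]],
          [["Chicken"], ["Beef"], ["Fish"]],
          [["Beef"], ["Fish"], ["Chicken"]]]

print(consensus(agree))             # one median, a_1, with sum 2
print(consensus(cyclic))            # three medians, each with sum 8
print(consensus(cyclic, power=2))   # one mean, the all-tied ranking, with sum 27
```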

Consensus Rankings • Theorem (Bartholdi, Tovey, and Trick, 1989; Wakabayashi, 1986): Computing the Kemeny-Snell median of a set of rankings is an NP-complete problem.

Consensus Rankings • Okay, so what does this have to do with practical computer science questions?

Consensus Rankings • I mean really practical computer science questions.

Google Example • Google is a “search engine” • It searches through web pages and rank orders them. • That is, it gives us a ranking of web pages from most relevant to our query to least relevant.

Meta-search • There are other search engines besides Google. • Wouldn’t it be helpful to use several of them and combine the results? • This is meta-search. • It is a voting problem. • Combine page rankings from several search engines to produce one consensus ranking. • Dwork, Kumar, Naor, Sivakumar (2000): Kemeny-Snell medians are good at resisting spam in meta-search (a page spams if it causes the meta-search to rank it too highly). • Approximation methods make this computationally tractable.

Collaborative Filtering • Recommending books or movies • Combine book or movie ratings by various people • This too is voting • Produce a consensus ordered list of books or movies to recommend • Freund, Iyer, Schapire, Singer (2003): “Boosting” algorithm for combining rankings. • Related topic: Recommender Systems

Meta-search and Collaborative Filtering • A major difference from the election situation • In elections, the number of voters is large, number of candidates is small. • In CS applications, number of voters (search engines) is small, number of candidates (pages) is large. • This makes for major new complications and research challenges.

Have you ever heard of v-sis? • It’s a cancer-causing gene. • Computer scientists helped discover how it works. • How did they do it? • The answer also has something to do with voting.

Large Databases and Inference • Decision makers consult massive data sets. • The study of large databases and gathering of information from them is a major topic in modern computer science. • We will give an example from the field of Bioinformatics. • This lies at the interface between Computer Science and Molecular Biology

Large Databases and Inference • Real biological data often in form of sequences. • GenBank has over 7 million sequences comprising 8.6 billion “bases.” • The search for similarity or patterns has extended from pairs of sequences to finding patterns that appear in common in a large number of sequences or throughout the database: “consensus sequences” • Emerging field of “Bioconsensus”: applies consensus methods to biological databases.

Large Databases and Inference • Why look for such patterns? • Similarities between sequences or parts of sequences lead to the discovery of shared phenomena. • For example, it was discovered that the sequence for platelet-derived growth factor, which causes growth in the body, is 87% identical to the sequence for v-sis, the cancer-causing gene. • This led to the discovery that v-sis works by stimulating growth.

Large Databases and Inference • DNA Sequences • A DNA sequence is a sequence of “bases”: A = Adenine, G = Guanine, C = Cytosine, T = Thymine • Example: ACTCCCTATAATGCGCCA

Large Databases and Inference • Example: bacterial promoter sequences studied by Waterman (1989):
RRNABP1: ACTCCCTATAATGCGCCA
TNAA:    GAGTGTAATAATGTAGCC
UVRBP2:  TTATCCAGTATAATTTGT
SFC:     AAGCGGTGTTATAATGCC
• Notice that if we are looking for patterns of length 4, each sequence has the pattern TAAT.

Large Databases and Inference • Example • However, suppose that we add another sequence:
M1 RNA:  AACCCTCTATACTGCGCG
• The pattern TAAT does not appear here. • However, it almost appears, since the pattern TACT appears, and this has only one mismatch from the pattern TAAT.
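The idea of a pattern that "almost appears" can be made concrete by sliding the pattern along each sequence and counting mismatches (the Hamming distance) at every position. The sketch below is a minimal illustration using the five sequences quoted above; the helper names `hamming` and `best_match` are invented for this example and are not taken from Waterman's work.

```python
def hamming(s, t):
    """Number of positions at which two equal-length strings differ."""
    return sum(c1 != c2 for c1, c2 in zip(s, t))

def best_match(seq, pattern):
    """Return (mismatches, start index, window) for the closest window.

    Ties are broken in favour of the leftmost window.
    """
    k = len(pattern)
    return min(
        (hamming(seq[i:i + k], pattern), i, seq[i:i + k])
        for i in range(len(seq) - k + 1)
    )

sequences = {
    "RRNABP1": "ACTCCCTATAATGCGCCA",
    "TNAA": "GAGTGTAATAATGTAGCC",
    "UVRBP2": "TTATCCAGTATAATTTGT",
    "SFC": "AAGCGGTGTTATAATGCC",
    "M1 RNA": "AACCCTCTATACTGCGCG",
}

for name, seq in sequences.items():
    mismatches, start, window = best_match(seq, "TAAT")
    print(f"{name}: closest window {window} at index {start}, {mismatches} mismatch(es)")
```

The first four sequences contain TAAT exactly (zero mismatches), while the closest window in M1 RNA is TACT, with one mismatch.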
