Sorting and Selection with Imprecise Comparisons

Sorting and Selection with Imprecise Comparisons • Miklós Ajtai, Vitaly Feldman, Avinatan Hassidim, Jelani Nelson • Presented by Dan Garber

Introduction • Sorting and max-finding algorithms are based on performing comparisons between pairs of elements. • Given two elements to compare, we assume that we can tell which is of greater value. • In some scenarios this assumption does not hold, because we don’t always know the “real” values of the elements.

Introduction • Which car do we prefer to buy?

Introduction • How to decide which sports team is champion? • Is team A better than team B? • Is team D better than team B? • [Figure: diagram of teams A, B, C, D]

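Model • We are given a group of n elements; each of them is associated with an unknown “true” numerical value. • Given two elements x, y: if |val(x) − val(y)| > 1, the comparison reports the truly larger element; otherwise the outcome may be arbitrary. • Corollary: it is not possible to find the exact maximum or the correct permutation.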

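Model • The error of a max-finding algorithm A which outputs x is k if val(x*) − val(x) ≤ k, where x* is the true maximum. • The error of a sorting algorithm A which outputs a permutation π is k if every pair of elements whose values differ by more than k appears in the correct order in π.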
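The model can be made concrete with a small simulation. The following is a minimal sketch, assuming the comparison threshold is normalized to 1 (as the proofs later in the deck suggest) and that the comparator answers consistently when asked about the same pair twice. The names `make_comparator`, `max_error` and `sort_error` are hypothetical helpers, not part of the presentation.

```python
import random

def make_comparator(val, rng):
    """Return beats(x, y): True if the comparator declares x greater than y.

    Assumption: pairs whose values differ by more than 1 are judged
    correctly; closer pairs get an arbitrary (here random) but
    consistent answer.
    """
    cache = {}

    def beats(x, y):
        if abs(val[x] - val[y]) > 1:
            return val[x] > val[y]
        key = (min(x, y), max(x, y))
        if key not in cache:
            cache[key] = rng.choice(key)  # winner of a close pair is arbitrary
        return cache[key] == x

    return beats

def max_error(val, x):
    """Error of reporting x as the maximum: val(x*) - val(x)."""
    return max(val.values()) - val[x]

def sort_error(val, order):
    """Error of an ascending order: the largest amount by which an
    earlier element exceeds a later one (0 if the order is exact)."""
    worst = 0
    for i, a in enumerate(order):
        for b in order[i + 1:]:
            worst = max(worst, val[a] - val[b])
    return worst
```

For instance, with `val = {"a": 0, "b": 1, "c": 5}` the comparator always reports that `c` beats `a`, while the outcome of `a` against `b` depends on the random choice.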

Goal • Explore the tradeoff between the error of an algorithm and the number of comparisons. • An algorithm that performs all possible comparisons “knows everything” and can minimize the error. • Can we find algorithms that achieve the same error bound with fewer comparisons?

Motivation • Experimental Psychology & Sociology • Ranking of elements by human subjects. • Marketing Research • Information Retrieval • Training algorithms using human evaluators. • Designing Sports tournaments • Minimize error while reducing the number of games required.

Examples • Max finding:
    max = array[0];
    for (i = 1; i < 10; i++)
        if (array[i] > max)
            max = array[i];
• Sorting (bubble sort):
    do {
        swapped = false;
        for (i = 0; i < 9; i++)
            if (array[i] > array[i+1]) {
                swap(array[i], array[i+1]);
                swapped = true;
            }
    } while (swapped);

Agenda • Lower error bounds • Maximum finding • Error 2 algorithm • Error k algorithm • Sorting • Sorting with error 2 • Selection with error k • Sorting with error k • Lower bounds

Lower error bounds • Theorem 1. Sorting according to the number of wins in a round-robin (RR) tournament yields error 2. • Proof. • Let x, y be such that val(y) + 2 < val(x). • For any z: y defeats z ⇒ x defeats z. • x defeats y. • Hence x has strictly more wins than y and is ranked above it.
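Theorem 1 can be tried out directly. The sketch below sorts by number of wins in a round-robin tournament. It assumes a threshold of 1 and uses a hypothetical worst-case-style comparator, `adversarial_beats`, that always gives the wrong answer for close pairs; neither name comes from the presentation.

```python
from itertools import combinations

def adversarial_beats(val):
    """Comparator that is correct for differences above 1 and
    deliberately wrong (but consistent) for closer pairs."""
    def beats(x, y):
        if abs(val[x] - val[y]) > 1:
            return val[x] > val[y]
        # Close pair: declare the smaller value the winner; break exact ties by id.
        return val[x] < val[y] or (val[x] == val[y] and x < y)
    return beats

def round_robin_sort(items, beats):
    """Play every pair once and return the items in ascending order of wins."""
    wins = {x: 0 for x in items}
    for x, y in combinations(items, 2):
        winner = x if beats(x, y) else y
        wins[winner] += 1
    return sorted(items, key=lambda x: wins[x])
```

Even against this comparator, any two elements whose values differ by more than 2 come out in the right order, at the cost of n(n−1)/2 comparisons.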

Lower error bounds • Theorem 2. No deterministic max-finding algorithm has error less than 2. • Proof. • Assume three elements: a, b, c. • The comparator can claim: a > b > c > a. • W.l.o.g. assume the algorithm outputs a as max. • The values of a, b, c could be 0, 1, 2, so the output a is 2 below the true maximum c.
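The three-element argument can be checked by brute force. This is a small sketch, assuming a threshold of 1: it enumerates every assignment of the values 0, 1, 2 that is compatible with the cyclic answers and reports the worst error for each possible output. The names `CYCLE`, `is_legal` and `worst_case_error` are illustrative only.

```python
from itertools import permutations

# Outcomes announced by the comparator, as (winner, loser): a > b > c > a.
CYCLE = [("a", "b"), ("b", "c"), ("c", "a")]

def is_legal(val):
    """An announced outcome is only impossible if the loser
    exceeds the winner by more than 1."""
    return all(val[loser] - val[winner] <= 1 for winner, loser in CYCLE)

def worst_case_error(output):
    """Largest error of `output` over all legal assignments of 0, 1, 2."""
    errors = []
    for values in permutations((0, 1, 2)):
        val = dict(zip("abc", values))
        if is_legal(val):
            errors.append(max(values) - val[output])
    return max(errors)

print({x: worst_case_error(x) for x in "abc"})  # {'a': 2, 'b': 2, 'c': 2}
```

Whichever element the algorithm outputs, some legal assignment puts it 2 below the maximum, which is why the "w.l.o.g." step in the proof is justified.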

Max finding with error 2 • Algorithm A₂(s): • 1. Label all elements as candidates. • 2. While there are more than s candidate elements: • (a) Pick an arbitrary set of s candidate elements and play them in a RR tournament. Let x have the most wins. • (b) Compare x against all candidate elements and eliminate all elements that lose to x. • 3. Play the final (at most s) candidate elements in a RR tournament and return the element with the most wins.
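The steps of A₂(s) can be sketched as follows. This is an illustrative reading of the slide, not the authors' code: `beats(x, y)` is an assumed callable that returns True when the comparator declares x greater than y, the "arbitrary" group is simply the first s candidates, and s must be at least 2 so that every round eliminates something.

```python
import math
from itertools import combinations

def most_wins(group, beats):
    """Round-robin among `group`; return an element with the most wins."""
    wins = {x: 0 for x in group}
    for x, y in combinations(group, 2):
        wins[x if beats(x, y) else y] += 1
    return max(group, key=lambda x: wins[x])

def a2(items, beats, s=None):
    """Sketch of A2(s): repeatedly pick a local champion and prune."""
    candidates = list(items)
    if s is None:
        s = max(2, math.isqrt(len(candidates)))  # s = sqrt(n), as in Lemma 1
    if s < 2:
        raise ValueError("s must be at least 2")
    while len(candidates) > s:
        x = most_wins(candidates[:s], beats)  # tournament among s candidates
        # Keep x and everything that beats x; eliminate whatever loses to x.
        candidates = [y for y in candidates if y == x or beats(y, x)]
    return most_wins(candidates, beats)  # final tournament

print(a2(range(20), lambda x, y: x > y))  # 19
```

With an exact comparator, as in the last line, the true maximum is returned; with an imprecise one the result can be up to 2 below it.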

Max finding with error 2 • Lemma 1. A₂ has error 2 and makes at most ns+(n^2)/s comparisons. With s=sqrt(n) we get at most 2n^(3/2) comparisons. • Proof. • If x* is never eliminated ⇒ x* participates in Step 3, and Theorem 1 ensures the error. • If x* was eliminated, it was by an element x s.t. val(x*) − val(x) ≤ 1, so any element with value less than val(x*) − 2 was also eliminated in this iteration. • Comparisons bound: in each iteration at least (s-1)/2 elements are eliminated.
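The choice s = sqrt(n) in Lemma 1 balances the two terms of the bound. As a short worked calculation (plain algebra on the stated bound, nothing beyond it):

```latex
\[
  \frac{d}{ds}\left(ns + \frac{n^2}{s}\right) = n - \frac{n^2}{s^2} = 0
  \quad\Longrightarrow\quad s = \sqrt{n},
\]
\[
  ns + \frac{n^2}{s}\,\Big|_{s=\sqrt{n}}
  = n \cdot n^{1/2} + \frac{n^2}{n^{1/2}}
  = n^{3/2} + n^{3/2}
  = 2n^{3/2}.
\]
```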

Max finding with error k • A k-max-set is a set of elements that contains an element x such that val(x*) − val(x) ≤ k. • Lemma 2. The following algorithm performs a RR tournament and outputs a 1-max-set of size at most log(n). • After performing the RR tournament, the algorithm repeatedly and greedily picks an element that defeats as many thus-far undefeated elements as possible.
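The greedy step of Lemma 2 can be sketched as below. This is one possible reading, with assumptions stated up front: each pick is taken from the elements not yet picked or defeated, `beats(x, y)` is an assumed comparator callable, and the function name `greedy_one_max_set` is hypothetical.

```python
from itertools import combinations

def greedy_one_max_set(items, beats):
    """Round-robin, then greedily cover all elements by a few 'defeaters'."""
    items = list(items)
    beaten = {x: set() for x in items}  # beaten[x] = elements that x defeats
    for x, y in combinations(items, 2):
        if beats(x, y):
            beaten[x].add(y)
        else:
            beaten[y].add(x)

    remaining = items  # neither chosen nor defeated by a chosen element
    chosen = []
    while remaining:
        rem = set(remaining)
        # Pick the element that defeats the most still-undefeated elements.
        x = max(remaining, key=lambda c: len(beaten[c] & rem))
        chosen.append(x)
        remaining = [y for y in remaining if y != x and y not in beaten[x]]
    return chosen

print(greedy_one_max_set(range(8), lambda x, y: x > y))  # [7]
```

Every element, including the true maximum, ends up either chosen or defeated by a chosen element, which is what makes the output a 1-max-set. Each pick defeats at least half of the remaining elements, so the set stays logarithmic in size.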

Max finding with error k • Algorithm 1-Cover: • Run A₂(s) with s=sqrt(n)/8. • Return the union of the x that were chosen in any iteration of Step 2(a), in addition to the output of Lemma 2 on the elements in the final tournament in Step 3. • Lemma 3. 1-Cover finds a 1-max-set of size at most sqrt(n)/4 using O(n^(3/2)) comparisons.

Max finding with error k • Algorithm - Returns a k-max for k ≥ 3 • return • Algorithm - Returns a (k-1)-max-set of size for k ≥ 2 • if k=2 return • else • Equipartition the n elements into t(n,k) sets. • Recursively call on each set to recover a (k-2)-max-set. • Return the output of 1-Cover with as input.

Max finding with error k • [Figure: recursion levels k=5, k=4, k=3, k=2]

Max finding with error k • Theorem 3. For every 3 ≤ k ≤ log log n, the algorithm finds a k-max element using comparisons. • Corollary. There exists a max-finding algorithm using O(n) comparisons with error log log n.

Sorting with error 2 • Lemma 5. In a RR tournament on n elements, the element with the median number of wins has at least (n-2)/4 wins and at least (n-2)/4 losses.

Sorting with error 2 • Algorithm B₂: • Modify A₂ so that the x found in Step 2(a) is a pivot in the sense of Lemma 5. • Compare x against all elements and pivot into two sets. • Recursively sort each of the two sets. • Lemma 6. Algorithm B₂ sorts with error 2 and requires O(n^(3/2)) comparisons. • Error bound - trivial

Sorting with error 2 • Analysis: • Every recursive call contains at least elements ⇒ at most iterations. • In each iteration at most comparisons to find a median. • Pivoting in each iteration: at most n comparisons. • Sorting the base step: at most comparisons.
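The pivoting scheme of B₂ can be sketched as a quicksort whose pivot is the median-wins element of a small round-robin. This is a hedged illustration rather than the exact algorithm: it assumes a fixed sample size s (about sqrt(n) of the top-level input), takes the first s elements as the sample, uses an assumed comparator callable `beats(x, y)`, and does not try to reproduce the comparison count from the analysis. Items must be distinct and hashable.

```python
import math
from itertools import combinations

def rr_wins(group, beats):
    """Number of round-robin wins for each element of `group`."""
    wins = {x: 0 for x in group}
    for x, y in combinations(group, 2):
        wins[x if beats(x, y) else y] += 1
    return wins

def b2_sort(items, beats, s=None):
    """Sketch of B2: ascending sort by median-wins pivots."""
    items = list(items)
    if s is None:
        s = max(3, math.isqrt(len(items)))
    return _b2(items, beats, s)

def _b2(items, beats, s):
    if len(items) <= s:
        wins = rr_wins(items, beats)
        return sorted(items, key=wins.get)  # base case: sort by wins (Theorem 1)
    sample = items[:s]
    wins = rr_wins(sample, beats)
    pivot = sorted(sample, key=wins.get)[len(sample) // 2]  # median number of wins
    lower, upper = [], []
    for y in items:
        if y != pivot:
            (lower if beats(pivot, y) else upper).append(y)
    return _b2(lower, beats, s) + [pivot] + _b2(upper, beats, s)

print(b2_sort([3, 1, 2, 0], lambda x, y: x > y))  # [0, 1, 2, 3]
```

The error bound follows from the pivot step: an element that loses to the pivot is at most 1 above it, and an element that beats the pivot is at most 1 below it, so two elements on opposite sides are never out of order by more than 2.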

Selection with error k • Definition 1. An element in a set of n elements is of k-order i if there exists a partition S₁, S₂ of [n] such that: • A k-median is an element of k-order floor(n/2).

Selection with error k • Lemma 7. There exists a deterministic algorithm such that for any i in [n] and 2 ≤ k ≤ log log n, the algorithm finds an element of k-order i in comparisons.

Selection with error k • Algorithm - Returns an element of k-order i • If k ≤ 3, sort using B₂ and return the element with index i. • Equipartition the elements into sets • Recursively call on each set to get a (k-2)-median • Play the resulting medians in a RR tournament and let y be the element with the median number of wins. • Partition the elements according to y into X₁, X₂ • If |X₁| = i-1 return y. • Else if i ≤ |X₁| recursively find an element of k-order i in X₁. • Else recursively find an element of k-order (i - |X₁| - 1) in X₂.

Sorting with error k • Theorem 5. For any 2 ≤ k ≤ 2 log log n, there exists a deterministic sorting algorithm with error k using comparisons. • Algorithm: • Find an element x that is a k-median. • Equipartition the elements into sets S₁, S₂ such that every element in S₂ is k-greater than every element in S₁∪{x}. • Recursively sort each partition.

Lower Bounds • Theorem 6. Every deterministic max-finding algorithm with error k requires comparisons. • We saw an algorithm with • Theorem 7. Every deterministic algorithm which k-sorts n elements requires comparisons. • We saw an algorithm with
