IOI 2005 Training Dr Kan Min-Yen
Topics and outline • Sorting • Computer Arithmetic and Algebra • Invariants and Number Theory
Sorting • Problems and Parameters • Comparison-based sorting • Non-comparison-based
What sort of problems? Sorting is usually not the end goal, but a prerequisite • (Efficient) searching! • Uniqueness / Duplicates • Prioritizing (c.f. priority queues) • Median / Selection • Frequency Counting • Set operations • Target Pair
Two properties of sorting • Stable • Items with the same value are ordered in the same way after the sort as they were before • Important for doing multiple stage sorts • E.g., sorting by First name, Last name • In-place: sorts the items without needing extra space • i.e., no extra space proportional to the size of the input n
Comparison-based sort • Based on comparing two items • Many variants, but not the only way to sort • Discuss only the important ones for programming contests Selection, Insertion, Merge and Quick
Comparison Sorts Comparison Sort Animation: http://math.hws.edu/TMCM/java/xSortLab/ • Selection • Algo: find min or max of unsorted portion • Heapsort: is selection sort with heap data structure • Remember: Minimize amount of swaps • Insertion • Algo: insert unsorted item in sorted array • Remember: Minimize amount of data
Comparison Sorts • Merge • Idea: divide and conquer, recursion • Algo: merge two sorted arrays in linear time • Remember: Don’t recurse to the base case if using this sort, not in-place • Quick • Idea: randomization, pivot • Algo: divide problem to smaller and larger half based on a pivot • Remember: Partition can be used to solve problems too! A problem with sorting as its core may not be best solved with generic tool. Think before using a panacea like Quicksort.
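As a rough illustration of "Partition can be used to solve problems too", here is a minimal selection sketch (often called quickselect) that finds the k-th smallest item without fully sorting. The function name and the three-way partition are choices made for this example, not something the slides prescribe.

```python
import random


def quickselect(items, k):
    """Return the k-th smallest element of items (k = 0 gives the minimum).

    Assumes 0 <= k < len(items). Works on a copy, so the input is untouched.
    """
    a = list(items)
    lo, hi = 0, len(a) - 1
    while True:
        pivot = a[random.randint(lo, hi)]
        # Three-way partition of a[lo..hi]: < pivot | == pivot | > pivot
        lt, i, gt = lo, lo, hi
        while i <= gt:
            if a[i] < pivot:
                a[lt], a[i] = a[i], a[lt]
                lt += 1
                i += 1
            elif a[i] > pivot:
                a[i], a[gt] = a[gt], a[i]
                gt -= 1
            else:
                i += 1
        if k < lt:
            hi = lt - 1      # answer lies in the "smaller" part
        elif k > gt:
            lo = gt + 1      # answer lies in the "larger" part
        else:
            return pivot     # k falls inside the block equal to the pivot
```

Usage: `quickselect(data, len(data) // 2)` gives a median in expected linear time, which is the kind of task where a full sort is more work than needed.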
Miscellaneous sort Most sorting relies on comparisons between two items. • Proven to need Ω(n log n) comparisons in the worst case Example: sort an array of distinct integers ranged 1-k We’ll go over • Radix sort • Counting sort
What is radix? • Radix is the same as base. In decimal system, radix = 10. • For example, the following is a decimal number with 4 digits 1st Digit 2nd Digit 3rd Digit 4th Digit
Radix Sort • Suppose we are given n d-digit decimal integers A[0..n-1], radix sort tries to do the following: for j = d to 1 { By ignoring digit-1 up to digit-(j-1), form the sorted array of the numbers }
Radix Sort (Example) Sorted Array if we only consider digit-4 Original Group using4th digit Ungroup
Radix Sort (Example) Sorted Array if we only consider digits-3 and 4 Original Group using3rd digit Ungroup
Radix Sort (Example) Sorted Array if we only consider digit-2, 3, and 4 Original Group using2nd digit Ungroup
Radix Sort (Example) Done! Original Group using1st digit Ungroup
Details on Radix Sort • A stable sorting algorithm • Done from least significant to most significant • Can be used with a higher base for better efficiency • Decide whether it’s really worth it • Works for integers, but not real, floating point • But see: http://codercorner.com/RadixSortRevisited.htm The combination of 1 and 2 can be used for combining sorts in general
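The group/ungroup passes above can be sketched as follows. This is a minimal least-significant-digit radix sort for non-negative integers; the function name and the `base` parameter are assumptions of this example.

```python
def radix_sort(numbers, base=10):
    """Sort non-negative integers, least significant digit first."""
    result = list(numbers)
    if not result:
        return result
    largest = max(result)
    place = 1  # 1, base, base^2, ... selects the current digit
    while place <= largest:
        # Group: one bucket per digit value; appending keeps each pass stable
        buckets = [[] for _ in range(base)]
        for x in result:
            buckets[(x // place) % base].append(x)
        # Ungroup: concatenate the buckets in digit order
        result = [x for bucket in buckets for x in bucket]
        place *= base
    return result
```

Each pass is stable, which is why the order established by the lower digits survives the later passes. A larger `base` means fewer passes but more buckets.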
Counting Sort • Works by counting the occurrences of each data value. • Assumes that there are n data items in the range of 1..k • The algorithm can then determine, for each input element, the number of elements less than it. • For example if there are 9 elements less than element x, then x belongs in the 10th data position. • These notes are from Cardiff’s Christine Mumford: http://www.cs.cf.ac.uk/user/C.L.Mumford/
Counting Sort
The first for loop initialises C[] to zero. The second for loop increments the values in C[] according to their frequencies in the data. The third for loop adds all previous values, making C[] contain a cumulative total. The fourth for loop writes out the sorted data into array B[]; it runs from the last item back to the first so that items with equal values keep their original order (stability).
countingsort(A[], B[], k)
    for i = 1 to k do
        C[i] = 0
    for j = 1 to length(A) do
        C[A[j]] = C[A[j]] + 1
    for i = 2 to k do
        C[i] = C[i] + C[i-1]
    for j = length(A) downto 1 do
        B[C[A[j]]] = A[j]
        C[A[j]] = C[A[j]] - 1
Counting Sort • Demo from Cardiff: http://www.cs.cf.ac.uk/user/C.L.Mumford/tristan/CountingSort.html
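A runnable sketch of counting sort, assuming integer keys in 1..k. The `key` parameter is an addition for this example so that stability can be seen on records; the last loop walks the input from the end so that equal keys keep their original order.

```python
def counting_sort(items, k, key=lambda item: item):
    """Stable sort of items whose key(item) is an integer in 1..k."""
    count = [0] * (k + 1)            # count[v] = occurrences of key v
    for item in items:
        count[key(item)] += 1
    for v in range(1, k + 1):        # cumulative totals: items with key <= v
        count[v] += count[v - 1]
    output = [None] * len(items)
    for item in reversed(items):     # backwards pass preserves stability
        v = key(item)
        count[v] -= 1
        output[count[v]] = item
    return output
```

Usage: `counting_sort(records, 5, key=lambda r: r[0])` sorts pairs by their first field while leaving ties in input order.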
Counting and radix sort • What’s their complexity? • Radix sort: O(dn) = O(n), if d << n • Counting sort: 2k + 2n = O(n), if k << n • Why do they work so fast?? • No comparisons are made • Both are stable sorts, but not in-place • Can you fix them to be in-place? • When to use? • When you have lots of items in a fixed range
Quiz and discussion There are no right answers… • Which sort is better used for a very large random array? • For sorting a deck of cards? • For sorting a list of first name, last names? • For sorting single English words? • For sorting an almost sorted set?
Computer Arithmetic and Algebra • Big Numbers • Arbitrary precision arithmetic • Multiprecision arithmetic • Computer Algebra • Dealing with algebraic expressions a la Maple, Mathematica
Arithmetic • Want to do standard arithmetic operations on very large/small numbers • Can’t fit representation of numbers in standard data types • What to do? • Sassy answer: use Java and its BigInteger class
Two representations • How to represent: 10 00000 00000 02003 • Linked list: 1e16 2e3 3e0 • Sparse representation • Good for numbers with different or arbitrary widths • Where should the head of the linked list point to? • Array: <as above> • Dense representation • Good for problems in the general case • Don’t forget to store the sign bit somewhere • Which base to choose: 10 or 32? • How to represent arbitrarily large real numbers?
Standard operations Most large number operations rely on techniques that are the same as primary school techniques • Adding • Subtracting • Multiplying • Dividing • Exponentiation / Logarithms
Algorithms for big numbers • Addition of bigints x and y • Complicated part is to deal with the carry • What about adding lots of bigints together? • Solution: do all the adds first then worry about the carries
Canonicalization and adding • Canonicalizing • 12E2 + 34E0 => 1E3 + 2E2 + 3E1 + 4E0 • Strip any coefficient of B or more down to a single digit and carry the excess to the next value • Reorder sparse representation if necessary • Adding • Iteratively add all A_i, B_i, …, Z_i then canonicalize • Hazard: • Data type for each entry must be able to hold maximum expected data or overflow will occur
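A minimal sketch of "add first, carry later", assuming non-negative big integers stored as dense lists of coefficients with the least significant position first. The names `canonicalize` and `add_many` and the default base 10 are choices for this example.

```python
def canonicalize(coeffs, base=10):
    """Push every coefficient into 0..base-1, carrying the excess upward."""
    digits = []
    carry = 0
    for c in coeffs:
        carry, digit = divmod(c + carry, base)
        digits.append(digit)
    while carry:                       # leftover carry creates new high digits
        carry, digit = divmod(carry, base)
        digits.append(digit)
    return digits


def add_many(numbers, base=10):
    """Add position by position without carrying, then canonicalize once."""
    width = max(len(n) for n in numbers)
    sums = [0] * width
    for n in numbers:
        for i, c in enumerate(n):
            sums[i] += c
    return canonicalize(sums, base)
```

For example, 12E2 + 34E0 is the list `[34, 0, 12]`, and `canonicalize([34, 0, 12])` yields the digits of 1234, least significant first. Python integers do not overflow, so the hazard on the slide does not show up here; in C or Pascal each entry of `sums` must be wide enough for the largest possible column total.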
Algorithms for big numbers • Subtraction • Like addition but requires borrowing (reverse carry) • Place the higher absolute magnitude number at the top • Applicable to addition of mixed sign numbers • Comparison • Start from higher order bits then work backwards
Multiplying • Given two big integers X and Y in canonical form, X = Σ_i x_i B^i and Y = Σ_j y_j B^j: • the big integer Z = X*Y can be obtained thanks to the formula: Z = Σ_k z_k B^k, where z_k = Σ_{i+j=k} x_i y_j • Notes: • This is the obvious way we were taught in primary school, the complexity is Θ(N^2) • Canonicalize after each step to avoid overflow in coefficients • To make this operation valid, B^2 must fit in the coefficient
Shift Multiplication • If a number is encoded in base B • Left shift multiplies by B • Right shift divides by B • Yet another reason to use arrays vs. linked lists • If your problem requires many multiplications and divisions by a fixed number b, consider using b as the base for the number
Karatsuba Multiplication We can split two big numbers in half: X = X0 + B X1 and Y = Y0 + B Y1 Then the product XY is given by (X0 + B X1)(Y0 + B Y1) This results in three terms: X0Y0 + B (X0Y1 + X1Y0) + B^2 (X1Y1) Look at the middle term. We can get the middle term almost for free by noting that: (X0 + X1)(Y0 + Y1) = X0Y0 + X0Y1 + X1Y0 + X1Y1, so X0Y1 + X1Y0 = (X0 + X1)(Y0 + Y1) - X0Y0 - X1Y1
Karatsuba Demonstration 12 * 34 Given: X1 = 1, X0 = 2, Y1 = 3, Y0 = 4 Calculate: X0Y0 = 8, X1Y1 = 3 (X0+X1)(Y0+Y1) = 3*7 = 21 Final middle term = 21 – 8 – 3 = 10 Solution: 8 + 10 * 10^1 + 3 * 10^2 = 408 • Notes: • Recursive, complexity is about Θ(n^1.585), i.e. Θ(n^(log2 3)) • There’s a better way: FFT based multiplication not taught here that is Θ (n log n) http://numbers.computation.free.fr/Constants/Algorithms/fft.html
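The three-product trick can be sketched recursively. This example leans on Python's built-in integers and splits on decimal digits purely for readability; a contest implementation would split arrays of coefficients instead. It assumes non-negative inputs.

```python
def karatsuba(x, y):
    """Multiply non-negative integers using three recursive products."""
    if x < 10 or y < 10:
        return x * y                       # single digit: multiply directly
    half = max(len(str(x)), len(str(y))) // 2
    b = 10 ** half                         # split point, the "B" of the slide
    x1, x0 = divmod(x, b)                  # x = x0 + b * x1
    y1, y0 = divmod(y, b)                  # y = y0 + b * y1
    low = karatsuba(x0, y0)
    high = karatsuba(x1, y1)
    # Middle term from one extra product instead of two
    middle = karatsuba(x0 + x1, y0 + y1) - low - high
    return low + middle * b + high * b * b
```

With `karatsuba(12, 34)` the intermediate values are the ones in the demonstration: low = 8, high = 3, middle = 10.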
Division • Algo: long division (Skiena and Revilla, pg 109): Iterate • shifting the remainder to the left including the next digit • Subtract off instances of the divisor Demo: http://www.mathsisfun.com/long_division2.html
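A minimal sketch of the shift-and-subtract loop, assuming the dividend is a list of digits with the most significant digit first and the divisor is an ordinary positive integer. The function name and this simplification are assumptions of the example; dividing by another big number follows the same pattern with big-number comparison and subtraction.

```python
def long_divide(digits, divisor, base=10):
    """Return (quotient_digits, remainder) for digits // divisor."""
    if divisor <= 0:
        raise ValueError("divisor must be positive")
    quotient = []
    remainder = 0
    for d in digits:
        # Shift the remainder left and bring down the next digit
        remainder = remainder * base + d
        # Subtract off instances of the divisor, counting how many fit
        count = 0
        while remainder >= divisor:
            remainder -= divisor
            count += 1
        quotient.append(count)
    # Drop leading zeros, keeping at least one digit
    first = 0
    while first < len(quotient) - 1 and quotient[first] == 0:
        first += 1
    return quotient[first:], remainder
```

The inner loop runs fewer than `base` times per digit, because the remainder carried in is always smaller than the divisor.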
Exponentiation • How do you calculate x^256? • x^256 = (((((((x^2)^2)^2)^2)^2)^2)^2)^2 • What about x^255? • Store x, x^2, x^4, x^8, x^16, x^32, x^64, x^128 on the way up. • Complexity Θ(log n) multiplications
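One way to handle exponents such as 255 is to multiply together the stored squares that match the 1-bits of the exponent. A minimal sketch, assuming a non-negative integer exponent (the function name is made up for this example):

```python
def power(x, n):
    """Compute x**n with O(log n) multiplications (n >= 0)."""
    result = 1
    square = x            # holds x, x^2, x^4, x^8, ... in turn
    while n > 0:
        if n & 1:         # this power of two is part of n
            result *= square
        square *= square
        n >>= 1
    return result
```

Since 255 = 128 + 64 + 32 + 16 + 8 + 4 + 2 + 1, `power(x, 255)` multiplies all eight stored squares together.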
Computer Algebra • Applicable when you need an exact computation of something • 1/7 does not exactly equal 0.142857142857 • 1/7 * 7 = 1 • Or when you need to consider polynomials with different variables
Data structures • How to represent x^5 + 2x – 1? • Dense (array): coefficients 1, 0, 0, 0, 2, -1 for x^5, x^4, x^3, x^2, x^1, x^0 • Sparse (linked list): [1,5][2,1][-1,0] • What about arbitrary expressions? • Solution: use an expression tree (e.g. a+b*c)
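The two polynomial representations can be sketched like this. In this example the dense list is indexed by exponent (lowest order first), the reverse of the order on the slide, because that makes indexing simpler; the function names are assumptions.

```python
def sparse_to_dense(terms):
    """Convert [(coefficient, exponent), ...] to a list indexed by exponent."""
    degree = max(exponent for _, exponent in terms)
    dense = [0] * (degree + 1)
    for coefficient, exponent in terms:
        dense[exponent] += coefficient
    return dense


def evaluate(dense, x):
    """Evaluate a dense polynomial at x using Horner's rule."""
    result = 0
    for coefficient in reversed(dense):   # highest order coefficient first
        result = result * x + coefficient
    return result
```

Usage: `sparse_to_dense([(1, 5), (2, 1), (-1, 0)])` builds the dense form of x^5 + 2x - 1, and `evaluate` of that list at x = 2 gives 32 + 4 - 1.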
Simplification: Introduction • But CA systems have to deal with equivalency between different forms: • (x-2) (x+2) • x^2 – 4 • So an idea is to push all equations to a canonical form. • Like Maple “simplify” • Question: how do we implement simplify?
Simplify - Transforming Negatives Why? • Kill off subtraction, negation • Addition is commutative To think about: How is this related to big integer computation?
Simplify – Leveling Operators • Combine like binary trees to n-ary trees
Simplify – Rational Expressions • Expressions with * and / will be rewritten so that there is a single / node at the top, with only * operators below
Exact / Rational Arithmetic • Need to store fractions. What data structure can be used here? • Problems: • Keep 4/7 as 4/7 but 4/8 as 1/2 • How to do this? • Simplification and factoring • Addition and subtraction need computation of a common denominator (and the greatest common divisor to reduce the result) • Note division of a/b ÷ c/d is just a/b × d/c
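A minimal sketch of exact fractions as (numerator, denominator) pairs kept in lowest terms. The pair representation and the function names are assumptions for this example; Python's standard library also ships `fractions.Fraction`, which does the same job.

```python
from math import gcd


def normalize(num, den):
    """Reduce num/den to lowest terms with a positive denominator."""
    if den == 0:
        raise ZeroDivisionError("denominator is zero")
    if den < 0:
        num, den = -num, -den
    g = gcd(num, den)
    return (num // g, den // g)


def add(a, b):
    """a + b over the common denominator a_den * b_den, then reduce."""
    return normalize(a[0] * b[1] + b[0] * a[1], a[1] * b[1])


def divide(a, b):
    """a / b is a times the reciprocal of b."""
    return normalize(a[0] * b[1], a[1] * b[0])
```

`normalize(4, 8)` reduces to (1, 2) while `normalize(4, 7)` stays as it is, which is the behaviour asked for on the slide.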
Invariants • Sometimes a problem is easier than it looks • Look for another way to define the problem in terms of indirect, fixed quantities
Invariants • Chocolate bar problem • You are given a chocolate bar, which consists of r rows and c columns of small chocolate squares. • Your task is to break the bar into small squares. What is the smallest number of splits to completely break it?
Variants of this puzzle? • Breaking irregular shaped objects • Assembling piles of objects • Fly flying between two trains • Adding all integers in a range (1 … 1000) • Also look for simplifications or irrelevant information in a problem
The Game of Squares and Circles • Start with c circles and s squares. • On each move a player selects two shapes. • These two are replaced by one according to the following rule: • Identical shapes are replaced with a square. Different shapes are replaced with a circle. • At the end of the game, you win if the last shape is a circle. Otherwise, the computer wins.
What’s the key? • Parity of the number of circles is invariant • After every move: • if # circles was even, still will be even • If # circles was odd, still will be odd • So the last shape is a circle exactly when c is odd
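To see the invariant at work, here is a small simulation sketch that plays random moves and reports the last shape. The function name and the use of a seeded `random.Random` are assumptions of this example.

```python
import random


def play(circles, squares, rng):
    """Play random moves until one shape is left; return 'circle' or 'square'."""
    while circles + squares > 1:
        shapes = ["C"] * circles + ["S"] * squares
        first, second = rng.sample(shapes, 2)
        if first == second == "C":
            circles -= 2          # two circles become one square
            squares += 1
        elif first == second:
            squares -= 1          # two squares become one square
        else:
            squares -= 1          # circle + square become one circle
    return "circle" if circles == 1 else "square"
```

Every branch changes the number of circles by 0 or 2, so whichever moves are chosen, the result depends only on whether the starting number of circles is odd.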
Sam Loyd’s Fifteen Puzzle • Given a configuration, decide whether it is solvable or not: • Key: Look for an invariant over moves
Given configuration:
 8  7 12 15
 1  5  2 13
 6  3  9 14
 4 10 11
Possible?
Target configuration:
 1  2  3  4
 5  6  7  8
 9 10 11 12
13 14 15
Solution to Fifteen • Look for inversion invariance • Inversion: when two tiles are out of order in a row, e.g. 11 10 is inverted, 10 11 is not inverted • For a given puzzle configuration, let N denote the sum of: • the total number of inversions • the row number of the empty square. • After a legal move, an odd N remains odd whereas an even N remains even.
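A minimal sketch of the parity test for the 4x4 puzzle. It assumes the board is given as four rows of four numbers with 0 marking the empty square, that inversions are counted over the tiles read row by row, and that rows are numbered from 1 at the top; the function names are made up for this example.

```python
def invariant_parity(board):
    """Parity of N = (number of inversions) + (row of the empty square)."""
    tiles = [t for row in board for t in row if t != 0]
    inversions = sum(
        1
        for i in range(len(tiles))
        for j in range(i + 1, len(tiles))
        if tiles[i] > tiles[j]
    )
    empty_row = next(r for r, row in enumerate(board, start=1) if 0 in row)
    return (inversions + empty_row) % 2


def same_parity(start, goal):
    """False means start can never reach goal; True means the invariant allows it."""
    return invariant_parity(start) == invariant_parity(goal)
```

Differing parities prove that one configuration cannot be reached from the other. Equal parity is only what the invariant permits; that it is also sufficient for the fifteen puzzle is a known result, but it is not proved by this check.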
Fifteen invariant • Note: if you are asked for optimal path, then it’s a search problem