
Lecture 3 Nearest Neighbor Algorithms

Shang-Hua Teng


Presentation Transcript


  1. Lecture 3: Nearest Neighbor Algorithms • Shang-Hua Teng

  2. What Is an Algorithm? • A computable set of steps to achieve a desired result from a given input • Example: • Input: an array A of n numbers • Desired result: the sum of the n numbers in A • Pseudo-code of Algorithm SUM

  3. Pseudo-code of Algorithm SUM • Complexity: • Input Size n • Number of steps: n-1 additions
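
The pseudo-code itself did not survive in this transcript. A minimal Python sketch of what Algorithm SUM plausibly does, assuming a non-empty input list, is shown here. The function name `algorithm_sum` and the explicit addition counter are illustrative additions, included only to make the "n-1 additions" count visible.

```python
def algorithm_sum(a):
    """Return (total, additions) for a non-empty list of numbers.

    `additions` counts how many '+' operations were performed,
    which is len(a) - 1 for an input of size n = len(a).
    """
    total = a[0]          # start from the first element: no addition yet
    additions = 0
    for x in a[1:]:       # one addition per remaining element
        total += x
        additions += 1
    return total, additions
```

For an array of 5 numbers the loop body runs 4 times, matching the n-1 count on the slide.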

  4. Example 2: Integer Multiplication c = a × b • When do we need to multiply two very large numbers? • In cryptography and network security • Messages are represented as numbers • Encryption and decryption need to multiply numbers

  5. How to multiply two n-bit numbers • [Figure: rows of asterisks standing for the digits of a multiplication layout]

  6. Asymptotic Notation of Complexity • As the input size grows, how fast does the running time grow? • T1(n) = 100n • T2(n) = n^2 • Which algorithm is better? • When n is small (n < 100), T2 is smaller • As n becomes larger, T2 grows much faster • To solve ambitious, large-scale problems, algorithm 1 is preferred.
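
The crossover between the two running times can be checked directly. The sketch below evaluates T1(n) = 100n and T2(n) = n^2 and reports which is smaller; the function names are illustrative, not from the lecture.

```python
def t1(n):
    """Running time of algorithm 1: linear with a large constant."""
    return 100 * n


def t2(n):
    """Running time of algorithm 2: quadratic with constant 1."""
    return n * n


def cheaper_algorithm(n):
    """Return 'T1', 'T2', or 'tie' depending on which cost is smaller at n."""
    if t1(n) < t2(n):
        return "T1"
    if t2(n) < t1(n):
        return "T2"
    return "tie"
```

The two costs are equal at n = 100; below that T2 wins, above it T1 wins, and the gap widens as n grows.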

  7. Asymptotic Notation (Removing the constant factor) • The Θ Notation: Θ(g(n)) = { f(n) : there exist positive constants c1, c2, and n0 such that c1 g(n) ≤ f(n) ≤ c2 g(n) for all n > n0 } • For example, T(n) = 4n log n + n = Θ(n log n) • For example, n - 1 = Θ(n)

  8. Asymptotic Notation (Removing the constant factor) • The Big-O Notation: O(g(n)) = { f(n) : there exist positive constants c and n0 such that f(n) ≤ c g(n) for all n > n0 } • For example, T(n) = 4n log n + n = O(n log n) • But also T(n) = 4n log n + n = O(n^2)
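
To make the two examples concrete, here is one possible choice of witness constants, assuming logarithms are base 2 (so that log n ≥ 1 whenever n ≥ 2). Other constants work equally well; these are just convenient ones.

```latex
\begin{align*}
% Since n \le n\log_2 n for n \ge 2:
4n\log_2 n \;\le\; 4n\log_2 n + n \;&\le\; 5n\log_2 n
  && \text{for all } n > 1
  && (c_1 = 4,\; c_2 = 5,\; n_0 = 1) \\
% Since \log_2 n \le n and n \le n^2 for n \ge 1:
4n\log_2 n + n \;&\le\; 4n^2 + n^2 \;=\; 5n^2
  && \text{for all } n > 1
  && (c = 5,\; n_0 = 1)
\end{align*}
```

The first line shows both the lower and the upper bound by n log n; the second shows that a looser upper bound such as n^2 is also valid for Big-O, but there is no matching lower bound of the form c1 n^2.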

  9. Nearest Neighbor Problem: General Formulation

  10. Nearest Neighbor Problem

  11. Applications • Points could be web pages; the closest neighbor is the most similar web page • Points could be people; the closest neighbor could be the best friend • Points could be biological species; the closest neighbor could be the most closely related species • …

  12. O(dn^2)-Time Algorithm • Why O(dn^2) time?
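
The problem statement on the preceding slides was not preserved, so the following is a sketch under assumptions: the input is n points in d dimensions, distance is Euclidean, and the goal is to find each point's nearest other point. With those assumptions, the brute-force method compares every ordered pair of points (about n^2 pairs) and spends O(d) time per distance, which gives O(dn^2) overall. The function name is illustrative.

```python
import math


def brute_force_nearest_neighbors(points):
    """For each point, return the index of its nearest other point.

    `points` is a list of equal-length tuples. Entries stay None
    when there is only one point.
    """
    n = len(points)
    nearest = [None] * n
    for i in range(n):                      # n choices of query point
        best = math.inf
        for j in range(n):                  # n - 1 candidates each
            if i == j:
                continue
            # Squared Euclidean distance: O(d) work per pair.
            dist = sum((a - b) ** 2 for a, b in zip(points[i], points[j]))
            if dist < best:
                best = dist
                nearest[i] = j
    return nearest
```

Squared distances are compared instead of true distances because the square root does not change which neighbor is closest.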

  13. Can We do better? • Yes, Handout #4, by Jon Louis Bentley

  14. One-Dimensional Geometry • If we can order the points from small to large, then we just need to look at the left neighbor and the right neighbor of each point to find its nearest neighbor.
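
A short sketch of this idea: sort the points once, then compare each point only with its two neighbors in sorted order. The function name is illustrative, and Python's built-in `sorted` stands in for whatever sorting routine is used; ties between the left and right neighbor are broken in favor of the left one.

```python
def nearest_neighbors_1d(values):
    """For each number, return the index of its nearest other number."""
    # Indices of the input, arranged from smallest value to largest.
    order = sorted(range(len(values)), key=lambda i: values[i])
    nearest = [None] * len(values)
    for pos, i in enumerate(order):
        candidates = []
        if pos > 0:
            candidates.append(order[pos - 1])      # left neighbor
        if pos + 1 < len(order):
            candidates.append(order[pos + 1])      # right neighbor
        if candidates:
            nearest[i] = min(candidates,
                             key=lambda j: abs(values[j] - values[i]))
    return nearest
```

After sorting, the scan does constant work per point, so the total cost is dominated by the sort.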

  15. Reduce to Sorting • Input: Array A[1...n] of elements in arbitrary order; array size n • Output: Array A[1...n] of the same elements, but in non-decreasing order

  16. Divide and Conquer • Divide the problem into a number of sub-problems (similar to the original problem but smaller) • Conquer the sub-problems by solving them recursively (if a sub-problem is small enough, just solve it in a straightforward manner) • Combine the solutions to the sub-problems into the solution for the original problem

  17. Algorithm Design Paradigm I • Solve smaller problems, and use solutions to the smaller problems to solve larger ones • Divide and Conquer • Correctness: mathematical induction

  18. Merge Sort • Divide: split the n-element sequence to be sorted into two subsequences of n/2 elements each • Conquer: sort the two subsequences recursively using merge sort • Combine: merge the two sorted subsequences to produce the sorted answer • Note: during the recursion, if the subsequence has only one element, then do nothing.

  19. Merge-Sort(A,p,r) • A procedure that sorts the elements in the sub-array A[p..r] using divide and conquer • Merge-Sort(A,p,r) • if p >= r, do nothing • if p < r then • q = floor((p + r)/2) • Merge-Sort(A,p,q) • Merge-Sort(A,q+1,r) • Merge(A,p,q,r) • Start by calling Merge-Sort(A,1,n)

  20. A = MergeArray(L,R) • Assume L[1:s] and R[1:t] are two sorted arrays of elements: MergeArray(L,R) forms a single sorted array A[1:s+t] of all elements in L and R. • A = MergeArray(L,R) • L[s+1] ← ∞; R[t+1] ← ∞; i ← 1; j ← 1 • for k ← 1 to s + t • do if L[i] ≤ R[j] • then A[k] ← L[i]; i ← i + 1 • else A[k] ← R[j]; j ← j + 1
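
A runnable sketch of the merging step in Python. The slide's pseudo-code is only partly preserved, so this is one standard way to write it rather than a transcription: it uses 0-based lists and explicit bounds checks, and copies whatever remains once one input is exhausted. The name `merge_array` mirrors the slide's MergeArray.

```python
def merge_array(left, right):
    """Merge two sorted lists into one sorted list of all their elements."""
    merged = []
    i = j = 0
    # Repeatedly copy the smaller front element until one list runs out.
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    # At most one of these is non-empty; it is already sorted.
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged
```

Using `<=` rather than `<` keeps equal elements from `left` ahead of those from `right`, which makes the merge stable.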

  21. Complexity of MergeArray • At each iteration, we perform 1 comparison, 1 assignment (copy one element to A), and 2 increments (to k and to either i or j) • So the number of operations per iteration is 4. • Thus, MergeArray takes at most 4(s+t) time. • This is linear in the size of the input.

  22. Merge(A,p,q,r) • Assume A[p..q] and A[q+1..r] are two sorted sub-arrays. Merge(A,p,q,r) forms a single sorted array A[p..r]. • Merge(A,p,q,r)

  23. Merge-Sort(A,p,r) • A procedure that sorts the elements in the sub-array A[p..r] using divide and conquer • Merge-Sort(A,p,r) • if p >= r, do nothing • if p < r then • q = floor((p + r)/2) • Merge-Sort(A,p,q) • Merge-Sort(A,q+1,r) • Merge(A,p,q,r)
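
Putting Merge and Merge-Sort together, a minimal in-place Python sketch follows. It assumes 0-based inclusive indices (so the top-level call is `merge_sort(a, 0, len(a) - 1)` rather than Merge-Sort(A,1,n)) and takes the split point q as the midpoint floor((p + r)/2); the body of Merge is not preserved in the transcript, so the version here is one common way to write it.

```python
def merge(a, p, q, r):
    """Merge sorted a[p..q] and a[q+1..r] (inclusive) into sorted a[p..r]."""
    left = a[p:q + 1]
    right = a[q + 1:r + 1]
    i = j = 0
    for k in range(p, r + 1):
        # Take from `left` if `right` is used up, or if left's front is smaller.
        if j >= len(right) or (i < len(left) and left[i] <= right[j]):
            a[k] = left[i]
            i += 1
        else:
            a[k] = right[j]
            j += 1


def merge_sort(a, p, r):
    """Sort a[p..r] (inclusive) in place by divide and conquer."""
    if p >= r:                 # zero or one element: nothing to do
        return
    q = (p + r) // 2           # divide at the midpoint
    merge_sort(a, p, q)        # conquer the left half
    merge_sort(a, q + 1, r)    # conquer the right half
    merge(a, p, q, r)          # combine
```

A call such as `merge_sort(data, 0, len(data) - 1)` sorts `data` in place; the temporary copies in `merge` use extra space proportional to the sub-array being merged.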

  24. Running Time of Merge-Sort • Running time is measured as a function of the input size, that is, the number of elements in the array A. • The divide-and-conquer scheme yields a clean recurrence. • Let T(n) be the running time of merge sort for sorting an array of n elements. • For simplicity, assume n is a power of 2, that is, there exists k such that n = 2^k.

  25. Recurrence of T(n) • T(1) = 1 • For n > 1, we have T(n) = 2T(n/2) + 4n • That is, T(n) = 1 if n = 1, and T(n) = 2T(n/2) + 4n if n > 1

  26. Solution of Recurrence of T(n) • T(n) = 4n log n + n = O(n log n) • Picture proof by recursion tree
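
The closed form can be sanity-checked numerically. The sketch below assumes the recurrence T(1) = 1, T(n) = 2T(n/2) + 4n for n a power of 2 (two half-size sub-problems plus a merge costing 4n, which is the form consistent with the stated solution) and compares it with 4n log n + n, taking the logarithm base 2. Function names are illustrative.

```python
def t_recursive(n):
    """Evaluate T(n) from the recurrence; n must be a power of 2."""
    if n == 1:
        return 1
    return 2 * t_recursive(n // 2) + 4 * n


def t_closed_form(n):
    """Evaluate 4 * n * log2(n) + n exactly for n a power of 2."""
    k = n.bit_length() - 1     # log2(n) when n is a power of 2
    return 4 * n * k + n
```

In recursion-tree terms, each of the log n levels above the leaves contributes 4n in total, and the n leaves contribute 1 each, which adds up to 4n log n + n.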
