
Data Structures and Analysis (COMP 410)

Data Structures and Analysis (COMP 410). David Stotts Computer Science Department UNC Chapel Hill. Design Problem. Real Problem. Type ahead Like on google search, phone typing…


Presentation Transcript


  1. Data Structures and Analysis (COMP 410). David Stotts, Computer Science Department, UNC Chapel Hill.

  2. Design Problem

  3. Real Problem: type-ahead. Like on Google search or phone typing: you type a few characters and the program fills in a list of possible choices for you, based on the prefix you have typed. Keep typing more characters and the choices narrow and change. Design a data structure that will let you do this. Describe the time complexity of using it: searching it as typing is done, generating alternatives, etc.

  4. Take some time. Discuss an approach with your neighbor. In 5-10 minutes we will discuss ideas as a class.

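  5. Basic idea: Let’s not use a node to store a whole word. Use a child link to represent a character typed. The path from the root is then the word. [Figure: a partial trie under <root>; links are labeled with single characters, and nodes are labeled with the words spelled so far, including a, an, as, tar, tea, to, new.]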

  6. Basic idea (continued): This tree encodes (stores) these words: tar, tan, tea, to, ton, toe, a, an, ant, as, net, nest, new, no. [Figure: the full trie under <root> for these words, one character per link.]

  7. This has a name: trie. Pronounced “try” or “tree”, both ways. Or “trie tree” (tree-tree, try-tree). The name comes from “reTRIEval”. Used for prefix-based retrieval of strings formed over an alphabet.
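A minimal sketch of prefix-based retrieval, using the word list from slide 6. It is not from the slides: the nested-dict layout, the `"$"` end-of-word marker, and the function names `build_trie` and `completions` are illustrative assumptions.

```python
def build_trie(words):
    """Build a nested-dict trie; the key "$" marks the end of a stored word.

    Assumption: "$" never appears inside a word.
    """
    root = {}
    for word in words:
        node = root
        for ch in word:
            # Follow the link for ch, creating it if it does not exist yet.
            node = node.setdefault(ch, {})
        node["$"] = True
    return root


def completions(root, prefix):
    """Return every stored word that starts with prefix, in sorted order."""
    node = root
    for ch in prefix:
        if ch not in node:
            return []  # no stored word has this prefix
        node = node[ch]

    found = []

    def walk(n, path):
        if "$" in n:
            found.append(prefix + path)
        for ch in sorted(k for k in n if k != "$"):
            walk(n[ch], path + ch)

    walk(node, "")
    return found


WORDS = ["tar", "tan", "tea", "to", "ton", "toe", "a", "an",
         "ant", "as", "net", "nest", "new", "no"]
trie = build_trie(WORDS)
print(completions(trie, "to"))  # ['to', 'toe', 'ton']
print(completions(trie, "ne"))  # ['nest', 'net', 'new']
```

Each extra character typed moves one link further down the trie, so the list of choices narrows as the prefix grows.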

  8. Representation: How many children at each node? As many as there are characters you can type. Let’s say 26 for this example. node { string val = null; node[26] child = new [null, null, …, null]; boolean isWord = false; }

  9. Representation: node { string val = null; node[26] child = new [null, null, …, null]; boolean isWord = false; } [Figure: a single empty node with val unset, isWord: false, and a child array with slots indexed 0 to 25.]

  10. Representation: [Figure: four nodes, each with a val field, an isWord flag, and a child array with slots indexed 0 to 25. One node has an empty val; the others hold “a”, “b”, and “be”. Two of the nodes have isWord: true and two have isWord: false.]

  11. Representation: [Figure: the same four nodes (empty val, “a”, “b”, “be”) linked into a trie under <root>, with child links labeled a, b, and e; the path b, e spells “be”.]
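The node layout from the representation slides can be sketched in Python as follows. This is a sketch under assumptions, not the course's code: it assumes lowercase words over a to z only, stores the prefix spelled so far in `val`, and the `insert` helper is hypothetical (the slides show only the node fields).

```python
class Node:
    """One trie node with the slide's three fields: val, child, isWord."""

    def __init__(self, val=None):
        self.val = val              # prefix spelled by the path to this node
        self.child = [None] * 26    # one slot per letter 'a'..'z'
        self.is_word = False        # True if val is a stored word


def insert(root, word):
    """Insert a lowercase a-z word, creating nodes along its path as needed."""
    node = root
    for i, ch in enumerate(word):
        idx = ord(ch) - ord("a")    # 'a' -> slot 0, 'b' -> slot 1, ...
        if node.child[idx] is None:
            node.child[idx] = Node(word[: i + 1])
        node = node.child[idx]
    node.is_word = True


root = Node()
insert(root, "a")
insert(root, "be")

a = root.child[0]
b = root.child[1]
be = b.child[4]
print(a.val, a.is_word)    # a True
print(b.val, b.is_word)    # b False
print(be.val, be.is_word)  # be True
```

Storing “a” and “be” yields four nodes in total: the root, “a”, “b”, and “be”. The node for “b” exists only as a step on the way to “be”, so its word flag stays false.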

  12. Analysis: Big-O time complexity is always expressed in terms of some problem size. Here the problem size is not the number of words encoded in the tree, as we would say for a BST. Rather, we choose M, the length of a word being inserted or searched for.

  13. Analysis: The worst-case time needed to find a word of length M is O(M). This is true whether the tree contains 10 words or 10 million words. The length of the longest path in the tree is the length of the longest word stored in the tree.
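A small experiment, as a sketch, showing that the number of links followed depends on the word length and not on how many words are stored. The nested-dict trie, the `"$"` end marker, and the names `insert` and `find` are illustrative assumptions, not taken from the slides.

```python
from itertools import product


def insert(root, word):
    """Insert word into a nested-dict trie; "$" marks the end of a word."""
    node = root
    for ch in word:
        node = node.setdefault(ch, {})
    node["$"] = True


def find(root, word):
    """Return (found, links_followed) for a lookup of word."""
    node, steps = root, 0
    for ch in word:
        if ch not in node:
            return False, steps
        node = node[ch]
        steps += 1
    return "$" in node, steps


small = {}
for w in ["net", "nest", "new", "no"]:
    insert(small, w)

# A much larger trie: the same words plus 10,000 generated 4-letter strings.
big = {}
for w in ["net", "nest", "new", "no"]:
    insert(big, w)
for letters in product("abcdefghij", repeat=4):
    insert(big, "".join(letters))

print(find(small, "nest"))  # (True, 4)
print(find(big, "nest"))    # (True, 4)
```

Both lookups follow exactly four links, one per character of “nest”.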

  14. Analysis: If a word of length M can be made from N different characters (like 26 in the alphabet), then the number of possible nodes in the data structure is on the order of N^M. A trie to store words 20 characters long in an alphabet of 52 characters (upper and lower case) has on the order of 52^20 possible nodes.
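A short level-by-level count may help here. It is a sketch that assumes a full trie, one holding every string of length at most M over N characters, with M and N as on the slide. Level i of such a trie has N^i nodes, so:

```latex
\[
  \text{nodes}(N, M) \;=\; \sum_{i=0}^{M} N^{i}
  \;=\; \frac{N^{M+1} - 1}{N - 1}
  \qquad (N > 1),
\]
\[
  \text{e.g. } N = 2,\ M = 3:\quad 1 + 2 + 4 + 8 = 15 = \frac{2^{4} - 1}{2 - 1}.
\]
```

The last level alone contributes N^M nodes, which dominates the sum, so the count grows on the order of N^M.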

  15. Analysis: Note that if we store 26-character words and limit ourselves to lower case, we get 26^26 possible nodes. This is even worse than 26!: 26 * 26 * 26 * … * 26 is worse than 26 * 25 * 24 * … * 2 * 1.

  16. Analysis: How bad is N!? Let’s compare, with N = 20. 2^N is 2^20, which is about a million. N! is 20!, which is 2.432902e+18, or about 2,432,902,000,000,000,000. That is 2,432,902,000,000 * a million, or 2.4 trillion millions.
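The figures on this slide can be checked with exact integer arithmetic. This is a quick sketch; the variable names are illustrative, and N^N is included only for comparison with the factorial.

```python
import math

N = 20
two_pow_n = 2 ** N             # exponential growth
n_factorial = math.factorial(N)
n_pow_n = N ** N               # N choices at each of N positions

print(f"2^N = {two_pow_n:,}")    # 2^N = 1,048,576
print(f"N!  = {n_factorial:,}")  # N!  = 2,432,902,008,176,640,000
print(f"N^N = {n_pow_n:.3e}")    # N^N = 1.049e+26
```

For N = 20, N! is already about 2.3 trillion times larger than 2^N, and N^N is larger again.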

  17. So what? Take a trie made to hold 20-character words made from 20 lower-case characters. The worst-case find operation is O(20), or O(N). The worst-case space grows at least as fast as N!. So it’s very fast to use, but impossible (very impractical) to build in time and space.

  18. END Beyond this is just templates
