Lecture # 31 Category Trees
Binary Trees • 16 leaves • How many steps to reach a leaf? • 4
Binary Trees • N leaves • How many steps to reach a leaf? • log2(N)
4 branch trees • 16 leaves • How many steps to reach a leaf? • 2
4 branch trees • N leaves • How many steps to reach a leaf? • log4(N)
What is the General Algorithm? • N leaves, M branches at each node • How many steps to reach a leaf? • logM(N), where M = “branching factor”
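The logM(N) rule can be checked with a short sketch. The function name `steps_to_leaf` is hypothetical, and the sketch assumes a balanced tree in which every node has exactly M branches, so the count is logM(N) rounded up to a whole number of steps.

```python
def steps_to_leaf(n_leaves: int, branching: int) -> int:
    """Count the steps from the root to a leaf in a balanced tree.

    Assumes every internal node has exactly `branching` children.
    The result is log_M(N) rounded up to a whole number of steps.
    """
    if n_leaves < 1 or branching < 2:
        raise ValueError("need at least 1 leaf and a branching factor of 2 or more")
    steps = 0
    capacity = 1  # number of leaves reachable in `steps` steps
    while capacity < n_leaves:
        capacity *= branching
        steps += 1
    return steps


print(steps_to_leaf(16, 2))  # 4, matching the binary tree slide
print(steps_to_leaf(16, 4))  # 2, matching the 4 branch slide
```

The loop multiplies instead of calling a floating-point logarithm, which avoids rounding surprises when N is an exact power of M.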
4 branch trees • If I double the number of branches from 2 to 4, what happens to the number of steps to reach a leaf? • It is cut in half
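A worked change-of-base step shows when the "cut in half" answer holds. This is a sketch in standard logarithm notation, with N as the number of leaves as on the earlier slides.

```latex
\log_{4}(N) = \frac{\log_{2}(N)}{\log_{2}(4)} = \frac{\log_{2}(N)}{2}
\qquad \text{(2 branches to 4: the steps are halved)}

\log_{8}(N) = \frac{\log_{4}(N)}{\log_{4}(8)} = \frac{\log_{4}(N)}{1.5}
\qquad \text{(4 branches to 8: the steps shrink by one third)}
```

Halving the steps takes a squared branching factor, which is the same as doubling only when going from 2 branches to 4.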
10 branch trees • log10(N) • Count the digits in N
Binary Trees • 16 leaves • How many steps to reach a leaf?
Binary Trees • Unbalanced • Balanced
Binary Trees • Really unbalanced • 16 leaves: how many steps? • Average of about 8, roughly N/2
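The gap between the balanced and the really unbalanced tree can be worked out directly. This is a minimal sketch with hypothetical function names. It assumes the unbalanced tree is a chain in which every internal node has one leaf child and one subtree child.

```python
def average_depth_balanced(n_leaves: int) -> float:
    """Average steps to a leaf in a perfectly balanced binary tree.

    Assumes n_leaves is a power of two, so every leaf is at the same depth.
    """
    if n_leaves < 1 or n_leaves & (n_leaves - 1) != 0:
        raise ValueError("n_leaves must be a power of two")
    return float(n_leaves.bit_length() - 1)  # log2(n_leaves)


def average_depth_chain(n_leaves: int) -> float:
    """Average steps to a leaf in a 'really unbalanced' binary tree.

    Assumes each internal node has one leaf and one subtree, so the leaves
    sit at depths 1, 2, ..., n-1, and the last two leaves share depth n-1.
    """
    if n_leaves < 2:
        raise ValueError("need at least 2 leaves")
    depths = list(range(1, n_leaves)) + [n_leaves - 1]
    return sum(depths) / n_leaves


print(average_depth_balanced(16))  # 4.0
print(average_depth_chain(16))     # 8.4375, close to N/2
```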
N branches • If I have N leaves, why not just have N branches in the tree? • I can reach each leaf in one step • The time to choose a leaf • Binary tree: constant time • N-ary tree (N branches): N checks (one for each branch)
What if I don’t know which branch to choose? • Try all of them • Average of N/2 tries
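The N/2 average can be checked by counting. This sketch uses a hypothetical function name. It assumes the right branch is equally likely to be any of the N and that branches are tried one at a time without repeats.

```python
def average_tries(n_branches: int) -> float:
    """Average number of branches tried before finding the right one.

    Assumes the target is equally likely to be any branch, so it is
    found on try 1, 2, ..., N with equal probability.
    """
    if n_branches < 1:
        raise ValueError("need at least 1 branch")
    tries = range(1, n_branches + 1)
    return sum(tries) / n_branches


print(average_tries(16))  # 8.5, i.e. (N + 1) / 2, roughly N/2
```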
Trees - time to find things • Number of branches (B): logB(N) steps • Too many branches: searching for branches is a problem • Too few branches: too many steps to a leaf • Balance • Probability of correct choice
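The tension between too many and too few branches can be sketched with a simple cost model. The model is an assumption for illustration and does not come from the slides: each step down the tree has a fixed cost (`step_cost`, for example loading a page or walking to a shelf) plus one check per branch (`check_cost`). The function name and both default values are hypothetical.

```python
def time_to_find(n_leaves: int, branching: int,
                 step_cost: float = 20.0, check_cost: float = 1.0) -> float:
    """Estimated time to reach a leaf in a balanced tree.

    Assumed model: time = steps * (step_cost + branching * check_cost),
    where steps is log_B(N) rounded up.
    """
    steps = 0
    capacity = 1  # leaves reachable in `steps` steps
    while capacity < n_leaves:
        capacity *= branching
        steps += 1
    return steps * (step_cost + branching * check_cost)


# Compare a few branching factors for 1000 leaves.
for b in (2, 10, 100, 1000):
    print(b, time_to_find(1000, b))
```

Under these assumed costs the 10 branch tree comes out ahead of both the binary tree (too many steps) and the 1000 branch tree (too many checks). Different cost values move the sweet spot.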
Examples of trees • Dewey • Library of Congress • Biology • Yahoo • Menus • File system
Dewey • 000 Computers, information, & general reference • 100 Philosophy & psychology • 200 Religion • 300 Social sciences • 400 Language • 500 Science • 600 Technology • 700 Arts & recreation • 800 Literature • 900 History & geography
Dewey • 500 Science • 510 Mathematics • 520 Astronomy • 530 Physics • 540 Chemistry • 550 Earth sciences & geology • 560 Fossils & prehistoric life • 570 Biology & life sciences • 580 Plants (Botany) • 590 Animals (Zoology)
Dewey • 500 Science • 550 Earth sciences & geology • 551 Geology, hydrology, meteorology • 552 Petrology • 553 Economic geology • 554 Earth sciences of Europe • 555 Earth sciences of Asia • 556 Earth sciences of Africa • 557 Earth sciences of North America • 558 Earth sciences of South America • 559 Earth sciences of other areas
What are the numbers in the Dewey tree? • A path name • How many possibilities? • 1000 • Where do we get more? • 343.123 c
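Reading a Dewey number as a path name can be sketched as follows. The nested dictionary holds only a small slice of the tree, using labels from the slides above. The names `DEWEY` and `follow_path` are hypothetical, and the lookup simply stops at the first digit this small tree does not contain.

```python
# A small slice of the Dewey tree: digit -> (label, children).
DEWEY = {
    "4": ("Language", {}),
    "5": ("Science", {
        "3": ("Physics", {}),
        "5": ("Earth sciences & geology", {
            "1": ("Geology, hydrology, meteorology", {}),
            "2": ("Petrology", {}),
        }),
    }),
}


def follow_path(number: str) -> list:
    """Read a Dewey number one digit at a time, taking that branch at each level.

    Returns the labels passed on the way down. Stops at the first digit
    that this small example tree does not contain.
    """
    labels = []
    children = DEWEY
    for digit in number:
        if digit not in children:
            break
        label, children = children[digit]
        labels.append(label)
    return labels


print(follow_path("551"))
print(follow_path("530"))
```

Each digit picks one of 10 branches, so three digits give the 1000 possibilities mentioned above.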
Using the Dewey tree • What is the Dewey Decimal number for Ostriches? • If we don’t know how to choose then how do we find things? • Search to get Dewey decimal number
What good is the system? • If we can’t correctly choose the path to a book, then isn’t the search just linear (N/2)? • Unique location for each book • Why not just assign them a number as each book comes into the library? • Browsing - keep related books together
Library of Congress • How many books? ~ 20 Million • How many maps, documents, videos, photos? ~ 100 Million
Library of Congress • A -- GENERAL WORKS • B -- PHILOSOPHY. PSYCHOLOGY. RELIGION • C -- AUXILIARY SCIENCES OF HISTORY • D -- HISTORY: GENERAL AND OLD WORLD • E -- HISTORY: AMERICA • F -- HISTORY: AMERICA • G -- GEOGRAPHY. ANTHROPOLOGY. RECREATION • H -- SOCIAL SCIENCES • J -- POLITICAL SCIENCE • K -- LAW • L -- EDUCATION • M -- MUSIC AND BOOKS ON MUSIC • N -- FINE ARTS • P -- LANGUAGE AND LITERATURE • Q -- SCIENCE • R -- MEDICINE • S -- AGRICULTURE • T -- TECHNOLOGY • U -- MILITARY SCIENCE • V -- NAVAL SCIENCE • Z -- BIBLIOGRAPHY. LIBRARY SCIENCE. INFORMATION RESOURCES (GENERAL)
Library of Congress • Why do these subjects get so much space in the tree? • What is the purpose of the Library of Congress?
Organizing trees of information • Branches • Balance • Can the user make a correct choice • Information about each choice
Which of these things is not like the others? • If you are in first grade? • If you are a biologist?
Which of these things is not like the others? • Studying the arctic • Studying child/adult animal behavior
Tree problems • Literature / Novels / Whales . . . • Industry / Sea / Whaling . . . • Biology / Mammals / Whales . . .
Yahoo.com - a huge tree of WWW information • What are the major items for? • Why are the minor items here?
What is this? • A path name in a tree • Why doesn’t “Animals” follow “Science”?
What is this section for? • To identify the user’s purpose
Aliases - connect tree branches • Literature / Novels / Whales . . . • Industry / Sea / Whaling . . . • Biology / Mammals / Whales / Stories, Industry (aliases) . . .
Aliases - Windows Shortcuts • Named objects that contain the path name of other objects • Lets us organize a tree in many ways • Figure: the shortcut symbol
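An alias as "a named object that contains the path name of another object" can be sketched with the whale example. The class `Alias`, the dictionary `TREE`, and the function `resolve` are hypothetical names. Placing the "Stories" and "Industry" aliases under Biology / Mammals / Whales is one reading of the diagram, not something the slides spell out.

```python
class Alias:
    """A named object that holds the path name of another object."""

    def __init__(self, target_path: str):
        self.target_path = target_path


TREE = {
    "Literature": {"Novels": {"Whales": {}}},
    "Industry": {"Sea": {"Whaling": {}}},
    "Biology": {"Mammals": {"Whales": {
        "Stories": Alias("Literature/Novels/Whales"),
        "Industry": Alias("Industry/Sea/Whaling"),
    }}},
}


def resolve(path: str, hops_left: int = 10) -> str:
    """Walk a '/'-separated path from the root and follow any alias met.

    Returns the path of the node finally reached. hops_left guards
    against aliases that point at each other in a loop.
    """
    node = TREE
    parts = path.split("/")
    for i, name in enumerate(parts):
        node = node[name]  # raises KeyError if the path does not exist
        if isinstance(node, Alias):
            if hops_left == 0:
                raise RuntimeError("too many aliases in a row")
            rest = parts[i + 1:]
            return resolve("/".join([node.target_path] + rest), hops_left - 1)
    return path


print(resolve("Biology/Mammals/Whales/Stories"))   # Literature/Novels/Whales
print(resolve("Biology/Mammals/Whales/Industry"))  # Industry/Sea/Whaling
```

The same books stay in one place in the tree, while a reader browsing from Biology can still reach the novels and the whaling industry.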
Challenge • If you run a commercial art house and want to organize all of your pictures into a tree • If you are creating a new program and want to organize your menu items into a tree • Who has the answer to the organization? • The users
How to get user input • Get samples • 100 pictures • Names of menu items on 3x5 cards • Give them to users • Tell them to organize them into 7-12 stacks • Tell them to write a name on each stack • Videotape the process and ask them to talk about what they are doing
Trees - time to find • Number of branches (B): logB(N) steps • Too many branches: searching for branches is a problem • Too few branches: too many steps to a leaf • Balance • Probability of correct choice • “Similar” things together • Aliases to apply multiple organizations