
Boolean, bibliometrics, and beyond



Presentation Transcript


  1. Boolean, bibliometrics, and beyond Part 1 LIS 670 donna Bair-Mundy

  2. Our roadmap • Boolean • Boolean exercises • Fuzzy sets • Bibliometrics

  3. Boolean

  4. Boolean algebra • Developed by George Boole, an English mathematician, circa 1850 • Set theory • Boolean logic is binary • Widely used in electronic design • Widely used in information retrieval systems

  5. Two ways of defining a set • Enumeration (listing the elements): A = {1, 2, 3, 4, 5} • Specification of a distinguishing property that all elements of the set have in common: B = {x | x is a prime number}

  6. Set operators (1) Given sets A = {1, 2, 3, 4, 5}, B = {1, 3, 5, 7}, C = {6, 7, 8} Union - produces a set containing all members of both operand sets: A ∪ C = {1, 2, 3, 4, 5, 6, 7, 8} [Diagram: Set, Set, Resultant Set]

  7. Set operators (2) Given sets A = {1, 2, 3, 4, 5}, B = {1, 3, 5, 7}, C = {6, 7, 8} Intersection - produces a set containing members in the first set that also occur in the second set: A ∩ B = {1, 3, 5}

  8. Set operators (3) Given set A = {1, 2, 3, 4, 5} Complement - produces a set containing all members of the universal set that are not members of the operand set. If D is the universal set of all positive integers, then the complement of A = {6, 7, 8, …}
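
The three set operators on slides 6 to 8 map directly onto Python's built-in `set` type. The following is a minimal sketch, not part of the original slides. Because Python cannot store the infinite set of all positive integers, the universal set `D` is truncated to 1 through 10, which is an assumption made only for illustration.

```python
# Sets taken from the slides.
A = {1, 2, 3, 4, 5}
B = {1, 3, 5, 7}
C = {6, 7, 8}

union_ac = A | C          # union: members of either operand set
intersection_ab = A & B   # intersection: members common to both sets

# Assumption: a finite stand-in (1..10) for "all positive integers".
D = set(range(1, 11))
complement_a = D - A      # complement of A relative to D

print(sorted(union_ac))
print(sorted(intersection_ab))
print(sorted(complement_a))
```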

  9. Boolean operators Words and symbols to denote set operators.
     Boolean   Set theory     Symbol   Algebraic symbol
     OR        Union          ∪        +
     AND       Intersection   ∩        *
     NOT       Complement     -        -

  10. Algebraic operations on sets Given sets A = {1, 2, 3, 4, 5}, B = {1, 3, 5, 7}, C = {6, 7, 8}
     A * (B + C) = A * B + A * C
     A AND (B OR C) = {1, 2, 3, 4, 5} AND {1, 3, 5, 6, 7, 8} = {1, 3, 5}
     (A AND B) OR (A AND C) = {1, 3, 5} OR {} (the empty set) = {1, 3, 5}
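
The distributive identity on slide 10 can be checked mechanically. This is an illustrative sketch using Python sets, where `&` stands in for AND (`*`) and `|` for OR (`+`). The variable names `left` and `right` are introduced here and do not come from the slides.

```python
A = {1, 2, 3, 4, 5}
B = {1, 3, 5, 7}
C = {6, 7, 8}

left = A & (B | C)          # A AND (B OR C)
right = (A & B) | (A & C)   # (A AND B) OR (A AND C)

# Both sides should contain the same members.
print(sorted(left), sorted(right), left == right)
```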

  11. Venn diagrams • John Venn • Charles Dodgson [Diagram: Set 1, Set 2, Set 3]

  12. Venn diagram - OR [Diagram: Poodles, Retrievers] Poodles OR Retrievers yields all documents about poodles, about retrievers, or about both

  13. Venn diagram - AND [Diagram: Poodles, Retrievers] Poodles AND Retrievers yields all documents that deal with both poodles and retrievers

  14. Venn diagram - NOT [Diagram: Poodles, Retrievers] Poodles NOT Retrievers yields documents about poodles but not about retrievers

  15. Venn diagram - Exclusive OR [Diagram: Poodles, Retrievers] Poodles XOR Retrievers yields all documents that deal with either poodles or retrievers but not both
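
The four Venn diagram cases can be sketched with sets of document numbers. The document IDs below are invented for illustration and are not from the slides. Documents 3 and 4 are assumed to discuss both breeds.

```python
# Hypothetical document IDs, invented for illustration.
poodles = {1, 2, 3, 4}
retrievers = {3, 4, 5, 6}

either = poodles | retrievers        # Poodles OR Retrievers
both = poodles & retrievers          # Poodles AND Retrievers
poodles_only = poodles - retrievers  # Poodles NOT Retrievers
exactly_one = poodles ^ retrievers   # Poodles XOR Retrievers

for label, result in [("OR", either), ("AND", both),
                      ("NOT", poodles_only), ("XOR", exactly_one)]:
    print(label, sorted(result))
```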

  16. Rules of precedence (highest first) • Complementation (NOT) • Intersection (AND) • Union (OR) dogs OR cats AND fleas will be read as dogs OR (cats AND fleas)

  17. Specifying order of performance (dogs OR cats) AND fleas
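
Python's set operators happen to follow the same ordering as slides 16 and 17: `&` (intersection) binds more tightly than `|` (union), so parentheses are needed to force the OR first. The sketch below uses illustrative document IDs that are assumptions, not search results from a real system.

```python
# Illustrative posting sets (document IDs are assumptions).
dogs = {2, 5, 6, 15}
cats = {1, 3, 7, 9, 13}
fleas = {6, 7, 9, 17}

default_order = dogs | cats & fleas      # read as dogs OR (cats AND fleas)
explicit_and_first = dogs | (cats & fleas)
or_first = (dogs | cats) & fleas         # (dogs OR cats) AND fleas

print(sorted(default_order))
print(sorted(or_first))
```

The two readings retrieve different sets, which is why the parentheses on slide 17 matter.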

  18. Boolean arithmetic Find A AND B (A * B)
     Set A   Set B   Retrieve?
     Yes     Yes     Yes
     Yes     No      No
     No      Yes     No
     No      No      No

     Set A   Set B   Retrieve?
     1       1       1
     1       0       0
     0       1       0
     0       0       0
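
The second table treats AND as multiplication of 0 and 1. A short sketch, with names chosen here for illustration, regenerates those rows and confirms that the product agrees with logical AND in every case.

```python
from itertools import product

# Each row is (in set A?, in set B?, retrieve?) using 1 for yes and 0 for no.
rows = [(a, b, a * b) for a, b in product((1, 0), repeat=2)]

for a, b, retrieve in rows:
    print(a, b, retrieve)

# Multiplication of 0/1 values gives the same answer as logical AND.
agrees_with_and = all(r == int(bool(a) and bool(b)) for a, b, r in rows)
```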

  19. Boolean searching: advantages • Ideally suited for inverted file indexes: each index entry and its set of pointers constitutes a set
     Cats       1, 3, 7, 9, 13
     Dogs       2, 5, 6, 15
     Fleas      6, 7, 9, 17
     Gnus       19, 27
     Guppies    4, 14, 18
     Hamsters   22, 25, 31
     • Allows user to broaden (using OR) or narrow (using AND, NOT) searches
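
The inverted index on slide 19 can be modeled as a dictionary that maps each term to its set of document pointers. This is a minimal sketch. The `postings` helper is a hypothetical name introduced here, and the three sample searches are examples of broadening and narrowing rather than queries from the slides.

```python
# Index entries and pointers as listed on the slide.
index = {
    "cats": {1, 3, 7, 9, 13},
    "dogs": {2, 5, 6, 15},
    "fleas": {6, 7, 9, 17},
    "gnus": {19, 27},
    "guppies": {4, 14, 18},
    "hamsters": {22, 25, 31},
}

def postings(term):
    """Return the set of document numbers for a term (empty if unindexed)."""
    return index.get(term.lower(), set())

broadened = postings("cats") | postings("dogs")   # OR widens the result
narrowed = postings("cats") & postings("fleas")   # AND narrows it
excluded = postings("fleas") - postings("dogs")   # NOT narrows it

print(sorted(broadened))
print(sorted(narrowed))
print(sorted(excluded))
```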

  20. Boolean Exercises

  21. The Scenario – part I You are the librarian at the Happy Broccoli School of Culinary Arts. Chef Kweezee is planning the menus for this week’s demonstrations. He comes to the reference desk and asks you to search the recipe database for him.

  22. The Scenario – part II The search command for this database is FIND followed by key words. The system accommodates Boolean operators and allows parentheses.

  23. The Scenario – part III To impress the chef, who stays to watch you search, you formulate a single search statement for each menu.

  24. Sample record • Cuisine: Mexican • Title: Enchilada • Ingredients: Corn tortillas, tomato sauce, chili peppers, beans, onions, garlic, cilantro… Sample Boolean exercise Menu: Mexican Cuisine • Enchilada • Refried beans Search statement: FIND mexican AND (enchilada OR (refried AND beans))

  25. FIND mexican AND (enchilada OR (refried AND beans)) [Diagram labels: mexican, enchilada, refried, beans]
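
One way to see how the sample search statement behaves is to test it against individual records. The sketch below is an assumption about how matching might work (whole-word matching over all fields), not a description of the actual recipe database. The record fields follow the sample record on slide 24, and the other three records are hypothetical.

```python
import re

def keywords(record):
    """Collect the lowercased words from every field of a record."""
    text = " ".join(record.values())
    return set(re.findall(r"[a-z]+", text.lower()))

def matches(record):
    """FIND mexican AND (enchilada OR (refried AND beans))"""
    words = keywords(record)
    return "mexican" in words and (
        "enchilada" in words or ("refried" in words and "beans" in words)
    )

sample = {
    "cuisine": "Mexican",
    "title": "Enchilada",
    "ingredients": "Corn tortillas, tomato sauce, chili peppers, beans, "
                   "onions, garlic, cilantro",
}
# Hypothetical records, invented for illustration.
refried = {"cuisine": "Mexican", "title": "Refried beans",
           "ingredients": "pinto beans, onions, garlic"}
tostada = {"cuisine": "Mexican", "title": "Tostada",
           "ingredients": "corn tortillas, beans, lettuce"}
soup = {"cuisine": "Italian", "title": "Minestrone",
        "ingredients": "beans, pasta, tomatoes"}

for record in (sample, refried, tostada, soup):
    print(record["title"], matches(record))
```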

  26. Exercise 1 Menu: Mexican Cuisine • Mexican casserole • Tostada FIND

  27. Exercise 2 Menu: Italian Cuisine • Pasta with grilled artichoke hearts • Baked garlic FIND

  28. Exercise 3 Menu: Greek Cuisine • Vegetarian moussaka • Greek salad featuring kalamata olives FIND

  29. Exercise 4 Menu: Chinese Cuisine • Hot and sour soup • Fried eggplant • Tofu and broccoli dish FIND

  30. Exercise 5 Menu: Indian Cuisine • Eggplant curry • Samosa • Raita • Tamarind sauce FIND

  31. Boolean searching: disadvantages (1) • Counterintuitive: AND retrieves fewer items • Two-valued logic: items meet criteria or they do not; good for computers, but does not reflect user relevancy determinations

  32. Boolean searching: disadvantages (2) Research topic: Digital music libraries Documents: • Current research on digital music libraries • Introduction to digital libraries • Information architecture in the digital environment • Libraries of ancient Babylonia

  33. Fuzzy sets

  34. Binary versus fuzzy sets Test each record against query: Ri = any record, Q = user query • Binary set: S(Ri × Q) ∈ {0, 1} (yes or no: retrieved or not). Retrieval set for query Q is all records Ri such that S(Ri × Q) = 1 • Fuzzy set: S(Ri × Q) ∈ [0, 1] (brackets indicate range). S expresses not whether R is in the set but the degree of strength of the association of R with the set.

  35. Fuzzy set [Scale: 1 = highly relevant, 0 = non-relevant]

  36. FIND Agni Vedic fire ritual [Scale: 1 = highly relevant, 0 = non-relevant] • Analysis of the Agni Vedic fire ritual (1) • Characteristics of Agni (0.25) • Structural analysis of a Vedic fire ritual (0.75) • Analysis of a fire ritual of India (0.5)
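
The slides do not say how the scores on slide 36 were computed. One simple assumption that reproduces them is to take S as the share of query words found in the title. The sketch below uses that assumption for the fuzzy case and, as a second assumption, treats the binary case as "every query word present". The function names are introduced here for illustration.

```python
QUERY = ["agni", "vedic", "fire", "ritual"]

def fuzzy_s(title, query=QUERY):
    """Degree of association in [0, 1]: share of query words in the title."""
    words = set(title.lower().split())
    return sum(term in words for term in query) / len(query)

def binary_s(title, query=QUERY):
    """Membership in {0, 1}: 1 only when every query word is present."""
    return 1 if fuzzy_s(title, query) == 1.0 else 0

titles = [
    "Analysis of the Agni Vedic fire ritual",
    "Characteristics of Agni",
    "Structural analysis of a Vedic fire ritual",
    "Analysis of a fire ritual of India",
]

for title in titles:
    print(binary_s(title), fuzzy_s(title), title)
```

Under these assumptions the binary test retrieves only the first title, while the fuzzy score still ranks the other three by how closely they relate to the query.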

  37. Implementing fuzzy sets (1) • User enters list of words: FIRE, RITUAL, SACRIFICE • Retrieval system examines each record or document in the database • Computes score by number of query words that appear in the document • System presents ordered list of documents, along with their scores

  38. Implementing fuzzy sets (2) • FIRE, RITUAL, SACRIFICE
     Rank   Title
     100%   The fire sacrifice ritual of early Vedic period India
     66%    Fire and sacrifice in proto-Indo-European society
     33%    How to build a fire the Girl Scout way
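
The ranking on slides 37 and 38 can be sketched as follows. The tokenizer (lowercase, letters only) and the use of integer division to truncate 66.67 to 66 are assumptions chosen so the output lines up with the percentages shown on the slide.

```python
import re

QUERY = ["fire", "ritual", "sacrifice"]

titles = [
    "How to build a fire the Girl Scout way",
    "The fire sacrifice ritual of early Vedic period India",
    "Fire and sacrifice in proto-Indo-European society",
]

def presence_score(text, query=QUERY):
    """Count how many distinct query words appear at least once."""
    words = set(re.findall(r"[a-z]+", text.lower()))
    return sum(term in words for term in query)

def percent(text, query=QUERY):
    """Score as a whole-number percentage of the query words matched."""
    return 100 * presence_score(text, query) // len(query)

# Present the documents in descending order of score.
ranked = sorted(titles, key=presence_score, reverse=True)
for title in ranked:
    print(f"{percent(title)}% {title}")
```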

  39. Implementing fuzzy sets (3) • User enters list of words: FIRE, RITUAL, SACRIFICE • Retrieval system examines each document or record in the database, computing score for that item by adding 1 for each time any of the words on the user's list appears in the document or record • System presents ordered list of documents, along with their scores

  40. Implementing fuzzy sets (4) • FIRE, RITUAL, SACRIFICE Documents in rank order: • The fire sacrifice ritual of early Vedic period India. The fire sacrifice ritual is one of many sacrifice rituals observed as being performed… • Fire and sacrifice in proto-Indo-European society. Discussion of the role of fire… • How to build a fire the Girl Scout way. Demonstrates fire building…
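
The occurrence-counting method on slides 39 and 40 adds 1 for every appearance of a query word. The sketch below applies it to the excerpts visible on the slide only, since the full documents are truncated there. It matches exact words with no stemming, which is an assumption, so "rituals" does not count toward RITUAL.

```python
import re
from collections import Counter

QUERY = ["fire", "ritual", "sacrifice"]

def frequency_score(text, query=QUERY):
    """Add 1 for each occurrence of any query word (exact word match)."""
    counts = Counter(re.findall(r"[a-z]+", text.lower()))
    return sum(counts[term] for term in query)

# Excerpts as they appear on the slide (truncated in the source).
excerpts = [
    "The fire sacrifice ritual of early Vedic period India. The fire "
    "sacrifice ritual is one of many sacrifice rituals observed as being "
    "performed",
    "Fire and sacrifice in proto-Indo-European society. Discussion of the "
    "role of fire",
    "How to build a fire the Girl Scout way. Demonstrates fire building",
]

scores = [frequency_score(text) for text in excerpts]
for score, text in sorted(zip(scores, excerpts), reverse=True):
    print(score, text[:50])
```

Counting occurrences separates the second and third documents less sharply than counting distinct words does, because a document that repeats a single query word still accumulates score.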

  41. Fuzzy sets in Voyager (1) • Agni, vedic

  42. Fuzzy sets in Voyager (2) • Agni, vedic

  43. Fuzzy sets in Voyager (3) • Agni, vedic

  44. Fuzzy sets in Voyager (4) • Agni, vedic

  45. Fuzzy sets in Voyager (5) • Agni, vedic

  46. Field-weighting terms Terms weighted by fields in which they occur
     Field         Weight   Content
     Title         5        Winning-induced euphoria in tiddlywinks players
     Descriptors   5        Euphoria; Tiddlywinks
     Abstract      2        The authors studied the brain waves of 175 tiddlywinks winners and found euphoria induced by winning lasted an average of 3 hours.
     Text          1        Researchers have long held that tiddlywinks, unlike other sports, do not induce a significant affective…
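
Field weighting can be sketched by multiplying each occurrence of a term by the weight of the field it appears in. The weights below (title 5, descriptors 5, abstract 2, text 1) are read from the slide. Summing weight times occurrence count is an assumption about how the weights would be combined, and `weighted_score` is a hypothetical helper name.

```python
import re

# Weights as read from the slide.
FIELD_WEIGHTS = {"title": 5, "descriptors": 5, "abstract": 2, "text": 1}

record = {
    "title": "Winning-induced euphoria in tiddlywinks players",
    "descriptors": "Euphoria; Tiddlywinks",
    "abstract": "The authors studied the brain waves of 175 tiddlywinks "
                "winners and found euphoria induced by winning lasted an "
                "average of 3 hours.",
    # The slide truncates the full text at this point.
    "text": "Researchers have long held that tiddlywinks, unlike other "
            "sports, do not induce a significant affective",
}

def weighted_score(record, term):
    """Sum field weight times the number of times the term occurs there."""
    term = term.lower()
    total = 0
    for field, weight in FIELD_WEIGHTS.items():
        words = re.findall(r"[a-z]+", record.get(field, "").lower())
        total += weight * words.count(term)
    return total

for term in ("euphoria", "tiddlywinks", "hours"):
    print(term, weighted_score(record, term))
```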

  47. User-weighting terms Terms weighted at time of search by user [Legend: * = weighted term]
