Internet Resources Discovery (IRD)
Classic Information Retrieval (IR)
T. Sharon - A. Frank

Classical IR
• Deals with textual information retrieval.
• Has existed for a few decades, mostly for text repositories.
• Was pushed strongly by the development of the WWW, for search engines.

IR Topics and their Relationships
[Figure: map of IR topics. Group labels: Textual IR, Human-Computer Interaction for IR, Applications for IR. Topic labels: Retrieval Models & Evaluation, Improvements on Retrieval, Efficient Processing, Interfaces & Visualization, Multimedia Modeling & Searching, Bibliographic Systems, The Web, Digital Libraries.]
IR Vocabulary: http://www.cs.jhu.edu/~weiss/glossary.html

Basic Architecture of an IR System
[Figure: Documents are mapped to a Document Representation and Queries to a Query Representation; the two representations are then brought together in a Comparison step.]

Interaction of the User with the IR System
[Figure: the user reaches the database in two ways, Retrieval and Browsing.]

What is a Query?
• Input:
  • query terms/words that should appear in the text
  • possibly conditions between them
• Output:
  • relevant documents
  • possibly ranked

Information Retrieval Systems
• A generic information retrieval system selects and returns to the user the desired documents from a large set of documents, in accordance with criteria specified by the user.
• Retrieval Functions
  • document search (ad-hoc): the selection of documents from an existing collection of documents.
  • document routing (filtering): the dissemination of incoming documents to appropriate users on the basis of user interest profiles.
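
The two retrieval functions differ in what stays fixed: in ad-hoc search the collection is fixed and the query is new, while in routing the user profiles are fixed and the document is new. The following is a minimal Python sketch of that contrast. The function names, the whitespace tokenizer, and the sample data are all assumptions made for illustration, not part of the original slides.

```python
def tokenize(text):
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return set(text.lower().split())


def ad_hoc_search(collection, query):
    """Ad-hoc search: a new query is run against an existing collection."""
    terms = tokenize(query)
    # A document matches when it contains every query term.
    return [doc_id for doc_id, text in collection.items()
            if terms <= tokenize(text)]


def route(document, profiles):
    """Routing: an incoming document is matched against standing profiles."""
    words = tokenize(document)
    # A user receives the document when it shares a term with the profile.
    return [user for user, interests in profiles.items()
            if interests & words]


# Hypothetical sample data.
collection = {
    "d1": "boolean retrieval model",
    "d2": "vector space retrieval model",
    "d3": "web search engines",
}
profiles = {"alice": {"web", "hypertext"}, "bob": {"vector", "boolean"}}

search_hits = ad_hoc_search(collection, "retrieval model")
routed_to = route("new survey of web search", profiles)
```

Here `search_hits` holds the documents containing both query words, and `routed_to` holds the users whose profile overlaps the incoming document.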

The Process of Retrieving Information
[Figure: block diagram of the retrieval process. Component labels: User Interface, Text Operations, Query Operations, Indexing, Searching, Ranking, DB Manager Module, Index (inverted file), Text Databases. Flow labels: user need, text, logical view, user feedback, retrieved docs.]
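
The indexing and searching stages named in the diagram can be sketched with a small inverted file. This is an illustrative Python sketch only: the stopword list, the function names, and the all-terms-must-match search rule are assumptions, and the diagram itself does not prescribe them.

```python
from collections import defaultdict


def text_operations(text):
    """Produce a logical view: lowercase tokens minus a tiny stopword list."""
    stopwords = {"the", "a", "of", "and"}  # assumed, for illustration
    return [t for t in text.lower().split() if t not in stopwords]


def build_inverted_file(docs):
    """Indexing: map each term to the set of documents that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text_operations(text):
            index[term].add(doc_id)
    return index


def search(index, query):
    """Searching: return ids of documents containing every query term."""
    terms = text_operations(query)
    if not terms:
        return set()
    postings = [index.get(t, set()) for t in terms]
    return set.intersection(*postings)


# Hypothetical text database.
docs = {
    1: "the taxonomy of retrieval models",
    2: "indexing and searching text",
    3: "ranking retrieved text",
}
index = build_inverted_file(docs)
```

With this index, `search(index, "text")` looks up a single posting set, and a multi-word query intersects one posting set per term. Ranking the retrieved documents is a separate step that this sketch leaves out.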

IR Ranking
• Ranking algorithms
  • The central problem of IR systems is predicting which documents are relevant and which are not.
  • Ranking algorithms are at the core of IR systems.
  • A ranking algorithm operates on basic premises about document relevance, according to a distinct IR model.
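
As one concrete illustration of a ranking algorithm, the sketch below scores documents with the vector model, using tf-idf weights and cosine similarity. The weighting formula (raw term frequency times log of N over document frequency), the whitespace tokenizer, and all names are assumptions chosen for brevity; other IR models rest on different premises and would rank differently.

```python
import math
from collections import Counter


def tf_idf_vectors(docs):
    """Build one sparse tf-idf vector per document, plus the idf table."""
    tokenized = {d: text.lower().split() for d, text in docs.items()}
    n = len(docs)
    df = Counter(t for tokens in tokenized.values() for t in set(tokens))
    idf = {t: math.log(n / df[t]) for t in df}
    vectors = {d: {t: tf * idf[t] for t, tf in Counter(tokens).items()}
               for d, tokens in tokenized.items()}
    return vectors, idf


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank(docs, query):
    """Return ids of documents with a nonzero score, best match first."""
    vectors, idf = tf_idf_vectors(docs)
    q = {t: tf * idf.get(t, 0.0)
         for t, tf in Counter(query.lower().split()).items()}
    scored = [(cosine(q, vec), d) for d, vec in vectors.items()]
    ordered = sorted(scored, key=lambda pair: (-pair[0], pair[1]))
    return [d for score, d in ordered if score > 0]


# Hypothetical collection.
docs = {
    "d1": "web search engines rank web pages",
    "d2": "boolean model of retrieval",
    "d3": "ranking pages by relevance",
}
ranking = rank(docs, "web pages")
```

Documents that share no weighted term with the query score zero and are dropped, so the result is a ranked list rather than an unordered set.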

A Taxonomy of IR Models
[Figure: taxonomy tree organized by user task.]
• User task: Retrieval (Search, Routing)
  • Classic Models: Boolean, Vector, Probabilistic
    • Set Theoretic: Fuzzy, Extended Boolean
    • Algebraic: Generalized Vector, Latent Semantic Index, Neural Networks
    • Probabilistic: Inference Network, Belief Network
  • Structured Models: Non-Overlapping Lists, Proximal Nodes
• User task: Browsing
  • Browsing: Flat, Structure Guided, Hypertext

Retrieval Models Associations
[Figure: retrieval models arranged along two axes, User Task and Logical View of Documents.]

Query Language (1)
• Keyword-based Querying
  • Single-word Queries
  • Context Queries
    • Phrase
    • Proximity
  • Boolean Queries
  • Natural Language
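
Context queries need to know where each word occurs, not just whether it occurs. The sketch below uses a positional index to answer phrase and proximity queries, and plain set operations for a Boolean query. It is a minimal Python illustration: the function names, the word-position definition of proximity, and the sample documents are assumptions.

```python
from collections import defaultdict


def build_positional_index(docs):
    """Map term -> doc id -> list of word positions."""
    index = defaultdict(lambda: defaultdict(list))
    for doc_id, text in docs.items():
        for pos, term in enumerate(text.lower().split()):
            index[term][doc_id].append(pos)
    return index


def docs_with(index, term):
    """Single-word query: ids of documents containing the term."""
    return set(index.get(term, {}))


def phrase(index, words):
    """Phrase query: the words must occur consecutively, in order."""
    if not words:
        return set()
    postings = [index.get(w, {}) for w in words]
    candidates = set(postings[0])
    for p in postings[1:]:
        candidates &= set(p)
    return {d for d in candidates
            if any(all(start + i in postings[i][d]
                       for i in range(1, len(words)))
                   for start in postings[0][d])}


def proximity(index, first, second, max_gap):
    """Proximity query: `second` occurs at most max_gap words after `first`."""
    a, b = index.get(first, {}), index.get(second, {})
    return {d for d in a.keys() & b.keys()
            if any(0 < q - p <= max_gap for p in a[d] for q in b[d])}


# Hypothetical collection.
docs = {
    1: "classic information retrieval models",
    2: "information about boolean retrieval",
    3: "retrieval of textual information",
}
index = build_positional_index(docs)

phrase_hits = phrase(index, ["information", "retrieval"])
near_hits = proximity(index, "information", "retrieval", 3)
# Boolean query: retrieval AND NOT boolean.
boolean_hits = docs_with(index, "retrieval") - docs_with(index, "boolean")
```

A phrase query is the strictest form, proximity relaxes the distance, and the Boolean query ignores positions entirely.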

Query Language (2)
• Pattern Matching
  • Words
  • Prefixes
  • Suffixes
  • Substring
  • Ranges
  • Allowing errors
  • Regular expressions
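
Each pattern type in the list can be read as a different test applied to the words of the vocabulary. The Python sketch below shows one possible test per type. The vocabulary, the sample patterns, and the choice of Levenshtein edit distance for "allowing errors" are assumptions for illustration.

```python
import re


def edit_distance(a, b):
    """Levenshtein distance: minimum insertions, deletions, substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # delete ca
                           cur[j - 1] + 1,              # insert cb
                           prev[j - 1] + (ca != cb)))   # substitute or match
        prev = cur
    return prev[-1]


# Hypothetical vocabulary, kept in sorted order.
vocabulary = ["index", "indexing", "inverted", "ranking",
              "retrieval", "retrieve"]

prefix_hits = [w for w in vocabulary if w.startswith("ind")]
suffix_hits = [w for w in vocabulary if w.endswith("ing")]
substring_hits = [w for w in vocabulary if "trie" in w]
# Range: words lexicographically between two bounds.
range_hits = [w for w in vocabulary if "inv" <= w <= "rb"]
# Allowing errors: words within 2 edits of a misspelled query word.
fuzzy_hits = [w for w in vocabulary if edit_distance(w, "retreival") <= 2]
# Regular expression: the whole word must match the pattern.
regex_hits = [w for w in vocabulary if re.fullmatch(r"retriev(al|e)", w)]
```

Scanning the whole vocabulary keeps the sketch short; a real system would typically use an index structure suited to each pattern type.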

Query Language (3)
• Structural Queries
  • Form-like fixed structures
  • Hypertext structure
  • Hierarchical structure

Structural Queries
[Figure: (a) form-like fixed structure, (b) hypertext structure, and (c) hierarchical structure.]

Hierarchical Structure
[Figure: an example of a hierarchical structure: the page of a book, its schematic view, and a parsed query to retrieve the figure.]
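
A hierarchical query combines a condition on content with a condition on position in the document tree. The sketch below is a hypothetical Python illustration, not a reconstruction of the slide's figure: the `Node` class, the node kinds, and the sample page are all assumed. It retrieves figures whose caption mentions one word and that sit inside a section whose title mentions another.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    kind: str                      # e.g. "chapter", "section", "title", "figure"
    text: str = ""
    children: list = field(default_factory=list)


def descendants(node):
    """Yield every node below `node`, depth first."""
    for child in node.children:
        yield child
        yield from descendants(child)


def figures_in_sections(root, title_word, caption_word):
    """Figures mentioning caption_word, inside a section titled with title_word."""
    hits = []
    for node in [root, *descendants(root)]:
        if node.kind != "section":
            continue
        titles = [c for c in node.children if c.kind == "title"]
        if any(title_word in t.text.lower() for t in titles):
            hits.extend(n for n in descendants(node)
                        if n.kind == "figure"
                        and caption_word in n.text.lower())
    return hits


# Hypothetical page with two sections (no nested sections).
page = Node("chapter", children=[
    Node("section", children=[
        Node("title", "Introduction"),
        Node("figure", "Index overview"),
    ]),
    Node("section", children=[
        Node("title", "Structured Models"),
        Node("figure", "World map of hypertext links"),
        Node("figure", "Index layout"),
    ]),
])

result = figures_in_sections(page, "models", "index")
captions = [f.text for f in result]
```

The figure in the Introduction also mentions "index", but the structural condition excludes it because its enclosing section has a different title.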