Disambiguating Queries for Geographic Information Retrieval

This thesis proposal explores the challenges and potential solutions for enhancing retrieval effectiveness in Geographic Information Retrieval (GIR) through the use of geospatial query expansion. The study will focus on the goals of an IR system, determining relevancy, evaluating IR systems, and leveraging geospatial information to increase retrieval effectiveness. The proposed research will involve building a Gazetteer, modifying the Query Analyzer, conducting experiments, and analyzing results to validate the hypothesis.

Presentation Transcript


  1. Disambiguating Queries for Geographic Information Retrieval • Carolyn Hafernik • Thesis Proposal, May 10, 2006 • Computer Science • Advisor: Lisa Ballesteros

  2. Information Retrieval (IR) • What are the goals of an IR system? • What is a relevant document? • How does one determine which documents are relevant? • How are IR systems evaluated?

  3. Geographic Information Retrieval (GIR) • GIR is an extension of IR • It aims to use geospatial information to help improve retrieval effectiveness • What makes GIR challenging? • Poor query specification • Ambiguity of language • No central repository for geospatial information

  4. Geospatial Information • How can geospatial information be used to increase retrieval effectiveness given a query? • Example query: “Hiking near the Bay Area” • Locations • Population statistics • Name variations • Nearby landmarks • Map from www.lib.utexas.edu/maps/usmet.html

  5. Sample GeoCLEF 2005 Topics
     <top>
     <num> GC001 </num>
     <orignum> C084 </orignum>
     <EN-title> Shark Attacks off Australia and California </EN-title>
     <EN-desc> Documents will report any information relating to shark attacks on humans. </EN-desc>
     <EN-narr> Identify instances where a human was attacked by a shark, including where the attack took place and the circumstances surrounding the attack. Only documents concerning specific attacks are relevant; unconfirmed shark attacks or suspected bites are not relevant. </EN-narr>
     <!-- NOTE: This topic has added tags for GeoCLEF -->
     <EN-concept> Shark Attacks </EN-concept>
     <EN-spatialrelation> near </EN-spatialrelation>
     <EN-location> Australia </EN-location>
     <EN-location> California </EN-location>
     </top>
     <top>
     <num> GC004 </num>
     <orignum> C126 </orignum>
     <EN-title> Actions against the fur industry in Europe and the U.S.A. </EN-title>
     <EN-desc> Find information on protests or violent acts against the fur industry. </EN-desc>
     <EN-narr> Relevant documents describe measures taken by animal right activists against fur farming and/or fur commerce, e.g. shops selling items in fur. Articles reporting actions taken against people wearing furs are also of importance. </EN-narr>
     <!-- NOTE: This topic has added tags for GeoCLEF -->
     <EN-concept> Animal Rights Actions against the fur industry </EN-concept>
     <EN-spatialrelation> in </EN-spatialrelation>
     <EN-location> Europe </EN-location>
     <EN-location> United States </EN-location>
     </top>
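     The added GeoCLEF tags (concept, spatial relation, locations) are what a query analyzer would need to pull out of each topic. The following is a minimal sketch of that extraction, not the system proposed in the talk. The function name `parse_topics`, the wrapper element `<topics>`, and the abbreviated sample string are assumptions made for illustration; only the tag names come from the topics shown above.

```python
import xml.etree.ElementTree as ET

# Abbreviated copy of topic GC001 from the slide (desc/narr omitted for brevity).
TOPIC_XML = """
<top>
<num> GC001 </num>
<EN-title> Shark Attacks off Australia and California </EN-title>
<!-- NOTE: This topic has added tags for GeoCLEF -->
<EN-concept> Shark Attacks </EN-concept>
<EN-spatialrelation> near </EN-spatialrelation>
<EN-location> Australia </EN-location>
<EN-location> California </EN-location>
</top>
"""


def parse_topics(xml_text):
    """Return one dict per <top> element with its geospatial fields.

    The topic text has no single root element, so it is wrapped in a
    hypothetical <topics> element before parsing.
    """
    root = ET.fromstring("<topics>" + xml_text + "</topics>")
    topics = []
    for top in root.findall("top"):
        topics.append({
            "num": top.findtext("num", default="").strip(),
            "title": top.findtext("EN-title", default="").strip(),
            "concept": top.findtext("EN-concept", default="").strip(),
            "relation": top.findtext("EN-spatialrelation", default="").strip(),
            # A topic may list several locations, so keep them all.
            "locations": [(loc.text or "").strip()
                          for loc in top.findall("EN-location")],
        })
    return topics


topics = parse_topics(TOPIC_XML)
print(topics[0]["concept"], topics[0]["relation"], topics[0]["locations"])
```

     Keeping the locations as a list rather than a single string leaves each place name available for separate lookup later on.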

  6. Previous Work • GeoCLEF 2005 • Common approaches • Places to store information • Named Entity Recognition • Query Expansion • Traditional IR approaches

  7. Hypothesis • My hypothesis is that using geospatial information to expand each query and to re-weight its geospatial components will improve retrieval effectiveness. • Improvement will occur because the expanded query will provide the system with more specific information than the original query contains.
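     The hypothesis can be illustrated with a small sketch of gazetteer-based expansion and re-weighting. This is an assumption-laden toy, not the proposed Query Analyzer: the gazetteer contents, the function name `expand_query`, and the three weight values are all hypothetical and chosen only to show the idea that geospatial terms receive different weights from concept terms.

```python
# Hypothetical toy gazetteer: place name -> related place names.
# A real gazetteer would also hold coordinates, populations, and name variants.
TOY_GAZETTEER = {
    "Bay Area": ["San Francisco", "Oakland", "San Jose"],
}


def expand_query(concept_terms, locations, gazetteer,
                 concept_weight=1.0, location_weight=2.0,
                 expansion_weight=0.5):
    """Build a weighted query from concept terms and place names.

    Original place names are up-weighted; names added from the gazetteer
    get a lower weight so they refine the query without dominating it.
    The weight values are illustrative assumptions, not tuned settings.
    """
    weights = {}
    for term in concept_terms:
        weights[term] = concept_weight
    for place in locations:
        weights[place] = location_weight
        for related in gazetteer.get(place, []):
            # Do not overwrite a term that already has a weight.
            weights.setdefault(related, expansion_weight)
    return weights


query = expand_query(["hiking"], ["Bay Area"], TOY_GAZETTEER)
for term, weight in query.items():
    print(f"{weight:.1f}  {term}")
```

     For the example query “Hiking near the Bay Area”, the expanded query contains the original terms plus the related place names from the toy gazetteer, which is the "more specific information" the hypothesis refers to.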

  8. Timeline • Fall Semester: build the Gazetteer; modify the Query Analyzer; design experiments; do more background reading; start writing thesis • January Term: run experiments; continue writing thesis • Spring Semester: analyze results; run more experiments (if necessary); finish thesis

  9. References
     • [1] Davide Buscaldi, Paolo Rosso, Emilio Sanchis Arnal. A WordNet-based Query Expansion method for Geographical Information Retrieval. 2005.
     • [2] Nuno Cardoso, Bruno Martins, Marcirio Silveira Chaves, Leonardo Andrade, Mario J. Silva. The XLDB Group at GeoCLEF 2005. 2005.
     • [3] O. Ferrandez, Z. Kozareva, A. Toral, E. Noguera, A. Montoyo, R. Munoz, Fernando Llopis. University of Alicante at GeoCLEF 2005. 2005.
     • [4] Daniel Ferres, Alicia Ageno, Horacio Rodriguez. The GeoTALP-IR System at GeoCLEF-2005: Experiments Using a QA-based IR System, Linguistic Analysis, and a Geographical Thesaurus. 2005.
     • [5] Fredric Gey, Ray Larson, Mark Sanderson, Hideo Joho, Paul Clough. GeoCLEF: the CLEF 2005 Cross-Language Geographic Information Retrieval Track Overview. 2005.
     • [6] Fredric Gey, Vivien Petras. Berkeley2 at GeoCLEF: Cross-Language Geographic Information Retrieval of German and English Documents. 2005.
     • [7] Rocio Guillen. CSUSM Experiments in GeoCLEF2005: Monolingual and Bilingual Tasks. 2005.
     • [8] Baden Hughes. NICTA i2d2 at GeoCLEF 2005. 2005.
     • [9] Andras Kornai. MetaCarta at GeoCLEF 2005. 2005.
     • [10] Sara Lana-Serrano, Jose M. Goni-Menoyo, Jose C. Gonzalez-Cristobal. Miracle’s 2005 Approach to Geographical Information Retrieval. 2005.
     • [11] Ray R. Larson. Cheshire II at GeoCLEF: Fusion and Query Expansion for GIR. 2005.
     • [12] Jochen L. Leidner. Preliminary Experiments with Geo-Filtering Predicates for Geographic IR. 2005.
     • [13] Johannes Leveling, Sven Hartrumpf, Dirk Veiel. University of Hagen at GeoCLEF 2005: Using Semantic Networks for Interpreting Geographical Queries. 2005.

  10. Thank you! Questions? Comments?
