Information Retrieval and its Application in Biomedicine

Sept 4: Introduction. Hong Yu (1, 2), PhD, and Susan McRoy (1), PhD. (1) Department of Computer Science; (2) Department of Health Sciences, University of Wisconsin-Milwaukee.

Presentation Transcript


  1. Information Retrieval and its Application in Biomedicine • Sept 4: Introduction • Hong Yu (1, 2), PhD • Susan McRoy (1), PhD • (1) Department of Computer Science • (2) Department of Health Sciences • University of Wisconsin-Milwaukee

  2. What is Information Retrieval? • The field concerned with the acquisition, organization, and searching of knowledge-based information. (Hersh, 2003)

  3. Speed Up Communication

  4. Information • World Wide Web • Company documentation • Drug descriptions • Medical records • Books • Everything that is text, image, video, or sound and that can be digitized

  5. Information in Biomedicine • Literature (over 17 million publications) • WWW • Electronic medical records • Genomics data • DNA sequences, etc. • Knowledge representation • Gene Ontology • Company databases • Micromedex drug database

  6. IR in Biomedicine • Index Medicus (Billings 1879) • MEDLARS (NLM 1966) • SAPHIRE (Hersh 1990) • PubMed (NLM 1996) • Arrowsmith (Smalheiser 1998) • BioText (Hearst 2003) • BioMedQA (Yu 2006)

  7. Electronic and Open Publishing • The Internet and the Web have had a profound impact on the publishing of knowledge-based information • Most of the literature can be made available electronically • Open access • The Bethesda Statement on Open Access Publishing (http://www.earlham.edu/~peters/fos/bethesda.htm) (April 11, 2003) • The Berlin Declaration on Open Access to Knowledge in the Sciences and Humanities (http://www.zim.mpg.de/openaccess-berlin/berlindeclaration.html) (2003) • PubMed Central (NLM 2004)

  8. Quality of Information • A lack of quality control • Anyone can publish online • A wealth of studies have concluded that healthcare information on the Web is of poor quality • Readability • Hard to read

  9. Information Needs and Seeking • Unrecognized needs • Clinicians unaware of information needs or knowledge deficit • Recognized needs • Clinicians aware of needs but may or may not pursue them • Pursued needs • Information seeking occurs but may or may not be successful • Satisfied needs • Information seeking successful

  10. Evidence-Based Medicine

  11. What You Will Learn • IR algorithms • Indexing • Query and Retrieval • Evaluation • Text Classification • XML retrieval • Web retrieval
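The indexing, query and retrieval, and evaluation topics listed on this slide can be illustrated with a toy example. The following is only a minimal sketch, not the course's own implementation: it assumes an in-memory inverted index, Boolean AND retrieval, and set-based precision and recall. The three documents, the relevance judgments, and all function names are hypothetical and chosen for illustration.

```python
from collections import defaultdict


def tokenize(text):
    """Lowercase, split on whitespace, and strip surrounding punctuation."""
    tokens = (t.strip(".,;:()").lower() for t in text.split())
    return [t for t in tokens if t]


def build_index(docs):
    """Indexing: map each term to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in tokenize(text):
            index[term].add(doc_id)
    return index


def search(index, query):
    """Retrieval: Boolean AND over the query terms."""
    terms = tokenize(query)
    if not terms:
        return set()
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result


def precision_recall(retrieved, relevant):
    """Evaluation: fraction of retrieved docs that are relevant, and
    fraction of relevant docs that were retrieved."""
    true_positives = len(retrieved & relevant)
    precision = true_positives / len(retrieved) if retrieved else 0.0
    recall = true_positives / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical toy collection (not taken from the slides).
docs = {
    1: "Aspirin reduces the risk of heart attack.",
    2: "Gene expression in heart tissue.",
    3: "Aspirin and gene regulation.",
}
index = build_index(docs)
hits = search(index, "aspirin heart")

# Hypothetical relevance judgments for the query above.
relevant = {1, 3}
precision, recall = precision_recall(hits, relevant)
print(hits, precision, recall)
```

Here the strict AND query retrieves only document 1, so every retrieved document is relevant but one of the two relevant documents is missed. Real systems usually rank documents by a score (for example tf-idf weighting) rather than requiring an exact Boolean match.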

  12. What You Will Learn (Cont.) • Open-Source IR tools • What open-source IR tools are available • Indexing/retrieval • Part-of-speech and syntactic parsing • Semantic parsing • Discourse relations • Machine-learning classifiers • How to use the tools?

  13. What You Will Learn (Cont.) • State-of-the-art IR systems • Baruch 1965 [BLIMP http://blimp.cs.queensu.ca/index.html] • SAPHIRE (Hersh 1990) • Retrieval • MedLEE (Friedman 1994) • Extraction • PubMed (NLM 1997) • Arrowsmith system (Smalheiser 1998) • Hidden relation discovery tool • GENIES (Friedman 2001) • Extraction

  14. BioNLP Systems • BioText (Hearst 2003, http://biotext.berkeley.edu/) • Retrieval + Categorization • GeneWays (Rzhetsky 2004, http://geneways.genomecenter.columbia.edu/) • Extraction + Visualization • TextPresso (Muller 2004, http://www.textpresso.org/) • Retrieval + Extraction • iHOP (Hoffmann and Valencia 2005, http://www.ihop-net.org/UniPub/iHOP/) • Retrieval • BioMedQA (Yu 2006, http://monkey.ims.uwm.edu/MedQA) • Question Answering

  15. Advanced NLP applications

  16. Beyond text: Image and Video • Image classification • Finding concepts in captions and annotations • Machine learning on textual & visual features • Determining salient features in text and image separately and merging the results • Extracting text from image • Understanding and correcting OCR (handwriting, equations) • Finding text in images • Finding document text related to illustrations • Video retrieval

  17. Beyond Extraction: Experimental Tools

  18. Resources • Annotated collections (GENIA, Medstract, Yapex …) • Ontologies, tools, knowledge bases … • Publications, Conferences, Evaluations … • Centres and web portals

  19. What We Provide • Textbook • Christopher D. Manning, Prabhakar Raghavan and Hinrich Schütze. Introduction to Information Retrieval. Cambridge University Press, 2007 • http://www-csli.stanford.edu/~schuetze/information-retrieval-book.html • Office hours: • Tuesdays, 3-4 pm, EMS 710, and by appointment • Hong Yu, 414-229-3344 • Susan McRoy, 414-229-6695

  20. What We Expect • Undergraduate: • 30% Homework, 35% Midterm exam, 35% Final exam or project • Graduate: • 20% Midterm exam, 40% Homework, 40% Project • The project may be done individually or in a team of 2-3 people. The final project will include a software system, a 2-3 page written project report, and an oral presentation. The report should describe the problem, the approach, and the evaluation, and should cite related work where appropriate.
