Architecture of a Medical Information Extraction System

Dalila Bekhouche (dalila.bekhouche@loria.fr), Yann Pollet (pollet@cnam.fr), Bruno Grilheres (bruno.grilheres@sysde.eads.net), Xavier Denis (xavier.denis@tiscali.fr)


Presentation Transcript


  1. Architecture of a Medical Information Extraction System
     Dalila Bekhouche (dalila.bekhouche@loria.fr), Yann Pollet (pollet@cnam.fr), Bruno Grilheres (bruno.grilheres@sysde.eads.net), Xavier Denis (xavier.denis@tiscali.fr)

  2. Index
     • Introduction
     • Information extraction
     • The architecture of the IE System
     • Extraction of lexical and medical terms
     • Evaluation of ICD-10 and CCMA results
     • Limits of this approach and future work

  3. 1- Introduction
     [Diagram label: Database]
     Problem: it is difficult to access and exploit this amount of information
     • Variety of content
     • Specific terminology
     • Practitioners use uncertain and sense-modifying expressions
     These texts are difficult for most NLP tools to understand.

  4. 2- Information extraction
     [Diagram labels: Lexical resources, Documents (free text), Extraction, Relevant information, Domain knowledge]
     • Aim
       • Identify and extract relevant information from medical documents (examination reports such as colonoscopy reports)
     • How to identify the relevant information?
       • Relevant information: events and entities described in the texts that concern the patient (signs, diagnosis, acts, results)

  5. 3- The architecture of the IE System
     [Diagram labels: Documents, Extraction, Generation, Database; Thesauri ICD-10/Vidal/CCMA, dictionary, validation resources and rules]
     Processing levels:
     • 1- Lexical level: named entities (names, medical terms)
     • 2- Sub-sentence level: signs, symptoms
     Extracted information:
     • Date of examination
     • Document type
     • Signs
     • Diagnosis
     • Acts
     • Results
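The six output fields listed on this slide suggest a simple record type for what the system stores per document. The following is a minimal sketch only: the class name `ExtractedRecord`, the field types, and the example values are assumptions made for illustration, not the authors' schema.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ExtractedRecord:
    """Hypothetical container for the fields named on the slide."""
    date_of_examination: Optional[str] = None
    document_type: Optional[str] = None
    signs: List[str] = field(default_factory=list)
    diagnosis: List[str] = field(default_factory=list)
    acts: List[str] = field(default_factory=list)
    results: List[str] = field(default_factory=list)


# Illustrative values only; they do not come from the presentation.
record = ExtractedRecord(document_type="colonoscopy report")
record.signs.append("abdominal pain")
```

Using `default_factory=list` gives each record its own lists, so annotations added to one document never leak into another.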

  6. 4- Extraction of the lexical terms
     Named entities (locations, companies, organizations, dates)
     • Input text: Mr Smith was addressed for a checkup by McGann
     • Level 1, REGEX(words) or dictionary: Mr <name> was addressed for a checkup by McGann
     • Level 2, REGEX(words) and level 1: Mr <name> was addressed for a checkup by McGann
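The two tagging levels on this slide can be sketched as follows. Level 1 tags a word as `<name>` using a word-level regex or a dictionary lookup; Level 2 applies a regex that mixes plain words with Level 1 tags. The title list, the dictionary contents, and the `<person>` tag produced at Level 2 are assumptions made for illustration. The slide itself shows the sentence unchanged after Level 2, so the Level 2 pattern here only demonstrates how such a rule could be written.

```python
import re

# Hypothetical dictionary resource for Level 1.
NAME_DICTIONARY = {"Smith"}

# Level 1 regex: a title followed by a capitalised word.
TITLE_THEN_WORD = re.compile(r"\b(Mr|Mrs|Ms|Dr) ([A-Z][A-Za-z]+)\b")

# Level 2 regex: a title (plain word) followed by a Level 1 tag.
TITLE_THEN_TAG = re.compile(r"\b(?:Mr|Mrs|Ms|Dr) <name>")


def level1(text: str) -> str:
    """Replace names with <name> using a regex on words or a dictionary."""
    text = TITLE_THEN_WORD.sub(lambda m: f"{m.group(1)} <name>", text)
    words = ["<name>" if w in NAME_DICTIONARY else w for w in text.split(" ")]
    return " ".join(words)


def level2(tagged: str) -> str:
    """Combine words and Level 1 tags into a larger (hypothetical) entity."""
    return TITLE_THEN_TAG.sub("<person>", tagged)


sentence = "Mr Smith was addressed for a checkup by McGann"
tagged = level1(sentence)
# tagged == "Mr <name> was addressed for a checkup by McGann"
```

Keeping the levels as separate functions means Level 2 rules never need to know whether a `<name>` tag came from the regex or from the dictionary.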

  7. 5- Extraction of the ICD-10 and CCMA
     Identify the various occurrences of these thesauri in the text
     • 1- Preprocessing step:
       • Reduce the text and thesauri
       • Standardise words and remove irrelevant words
     • 2- Recognition of the discriminating terms
     • 3- Evaluate the similarity (cosine measure) between the neighbouring terms in the text and each candidate ICD-10 entry related to the indexing term
     ICD-10: International Classification of Diseases
     CCMA: Common Classification of Medical Acts
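The preprocessing and similarity steps above can be sketched with a bag-of-words cosine measure. This is a minimal illustration, not the authors' implementation: the stopword list, the thesaurus entries, and the placeholder codes `CODE_A` and `CODE_B` are hypothetical, and the sketch skips the discriminating-term step by comparing the context against every entry.

```python
import math
import re
from collections import Counter

# Hypothetical list of irrelevant words removed during preprocessing.
STOPWORDS = {"a", "an", "the", "of", "in", "and", "with", "was"}


def preprocess(text: str) -> list:
    """Lowercase, tokenise, and drop irrelevant words."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [t for t in tokens if t not in STOPWORDS]


def cosine(a: list, b: list) -> float:
    """Cosine similarity between two token lists seen as count vectors."""
    va, vb = Counter(a), Counter(b)
    dot = sum(va[t] * vb[t] for t in va.keys() & vb.keys())
    norm = math.sqrt(sum(c * c for c in va.values())) * math.sqrt(
        sum(c * c for c in vb.values())
    )
    return dot / norm if norm else 0.0


def best_entry(context: str, thesaurus: dict) -> str:
    """Return the code whose label is most similar to the context."""
    ctx = preprocess(context)
    return max(thesaurus, key=lambda code: cosine(ctx, preprocess(thesaurus[code])))


# Placeholder codes and labels, for illustration only.
thesaurus = {
    "CODE_A": "polyp of colon",
    "CODE_B": "malignant neoplasm of colon",
}
match = best_entry("a polyp was found in the sigmoid colon", thesaurus)
```

Applying the same `preprocess` function to both the text and the thesaurus labels is what makes the two vectors comparable.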

  8. 6- Evaluation of ICD-10 and CCMA results
     Precision = (valid annotations found by the system) / (all annotations found by the system)
     Recall = (valid annotations found by the system) / (valid annotations found by the practitioner)
     • 50% correct annotations. After adding knowledge, precision increases to 87.7%
     • Recall stays approximately the same; this reflects problems due to ambiguous words.
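The two measures on this slide can be computed from two sets of annotations, one produced by the system and one produced by the practitioner. The sketch below assumes each annotation is a (document, code) pair; that representation, the function name, and the sample data are illustrative and unrelated to the figures reported on the slide.

```python
def precision_recall(system: set, reference: set) -> tuple:
    """Compute precision and recall of system annotations.

    `system` holds the annotations found by the system and `reference`
    those found by the practitioner. Annotations in both are valid.
    """
    valid = system & reference
    precision = len(valid) / len(system) if system else 0.0
    recall = len(valid) / len(reference) if reference else 0.0
    return precision, recall


# Illustrative annotations only.
system = {("doc1", "CODE_A"), ("doc1", "CODE_B"), ("doc2", "CODE_C"), ("doc2", "CODE_D")}
reference = {("doc1", "CODE_A"), ("doc2", "CODE_C"), ("doc2", "CODE_E")}
precision, recall = precision_recall(system, reference)
```

Here two of the four system annotations are valid, and two of the three practitioner annotations are recovered.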

  9. 7- Limits of this approach and future work
     • Limited to French medical texts and to specific domains (colonoscopy and oncology records).
     • Handles simple sentences, as found in medical records, but may have difficulty analysing complex sentences that need deep syntactic analysis.
     • Future work will focus on the generation and acquisition steps.
     • Taking synonyms and user feedback into account.

  10. Thank you!
      dalila.bekhouche@loria.fr
      PSI (Perception, system, information), Insa Rouen, Place E. Blondel, 76130 Mont St Aignan, France
