Tiran Software
-TURKUAZ Project-
RadeX
Tahir Bilal Onur Deniz Soner Kara M. Mert Karadağlı
Assistant: Umut Eroğul
Instructor: Meltem T. Yöndem
Outline
• Problem Definition
• Important Aspects
• Our Approach
• General Structure
• Analyzer Component
• Searcher Component
• Current Status
• Prototype
• Tools and Resources
• Q/A
Problem Definition
• Billions of radiology reports
• Unfortunately, they are stored in free-text format
• Hard to search and retrieve
• Need for searchable information
Important Aspects
• Text Mining
• NLP
• Information Extraction
• Morphological Analysis
• Named Entity Recognition
• Machine Learning
  • Neural Networks, Decision Trees ...
Our Approach
RadeX (Radiology Data Extractor) will provide:
• A modular machine learning component
• Support for connecting to internal and external dictionaries
• A template-based approach for finalizing the extracted data
General Structure (cont.)
• Analyzer Component
  • Preprocesses free text
  • Looks up internal and external lexicons
  • Assigns semantics to words
  • Extracts searchable data
• Searcher Component
  • Sends query strings to the database
  • Retrieves the corresponding information
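The split between the two components can be sketched roughly as follows. This is only an illustration under stated assumptions, not the RadeX implementation: the `LEXICON` contents, the `facts` table layout, and the function names `analyze`, `build_index`, and `search` are all hypothetical, and an in-memory SQLite database stands in for PostgreSQL so the example runs with the standard library alone.

```python
import sqlite3

# Hypothetical internal lexicon: term -> semantic category.
LEXICON = {"nodule": "finding", "lung": "anatomy", "liver": "anatomy"}


def analyze(report_id, text):
    """Analyzer sketch: preprocess free text and keep the terms the lexicon knows."""
    rows = []
    cleaned = text.lower().replace(".", " ").replace(",", " ")
    for token in cleaned.split():
        category = LEXICON.get(token)
        if category is not None:
            rows.append((report_id, token, category))
    return rows


def build_index(reports):
    """Store the analyzer output in a database (SQLite here, as a stand-in)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE facts (report_id INTEGER, term TEXT, category TEXT)"
    )
    for report_id, text in reports.items():
        conn.executemany(
            "INSERT INTO facts VALUES (?, ?, ?)", analyze(report_id, text)
        )
    conn.commit()
    return conn


def search(conn, term):
    """Searcher sketch: send a parameterized query and return matching report ids."""
    cursor = conn.execute(
        "SELECT DISTINCT report_id FROM facts WHERE term = ? ORDER BY report_id",
        (term.lower(),),
    )
    return [row[0] for row in cursor.fetchall()]


reports = {
    1: "Nodule in the left lung.",
    2: "Liver is normal.",
    3: "Lung fields are clear.",
}
conn = build_index(reports)
print(search(conn, "lung"))
```

Keeping the analyzer output in a plain relational table means the searcher only needs parameterized SQL, which is one plausible reason to separate the two components.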
Current Status
• Preprocessing
• Connecting to and using external sources
• Database implementation
• Applying SVM to an unrelated but tagged corpus
Current Status (cont.)
• Mapping Turkish terms to their English translations
• Finding the stems of unknown words
• Constructing lexicons
• Features of verbs, adjectives, nouns...
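The term mapping and stem lookup listed above might look something like the sketch below. It is a deliberately naive stand-in: the real project relies on Zemberek for morphological analysis, whereas this example just strips a few common Turkish suffixes. The `TR_EN` entries, the `SUFFIXES` list, and the function names are assumptions made for illustration.

```python
# Hypothetical Turkish-to-English lexicon with a few anatomy terms.
TR_EN = {"akciğer": "lung", "karaciğer": "liver", "böbrek": "kidney"}

# A handful of plural and locative suffixes; far from complete.
SUFFIXES = ("lerde", "larda", "ler", "lar", "de", "da")


def naive_stem(word, lexicon):
    """Return the lexicon entry the word reduces to, or None if nothing matches."""
    word = word.lower()
    if word in lexicon:
        return word
    # Try longer suffixes first so "lerde" wins over "de".
    for suffix in sorted(SUFFIXES, key=len, reverse=True):
        if word.endswith(suffix):
            candidate = word[: -len(suffix)]
            if candidate in lexicon:
                return candidate
    return None


def translate(word):
    """Map a Turkish surface form to its English term via the stem."""
    stem = naive_stem(word, TR_EN)
    return TR_EN[stem] if stem is not None else None


for surface in ("akciğerde", "böbrekler", "karaciğer", "dalak"):
    print(surface, "->", translate(surface))
```

A word whose stem is not in the lexicon yields `None`, which is where a full morphological analyzer or an external dictionary lookup would take over.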
In the prototype we will be able to:
• decompose reports into sub-parts, sentences, and words
• analyze words using Zemberek and a stemmer
• assign semantics to words via internal/external lexicons
• extract simple information using predefined templates
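As a rough illustration of the first and last prototype goals, the sketch below splits a report into sentences and words and then applies one predefined template. The template itself (`SIZE_TEMPLATE`, a word followed by a size in mm or cm) and the function names are hypothetical; the slides do not specify what the actual templates look like, and English text is used here only to keep the example readable.

```python
import re

# Hypothetical template: a finding word followed by a size, e.g. "nodule 12 mm".
SIZE_TEMPLATE = re.compile(
    r"(?P<finding>\w+)\s+(?P<size>\d+(?:\.\d+)?)\s*(?P<unit>mm|cm)\b"
)


def split_sentences(report):
    """Split on sentence-ending punctuation that is followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", report.strip())
    return [part for part in parts if part]


def split_words(sentence):
    """Return the word tokens of a sentence, dropping punctuation."""
    return re.findall(r"\w+", sentence)


def extract_sizes(report):
    """Apply the size template to every sentence and collect structured records."""
    records = []
    for sentence in split_sentences(report):
        for match in SIZE_TEMPLATE.finditer(sentence):
            records.append(
                {
                    "finding": match.group("finding").lower(),
                    "size": float(match.group("size")),
                    "unit": match.group("unit"),
                }
            )
    return records


report = "Right lung is clear. Left lung shows a nodule 12 mm in diameter."
print(split_sentences(report))
print(split_words(split_sentences(report)[0]))
print(extract_sizes(report))
```

Each matched template produces a small record that could be stored directly as searchable data.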
Tools & Resources
• SVM-Light
• WordNet
• JWNL
• TDK / Zargan
• Zemberek
• PostgreSQL