Annotation of 311 Admission Summaries of the ICU Corpus



Presentation Transcript


  1. Annotation of 311 Admission Summaries of the ICU Corpus • Yefeng Wang

  2. Aim • Create evaluation data for SNOMED CT concept matching performance. • Create training data for machine learning systems. • Rule-based systems have low recall • Parameters are difficult to tune and rules are difficult to build • Machine learning systems are the state of the art • No such annotated data is available yet.

  3. Existing Corpora • Most of the existing corpora are in the biomedical domain • GENIA (2000 abstracts from MEDLINE) • PennBioIE (2300 MEDLINE abstracts) • Only a few are from the clinical domain • Ogren et al. (clinical conditions only) • Chapman et al. (clinical conditions only) • CLEF (semantic annotation, formal reports)

  4. Selection of Data • Clinical notes were taken from the admission summaries of 311 patients. • One note per patient • Admission notes were used for annotation • Semi-structured, with a variety of information • Chief Complaint • Background • History of Presenting Illness • Medication • Examination • Observation in Nursing Notes • Social • Other summaries (Echo reports, Surgical reports, etc.)

  5. The Annotation Task • Concept Annotation • Annotate the semantic category of medical concepts • Categories were based on SNOMED CT • Relation Annotation • Annotate relationships between concepts • Inter-term relation • Relationship between two separate concepts • Intra-term relation • Relationship between atomic concepts within a composite concept (post-coordination)

  6. An Example Note

  7. Annotation Schema

  8. Development of Guidelines • Iterative Approach • 10 reports were annotated jointly by two annotators. • Discussion • Development of initial guidelines • 25 reports were used for iterative refinement of the guidelines • Annotated separately • 5 documents for each iteration • New examples and rules were added to the annotation guidelines where necessary

  9. Annotation Agreement • Inter-Annotator Agreement (IAA) was calculated during each development cycle. • F1-score is used for calculation • Harmonic mean of recall and precision • Precision = # correct annotations / # annotations • Recall = # correct annotations / # existing concepts • The development process is repeated until annotator agreement reaches a threshold of 90%. • The guidelines are then finalised; no new rules are added to the guidelines. • Differences are resolved by a third annotator to make a gold standard corpus.
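The agreement measure on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: each annotation is represented as a (start, end, category) tuple, one annotator is treated as the reference, and only exact matches count as correct. The function name `agreement_f1` and the category labels are hypothetical and do not come from the slides.

```python
from typing import Set, Tuple

# Assumed representation: (start offset, end offset, category label).
Annotation = Tuple[int, int, str]


def agreement_f1(reference: Set[Annotation],
                 candidate: Set[Annotation]) -> Tuple[float, float, float]:
    """Return (precision, recall, F1) of candidate against reference.

    An annotation is correct only if span and category both match exactly.
    """
    correct = len(reference & candidate)
    precision = correct / len(candidate) if candidate else 0.0
    recall = correct / len(reference) if reference else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical annotations from two annotators on the same note.
annotator_a = {(0, 3, "procedure"), (10, 19, "finding"),
               (25, 30, "body structure"), (40, 44, "substance")}
annotator_b = {(0, 3, "procedure"), (10, 19, "finding"),
               (25, 30, "finding")}

p, r, f1 = agreement_f1(annotator_a, annotator_b)
print(f"precision={p:.3f} recall={r:.3f} f1={f1:.3f}")
```

With exact matching, swapping the reference and the candidate swaps precision and recall but leaves F1 unchanged, which is one reason F1 is convenient for agreement between two annotators.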

  10. IAA for the development cycle

  11. IAA for the whole corpus (311)

  12. Concept Frequency

  13. Comparison to other corpora • Comparison to corpora in the newswire, biomedical, and science (astronomy) domains. • Available corpora: MUC, GENIA, ASTRO

  14. Concept Identification Result • 279 documents for training • 32 documents for testing • 4656 tokens, 1218 concepts • Rule-based system (TTSCT) • Conditional Random Fields (CRF++) used as the learner. • Evaluated using the CoNLL-2000 evaluation script.
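Chunk-level evaluation in the CoNLL-2000 style scores whole concepts rather than individual tokens. The sketch below is a simplified Python illustration of that idea, not the actual evaluation script: it assumes BIO-style tags (`B-x`, `I-x`, `O`), and the function names and tag labels are hypothetical.

```python
from typing import List, Set, Tuple

Span = Tuple[int, int, str]  # (start token index, end token index exclusive, label)


def extract_spans(tags: List[str]) -> Set[Span]:
    """Collect labelled spans from a BIO tag sequence."""
    spans: Set[Span] = set()
    start, label = None, None
    # A trailing "O" sentinel closes any span still open at the end.
    for i, tag in enumerate(list(tags) + ["O"]):
        # Close the open span on "O", on a new "B-", or on a label change.
        if start is not None and (
            tag == "O" or tag.startswith("B-") or tag[2:] != label
        ):
            spans.add((start, i, label))
            start, label = None, None
        # Open a new span on any non-"O" tag when none is open.
        if tag != "O" and start is None:
            start, label = i, tag[2:]
    return spans


def chunk_scores(gold: List[str], pred: List[str]) -> Tuple[float, float, float]:
    """Return (precision, recall, F1) over exactly matching spans."""
    gold_spans, pred_spans = extract_spans(gold), extract_spans(pred)
    correct = len(gold_spans & pred_spans)
    precision = correct / len(pred_spans) if pred_spans else 0.0
    recall = correct / len(gold_spans) if gold_spans else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1


# Hypothetical gold and predicted tags for a five-token sentence.
gold_tags = ["B-finding", "I-finding", "O", "B-procedure", "O"]
pred_tags = ["B-finding", "I-finding", "O", "O", "B-procedure"]
print(chunk_scores(gold_tags, pred_tags))
```

In this example the two-token finding is matched exactly, while the procedure is predicted on the wrong token, so it counts as both a false positive and a false negative.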

  15. Concept Matcher Performance

  16. Machine Learning Results

  17. Inter Relation Annotation • Annotate relationships between concepts • Inter-concept relations • Relationship between two outermost concepts • CXR in ED bilateral mid-lower zone opacification

  18. Intra-Concept Relations • Relations between inner concepts and outermost concepts • Term decomposition • R groin abscess
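One way to picture term decomposition is as nested spans plus typed links between them. The Python sketch below is an assumption-laden illustration: it reads the slide's example as "R groin abscess" (right groin abscess), and the class names, category labels, and relation labels (`finding_site`, `laterality`) are hypothetical stand-ins, since the actual schema and relation types are shown only on image slides.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Concept:
    start: int      # character offset where the span begins
    end: int        # character offset where the span ends (exclusive)
    text: str       # surface text of the span
    category: str   # hypothetical semantic category label


@dataclass(frozen=True)
class Relation:
    source: Concept
    target: Concept
    label: str      # hypothetical relation type
    kind: str       # "intra" (within a composite concept) or "inter"


def make_concept(note: str, start: int, end: int, category: str) -> Concept:
    """Build a Concept whose text is sliced from the note."""
    return Concept(start, end, note[start:end], category)


def is_nested(inner: Concept, outer: Concept) -> bool:
    """True if inner lies within outer and is not the same concept."""
    return (outer.start <= inner.start and inner.end <= outer.end
            and inner != outer)


note = "R groin abscess"
outer = make_concept(note, 0, 15, "finding")         # composite concept
side = make_concept(note, 0, 1, "qualifier")         # "R"
site = make_concept(note, 2, 7, "body structure")    # "groin"
lesion = make_concept(note, 8, 15, "finding")        # "abscess"

intra_relations = [
    Relation(lesion, site, "finding_site", "intra"),
    Relation(site, side, "laterality", "intra"),
]

for rel in intra_relations:
    print(f"{rel.source.text} --{rel.label}--> {rel.target.text}")
```

Inter-concept relations would use the same `Relation` record with `kind="inter"` and two outermost concepts as endpoints.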

  19. Relation Types

  20. Inter + Intra Concept Relationships • Hemicolectomy and formation of ileostomy for bowel obstruction

  21. Relation Network
