A Hands-on Introduction to Natural Language Processing in Healthcare
Annotation as a Central Task for Development of NLP Systems
MedInfo 2010 Congress, September 11, 2010
Scott Duvall, Brett South, Stéphane Meystre
NLP applied to Clinical Documents • Detailed clinical information is often locked in free-text clinical note documents. • Applying Natural Language Processing (NLP) methods to clinical free text allows more detailed document-level review. • Clinical information can be used for DSS, Q/A, research, performance improvement, surveillance, etc. Annotation is a central task for evaluation of NLP systems used for Information Retrieval (IR) or Information Extraction (IE) tasks.
Why and When to Annotate? • Annotation as a central task: • Manually annotated corpora focus and clarify NLP system requirements. • Establish reference standard(s) to train and evaluate NLP tools applied to clinical texts for various tasks. • Provide data for NLP system development (supervised learning). • Extraction rules may be created automatically or by hand. • Statistical models of text documents are built by machine learning algorithms (a minimal training sketch follows below).
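To make the supervised-learning point concrete, here is a minimal sketch of training a statistical model from manually annotated sentences. The toy sentences, labels, and the choice of scikit-learn are assumptions for illustration, not part of the workshop materials.

```python
# Minimal sketch: a bag-of-words classifier trained on manually
# annotated sentences (hypothetical toy data, not the workshop corpus).
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression

# Each pair is (sentence, 1 if annotators marked a diagnosis, else 0).
annotated = [
    ("Patient has a history of type 2 diabetes mellitus.", 1),
    ("Follow up in two weeks with primary care.", 0),
    ("Chest x-ray revealed right lower lobe pneumonia.", 1),
    ("The patient tolerated the procedure well.", 0),
]
texts, labels = zip(*annotated)

vectorizer = CountVectorizer(lowercase=True)
X = vectorizer.fit_transform(texts)          # bag-of-words features
model = LogisticRegression().fit(X, labels)  # learns from the annotations

# The trained model can now score unseen clinical sentences.
print(model.predict(vectorizer.transform(["CT showed acute appendicitis."])))
```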
What to annotate and at what level? • Meta information: document type, annotator, author, institution, clinic. • Document level: sections, headers, paragraphs, templates, or other document-level assessments. • Lexical: semantic categories of words. • Syntax: how words combine in well-defined structures to produce sentences: part-of-speech (POS) tags, syntactic parse trees, grammatical relations. • Semantics: how individual word senses combine to form the meaning of a sentence.
What to annotate and at what level? • Pragmatics: clinical inference and context that affect the interpretation of meaning. • Domain, report section, location within the section of a report, or other implicit information. • Discourse: links between annotated instances, within or across sentences. • Previous information affects the interpretation of the current information. • Includes referents (pronouns, definite or bridging references), time of events, and coherence of sentences. • World knowledge: facts about the world at large and/or common sense. These levels may differ depending on the specific use case, the clinical question, and the goals of the NLP application.
Semantic annotation • Concepts (“markables”): the types of information to annotate, defined at the instance level. • Use case dependent: • Focus on noun phrases only? • Focus on specific semantic types (diagnoses, findings, treatments, procedures, etc.)? • Modifiers (“attributes”): features of the annotated information. • Negation, experiencer, temporality, certainty, change over time, severity, numeric values, anatomic locations, note section, information quality, etc. (see the representation sketch below).
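One way to picture a markable with its attributes is as a span plus a set of attribute values. This is a minimal sketch; the class name, fields, and example offsets are hypothetical, not Knowtator's internal representation.

```python
# A minimal sketch of one way to represent an annotated instance.
from dataclasses import dataclass, field

@dataclass
class Annotation:
    """One 'markable': a character span plus its attribute values."""
    doc_id: str
    start: int            # character offset where the span begins
    end: int              # character offset where the span ends
    text: str             # the covered text
    semantic_type: str    # e.g. "Diagnosis", "Finding", "Procedure"
    attributes: dict = field(default_factory=dict)

# Hypothetical example: a negated diagnosis mention.
ann = Annotation(
    doc_id="discharge_001",
    start=15, end=42,
    text="peripheral arterial disease",
    semantic_type="Diagnosis",
    attributes={"negation": "negated", "experiencer": "patient"},
)
print(ann)
```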
Some jargon • Annotation guideline: • Defines what qualifies as a “markable” for a given use case, how annotated instances should be identified, and which attributes are associated with annotated instances. • In other words, the rules of the game: it defines what information will be used to train and evaluate the performance of the NLP system. • Annotation schema: • Provides a logical representation of the annotation guideline (a toy schema sketch follows below).
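A schema can be sketched as a mapping from each class (markable type) to its slots (attributes) and their allowed values, against which annotations can be checked. The classes, slots, and values below are hypothetical, not the workshop's actual schema.

```python
# Hypothetical schema: each class lists its slots (attributes) and the
# values annotators may choose for them.
SCHEMA = {
    "Diagnosis": {
        "negation": {"affirmed", "negated"},
        "temporality": {"current", "historical"},
        "certainty": {"certain", "uncertain"},
    },
}

def validate(semantic_type: str, attributes: dict) -> bool:
    """Check an annotation's attributes against the schema."""
    slots = SCHEMA.get(semantic_type, {})
    return all(k in slots and v in slots[k] for k, v in attributes.items())

print(validate("Diagnosis", {"negation": "negated"}))  # True
print(validate("Diagnosis", {"negation": "maybe"}))    # False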
Common annotation tasks [diagram: Task 1, Task 2, Task 3]
What is measured at the task level? • Estimate of reliability (task consistency): • IAA = matches / (matches + nonmatches) • Partial (spans of annotated instances overlap) • Exact (spans of annotated instances match exactly) • Measurement of validity (task accuracy): • Recall = TP / (TP + FN), Precision = TP / (TP + FP) • F-measure = (1 + β²) · P · R / (β² · P + R) • These metrics will be discussed in more depth in Part 2 (a small calculation sketch follows below).
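The formulas above translate directly into code. This is a minimal sketch; the example counts (TP, FP, FN) are made up for illustration.

```python
def iaa(matches: int, nonmatches: int) -> float:
    """Inter-annotator agreement: proportion of matching annotations."""
    return matches / (matches + nonmatches)

def precision(tp: int, fp: int) -> float:
    return tp / (tp + fp)

def recall(tp: int, fn: int) -> float:
    return tp / (tp + fn)

def f_measure(p: float, r: float, beta: float = 1.0) -> float:
    """Weighted harmonic mean of precision and recall (beta=1 gives F1)."""
    b2 = beta ** 2
    return (1 + b2) * p * r / (b2 * p + r)

# Hypothetical counts from comparing a system to the reference standard:
p, r = precision(tp=80, fp=10), recall(tp=80, fn=20)
print(f"IAA={iaa(90, 30):.2f}  P={p:.2f}  R={r:.2f}  F1={f_measure(p, r):.2f}")
```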
Who should do annotation tasks? • Who: depends on the use case and annotation goals. • Some use cases may need many annotators. • Level of domain expertise varies (physicians, nurses, nurse practitioners, pharmacists, physician assistants, coders… and yes, even graduate students). • Depends on the level of clinical inference required.
The annotation task • Use case: focus on extracting as many explicitly mentioned diagnoses as possible from a collection of 75 discharge summaries selected from one of the i2b2 Challenge tasks. • Goals: • Illustrate the level of difficulty involved with annotation tasks. • Demonstrate use of an annotation guideline and schema to develop a reference standard. • Demonstrate calculation of evaluation metrics in terms of task consistency and accuracy (i.e., IAA, precision, recall, F-measure).
Workshop annotation task • The good news (things we built for you): • We don’t expect you to infer clinical diagnoses (no discourse or linking of concepts across sentences). • We have already developed an annotation guideline and schema for this task. • Diagnoses are loosely based on semantic types from the UMLS.
Workshop annotation task • The bad news (or the challenge): • One of the attributes we will identify is negation status, e.g., “No evidence of peripheral arterial disease”. • This task has a certain level of difficulty, but it is a good demonstration of building a reference standard and a practical application of NLP (a simple negation-cue sketch follows below).
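To show why negation status is tractable but not trivial, here is a minimal sketch loosely in the spirit of cue-based approaches such as NegEx. The cue list, function name, and before-the-concept heuristic are simplifying assumptions for illustration, not the full NegEx algorithm or the workshop's method.

```python
import re

# A few negation cues loosely inspired by NegEx (not the full trigger list).
NEGATION_CUES = re.compile(
    r"\b(no evidence of|denies|negative for|without|no)\b", re.IGNORECASE
)

def is_negated(sentence: str, concept: str) -> bool:
    """True if a negation cue appears before the concept in the sentence."""
    pos = sentence.lower().find(concept.lower())
    if pos < 0:
        return False
    return bool(NEGATION_CUES.search(sentence[:pos]))

print(is_negated("No evidence of peripheral arterial disease.",
                 "peripheral arterial disease"))   # True
print(is_negated("History of peripheral arterial disease.",
                 "peripheral arterial disease"))   # False
```

A heuristic like this misses scope limits ("no fever but positive cough") and post-concept cues ("chest pain was ruled out"), which is exactly the kind of ambiguity human annotators must resolve when building the reference standard.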
Protégé/Knowtator • What tool will be used? • For annotation tasks we will use the Knowtator plugin written for the Protégé knowledge representation system. • Knowtator facilitates annotation and adjudication tasks. • A final reference standard has been created and will be available to participants. • Concepts (“markables”) are called “classes”. • Modifiers (“attributes”) are called “slots”.
Hands-on component • Install Protégé 3.3.1 and Knowtator 1.9, available from: • Protégé: http://protege.cim3.net/download/old-releases/3.3.1/basic • Knowtator: http://knowtator.sourceforge.net • Review the annotation guideline and try using the Knowtator schema. • Annotate the first 5 documents. • Don’t panic! Ask any of the instructors for help; this is a hands-on exercise.
Thank you for your attention! • For more information: • Brett.South@hsc.utah.edu • Shuying.Shen@hsc.utah.edu • Scott.Duvall@hsc.utah.edu • Stephane.Meystre@hsc.utah.edu • TA: Chris.Leng@utah.edu