Processing Semantic Relations Across Textual Genres Bryan Rink University of Texas at Dallas December 13, 2013
Outline • Introduction • Supervised relation identification • Unsupervised relation discovery • Proposed work • Conclusions
Motivation • We think about our world in terms of: • Concepts (e.g., bank, afternoon, decision, nose) • Relations (e.g. Is-A, Part-Whole, Cause-Effect) • Powerful mental constructions for: • Representing knowledge about the world • Reasoning over that knowledge: • From Part-Whole(brain, Human) and Is-A(Socrates, Human) • We can reason that Part-Whole(brain, Socrates)
Representation and Reasoning • Large general knowledge bases exist: • WordNet, Wikipedia/DBpedia/Yago, ConceptNet, OpenCyc • Some domain-specific knowledge bases exist: • Biomedical (UMLS) • Music (MusicBrainz) • Books (RAMEAU) • All of these are available in the standard RDF/OWL data model • Powerful reasoners exist for making inferences over data stored in RDF/OWL • Knowledge acquisition remains the most time-consuming and difficult of these steps
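To make the representation point concrete, here is a minimal sketch (not from the slides) of how such facts could be stored and queried as RDF with the rdflib library; the namespace and predicate names are invented for illustration only.

```python
# Minimal sketch (not from the slides): two relations stored as RDF triples
# with rdflib and queried with SPARQL. The ex: namespace and the partOf/isA
# predicates are hypothetical, chosen only for illustration.
from rdflib import Graph, Namespace

EX = Namespace("http://example.org/")
g = Graph()
g.add((EX.brain, EX.partOf, EX.Human))      # Part-Whole(brain, Human)
g.add((EX.Socrates, EX.isA, EX.Human))      # Is-A(Socrates, Human)

# Infer Part-Whole(brain, Socrates): parts of a class are parts of its instances.
results = g.query("""
    SELECT ?part ?instance WHERE {
        ?part ex:partOf ?class .
        ?instance ex:isA ?class .
    }""", initNs={"ex": EX})
for part, instance in results:
    print(f"Part-Whole({part.split('/')[-1]}, {instance.split('/')[-1]})")
```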
Relation Extraction from Text • Relations between concepts are encoded explicitly or implicitly in many textual resources: • Encyclopedias, news articles, emails, medical records, academic articles, web pages • For example: • “The report found Firestone made mistakes in the production of the tires.” Product-Producer(tires, Firestone)
Outline • Introduction • Supervised relation identification • Unsupervised relation discovery • Proposed work • Conclusions
Supervised Relation Identification • SemEval-2010 Task 8 – “Multi-Way Classification of Semantic Relations Between Pairs of Nominals” • Given a sentence and two marked nominals • Determine the semantic relation and directionality of that relation between the nominals. • Example: A small piece of rock landed into the trunk • This contains an Entity-Destination(piece, trunk) relation: • The situation described in the sentence entails the fact that trunk is the destination of piece in the sense of piece moving (in a physical or abstract sense) toward trunk.
Observations • Three types of evidence useful for classifying relations: • Lexical/Contextual cues • “The seniors poured flour into wax paper and threw the items as projectiles on freshmen during a morning pep rally” • Knowledge of the typical role of one nominal • “The rootball was in a crate the size of a refrigerator, and some of the arms were over 12 feet tall.” • Knowledge of a pre-existing relation between the nominals • “The Ca content in the corn flour has also a strong dependence on the pericarp thickness.”
Approach • Use an SVM classifier to first determine the relation type • Each relation type then has its own SVM classifier to determine the direction of the relation • All SVMs share the same set of 45 feature types, which fall into the following 8 categories: • Lexical/Contextual • Hypernyms from WordNet • Dependency parse • PropBank parse • FrameNet parse • Nominalization • Nominal similarity derived from Google N-Grams • TextRunner predicates
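As an illustration of the two-stage setup, the sketch below (my own, assuming scikit-learn; the slides do not specify an implementation) trains one SVM for the relation type and one per-type SVM for direction, with feature extraction reduced to a single placeholder feature and toy training data standing in for the SemEval annotations.

```python
# Sketch of the two-stage classification described above, using scikit-learn.
from sklearn.feature_extraction import DictVectorizer
from sklearn.svm import LinearSVC

# Toy stand-in for annotated training data (the real data is SemEval-2010 Task 8).
train_instances = [
    {"between": ["caused", "a"], "relation": "Cause-Effect", "direction": "(e1,e2)"},
    {"between": ["resulted", "from"], "relation": "Cause-Effect", "direction": "(e2,e1)"},
    {"between": ["of", "the"], "relation": "Component-Whole", "direction": "(e1,e2)"},
    {"between": ["with", "its"], "relation": "Component-Whole", "direction": "(e2,e1)"},
]

def extract_features(inst):
    # Placeholder for the 45 feature types; here only words between the nominals.
    return {"between=" + w: 1 for w in inst["between"]}

vec = DictVectorizer()
X = vec.fit_transform([extract_features(i) for i in train_instances])

# Stage 1: one SVM chooses the relation type.
type_clf = LinearSVC().fit(X, [i["relation"] for i in train_instances])

# Stage 2: a per-relation SVM chooses the direction.
dir_clfs = {}
for rel in {i["relation"] for i in train_instances}:
    subset = [i for i in train_instances if i["relation"] == rel]
    X_rel = vec.transform([extract_features(i) for i in subset])
    dir_clfs[rel] = LinearSVC().fit(X_rel, [i["direction"] for i in subset])

def predict(inst):
    x = vec.transform([extract_features(inst)])
    rel = type_clf.predict(x)[0]
    return rel, dir_clfs[rel].predict(x)[0]

print(predict({"between": ["caused", "the"]}))
```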
Lexical/Contextual Features • Words between the nominals are very important: • Number of tokens between the nominals is also helpful: • Product-Producer, Entity-Origin often have zero: “organ builder”, “Coconut oil” • Additional features for: • E1/E2 words, E1/E2 part of speech, Words before/after the nominals, Prefixes of words between • Sequence of word classes between the nominals: • Verb_Determiner, Preposition_Determiner, Preposition_Adjective_Adjective, etc.
Example Feature Values • Sentence: Forward [motion]E1 of the vehicle through the air caused a [suction]E2 on the road draft tube. • Feature values: • e1Word=motion, e2Word=suction • e1OrE2Word={motion, suction} • between={of, the, vehicle, through, the, air, caused, a} • posE1=NN, posE2=NN • posE1orE2=NN • posBetween=I_D_N_I_D_N_V_D • distance=8 • wordsOutside={Forward, on} • prefix5Between={air, cause, a, of, the, vehic, throu, the}
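A small sketch of how these lexical/contextual features could be computed from a tokenized, POS-tagged sentence. The word-class abbreviation is approximated here by the first letter of the Penn Treebank tag, which is my assumption, not necessarily the original implementation.

```python
# Sketch of the lexical/contextual features from the example above.
def lexical_features(tokens, pos, i1, i2):
    between = tokens[i1 + 1:i2]
    return {
        "e1Word": tokens[i1],
        "e2Word": tokens[i2],
        "e1OrE2Word": {tokens[i1], tokens[i2]},
        "between": between,
        "posE1": pos[i1],
        "posE2": pos[i2],
        # Word-class sequence, approximated by the first letter of each Penn tag.
        "posBetween": "_".join(p[0] for p in pos[i1 + 1:i2]),
        "distance": len(between),
        "wordsOutside": [t for t in (tokens[i1 - 1] if i1 > 0 else None,
                                     tokens[i2 + 1] if i2 + 1 < len(tokens) else None) if t],
        "prefix5Between": {w[:5] for w in between},
    }

tokens = ["Forward", "motion", "of", "the", "vehicle", "through", "the",
          "air", "caused", "a", "suction", "on", "the", "road", "draft", "tube"]
pos = ["RB", "NN", "IN", "DT", "NN", "IN", "DT", "NN", "VBD", "DT",
       "NN", "IN", "DT", "NN", "NN", "NN"]
print(lexical_features(tokens, pos, 1, 10))
```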
Parsing Features • Dependency Parse (Stanford parser) • Paths of length 1 from each nominal • Paths of length 2 between E1 and E2 • PropBank SRL Parse (ASSERT) • Predicate associated with both nominals • Number of tokens in the predicate • Hypernyms of predicate • Argument types of nominals • FrameNet SRL Parse (LTH) • Lemmas of frame trigger words, with and without part of speech • Also make use of VerbNet to generalize verbs from dependency and PropBank parses
Example Feature Values • Sentence: Forward [motion]E1 of the vehicle through the air caused a [suction]E2 on the road draft tube. • Dependency • <E1> nsubj caused dobj <E2> • <E1> nsubj vn:27 dobj <E2> • VerbNet/Levin class 27 is the class of engender verbs such as: cause, spawn, generate, etc. • This feature value indicates that E1 is the subject of an engender verb, and the direct object is E2 • PropBank • Hypernyms of the predicate: cause#v#1, create#v#1
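The slides use the Stanford parser; the sketch below substitutes spaCy purely to illustrate how a length-2 dependency path between the nominals can be read off when both attach to the same head word (the generalization to vn:27 would be a separate VerbNet lookup).

```python
# Sketch of a dependency-path feature. The original system uses the Stanford
# parser; spaCy is used here only for illustration.
import spacy

nlp = spacy.load("en_core_web_sm")
doc = nlp("Forward motion of the vehicle through the air caused a suction "
          "on the road draft tube.")

e1 = next(t for t in doc if t.text == "motion")
e2 = next(t for t in doc if t.text == "suction")

# Length-2 path: both nominals attach directly to a shared head word.
if e1.head == e2.head:
    path = f"<E1> {e1.dep_} {e1.head.lemma_} {e2.dep_} <E2>"
    print(path)   # e.g. "<E1> nsubj cause dobj <E2>"
```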
Nominal Role Affiliation Features • Sometimes context is not enough and we must use background knowledge about the nominals • Consider the nominal: writer • Knowing that a writer is a person increases the likelihood that the nominal will act as a Producer or an Agency • Use WordNet hypernyms for the nominal’s sense determined by SenseLearner • Additionally, writer nominalizes the verb write, which is classified by Levin as a “Creation and Transformation” verb. • Most likely to act as a Producer • Use NomLex-Plus to determine the verb being nominalized and retrieve the Levin class from VerbNet
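A minimal sketch of the WordNet hypernym cue, using NLTK and simplifying sense selection to the first synset (the system itself uses SenseLearner for sense disambiguation):

```python
# Sketch of the WordNet-hypernym role-affiliation cue: walk up the hypernym
# chain of "writer" and check whether "person" appears.
# Requires: nltk.download('wordnet')
from nltk.corpus import wordnet as wn

sense = wn.synsets("writer")[0]                    # first sense stands in for SenseLearner
hypernyms = {s.name() for s in sense.closure(lambda s: s.hypernyms())}
print("person.n.01" in hypernyms)                  # True: a writer is a person
```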
Google N-Grams for Nominal Role Affiliation • Semantically-similar nominals should participate in the same roles • They should also occur in similar contexts in a large corpus • Using Google 5-grams, the 1,000 most frequent words appearing in the context of a nominal are collected • Using Jaccard similarity on those context words, the 4 nearest neighbor nominals are determined, and used as a feature • Also, determine the role most frequently associated with those neighbors
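The neighbor computation itself is simple; the sketch below uses invented toy context sets in place of the counts collected from the Google 5-grams.

```python
# Sketch of the nearest-neighbor computation: Jaccard similarity over the
# most frequent context words of each nominal. The context sets below are
# toy stand-ins for statistics gathered from the Google 5-gram corpus.
def jaccard(a, b):
    return len(a & b) / len(a | b) if a | b else 0.0

contexts = {
    "legion":  {"of", "the", "foreign", "roman", "a"},
    "army":    {"of", "the", "roman", "a", "his"},
    "heroes":  {"of", "the", "war", "a", "our"},
    "writer":  {"the", "a", "famous", "book", "this"},
}

def nearest_neighbors(nominal, k=4):
    scored = [(jaccard(contexts[nominal], ctx), other)
              for other, ctx in contexts.items() if other != nominal]
    return [other for _, other in sorted(scored, reverse=True)[:k]]

print(nearest_neighbors("legion"))
```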
Example Values for Google N-Grams Feature • Sentence 4739: As part of his wicked plan, Pete promotes Mickey and his pals into the [legion]E1 of [musketeers]E2 and assigns them to guard Minnie. • Member-Collection(E2, E1) • E1 nearest neighbors: legion, army, heroes, soldiers, world • Most frequent role: Collection • E2 nearest neighbors: musketeers, admirals, sentries, swordsmen, larks • Most frequent role: Member
Pre-existing Relation Features • Sometimes the context gives few clues about the relation • Can use knowledge about a context-independent relation between the nominals • TextRunner • A queryable database of Noun-Verb-Noun triples from a large corpus of web text • Plug in E1 and E2 as the nouns and query for predicates that occur between them
Example Feature Values for TextRunner Features • Sentence: Forward [motion]E1 of the vehicle through the air caused a [suction]E2 on the road draft tube. • E1 ____ E2 : may result from, to contact, created, moves, applies, causes, fall below, corresponds to which • E2 ____ E1 : including, are moved under, will cause, according to, are effected by, repeats, can match
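TextRunner is queried as an external database; the sketch below stands in for that with a small in-memory list of (noun, predicate, noun) triples, so the triples shown are placeholders rather than real TextRunner output.

```python
# Sketch of the pre-existing-relation feature: look up predicates that occur
# between E1 and E2 in either direction. The triple list is a toy stand-in
# for the TextRunner database.
triples = [
    ("motion", "may result from", "suction"),
    ("motion", "causes", "suction"),
    ("suction", "will cause", "motion"),
]

def predicates_between(e1, e2):
    forward = [p for a1, p, a2 in triples if a1 == e1 and a2 == e2]
    backward = [p for a1, p, a2 in triples if a1 == e2 and a2 == e1]
    return {"e1_pred_e2": forward, "e2_pred_e1": backward}

print(predicates_between("motion", "suction"))
```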
Learning Curve • [Chart: classification performance improves from 73.08 to 77.02, 79.93, and 82.19 as the training size increases]
Ablation Tests • All 255 (= 2^8 – 1) combinations of the 8 feature sets were evaluated by 10-fold cross validation • Lexical is the single best feature set, Lexical+Hypernym is the best 2-feature set combination, etc.
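A sketch of the ablation loop (scikit-learn assumed; build_matrix is a hypothetical helper that vectorizes only the selected feature categories):

```python
# Sketch of the ablation procedure: score every non-empty subset of the
# 8 feature categories with 10-fold cross-validation.
from itertools import combinations
from sklearn.model_selection import cross_val_score
from sklearn.svm import LinearSVC

FEATURE_SETS = ["lexical", "hypernym", "dependency", "propbank",
                "framenet", "nominalization", "ngram", "textrunner"]

def ablation(instances, labels, build_matrix):
    scores = {}
    for r in range(1, len(FEATURE_SETS) + 1):
        for subset in combinations(FEATURE_SETS, r):   # 2^8 - 1 = 255 subsets
            X = build_matrix(instances, subset)        # hypothetical vectorizer
            scores[subset] = cross_val_score(LinearSVC(), X, labels, cv=10).mean()
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
```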
Other Supervised Tasks • Causal relations between events – FLAIRS 2010
Causal Relations Between Events • Discovered graph patterns that were then used as features in a supervised classifier • Example pattern: • “Under the agreement”, “In the affidavits”, etc.
Detecting Indications of Appendicitis in Radiology Reports • Submitted to AMIA TBI 2013
Resolving Coreference in Medical Records • i2b2 2011 and JAMIA 2012 • Approach • Based on Stanford Multi-Pass Sieve method • Added supervised learning by introducing features to each pass • Showed that creating a first pass which identifies all the mentions of the patient provides a competitive baseline
Extracting Relations Between Concepts in Medical Records • i2b2 2010 Shared Task and JAMIA 2011
Supervised Relations Conclusion • Identifying semantic relations requires going beyond contextual and lexical features • Use the fact that arguments sometimes have a high affinity for one of the semantic roles • Knowledge of pre-existing relations can aid classification when context is not enough
Outline • Introduction • Supervised relation identification • Unsupervised relation discovery • Proposed work • Conclusions
Relations in Electronic Medical Records • Medical records contain natural language narrative with very valuable information • Often in the form of a relation between medical treatments, tests, and problems • Example: • … with the [transfusion] and [IV Lasix] she did not go into [flash pulmonary edema] • Treatment-Improves-Problem relations: • (transfusion, flash pulmonary edema) • (IV Lasix, flash pulmonary edema)
Relations in Electronic Medical Records • Additional examples: • [Anemia] secondary to [blood loss]. • A causal relationship between problems • On [exam] , the patient looks well and lying down flat in her bed with no [acute distress] . • Relationship between a medical test (“exam”) and what it revealed (“acute distress”). • We consider both positive and negative findings.
Relations in Electronic Medical Records • Utility • Detected relations can aid information retrieval • Automated systems which review patient records for unusual circumstances • Drugs prescribed despite previous allergy • Tests and treatments never performed despite recommendation
Relations in Electronic Medical Records • Unsupervised detection of relations • No need for large annotation efforts • Easily adaptable to new hospitals, doctors, medical domains • Does not require a pre-defined set of relation types • Discover relations actually present in the data, not what the annotator thinks is present • Relations can be informed by very large corpora
Unsupervised Relation Discovery • Assumptions: • Relations exist between entities in text • Those relations are often triggered by contextual words: trigger words • Secondary to, improved, revealed, caused • Entities in relations belong to a small set of semantic classes • Anemia, heart failure, edema: problems • Exam, CT scan, blood pressure: tests • Entities near each other in text are more likely to have a relation
Unsupervised Relation Discovery • Latent Dirichlet Allocation baseline • Assume entities have already been identified • Form pseudo-documents for every consecutive pair of entities: • Words from first entity • Words between the entities • Words from second entity • Example: • If she has evidence of [neuropathy] then we would consider a [nerve biopsy] • Pseudo-document: {neuropathy, then, we, would, consider, a, nerve, biopsy}
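A sketch of this baseline: one pseudo-document per consecutive entity pair, fed to a topic model (scikit-learn's LDA is substituted here; the slides do not name an implementation).

```python
# Sketch of the LDA baseline: build a pseudo-document from the words of the
# first entity, the words between the entities, and the words of the second
# entity, then fit a topic model over all such pseudo-documents.
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.feature_extraction.text import CountVectorizer

def pseudo_document(tokens, span1, span2):
    (s1, e1), (s2, e2) = span1, span2
    return " ".join(tokens[s1:e1] + tokens[e1:s2] + tokens[s2:e2])

tokens = "If she has evidence of neuropathy then we would consider a nerve biopsy".split()
docs = [pseudo_document(tokens, (5, 6), (11, 13))]
# In practice: one pseudo-document per consecutive entity pair in the corpus.

vec = CountVectorizer()
X = vec.fit_transform(docs)
lda = LatentDirichletAllocation(n_components=2, random_state=0).fit(X)
```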
Unsupervised Relation Discovery • These pseudo-documents lead LDA to form clusters such as:
Unsupervised Relation Discovery • Clusters formed by LDA: • Some good trigger words • Many stop words as well • No differentiation between: • Words in the first argument • Words between the arguments • Words in the second argument • We can do better by modeling the linguistic phenomenon more directly
Relation Discovery Model (RDM) • Three observable variables: • w1 : Tokens from the first argument • wc : Context words (between the arguments) • w2 : Tokens from the second argument • Example: • Recent [chest x-ray] shows [resolving right lower lobe pneumonia] . • w1: {chest, x-ray} • wc: {shows} • w2: {resolving, right, lower, lobe, pneumonia}
Relation Discovery Model (RDM) • In RDM: • A relation type (tr) is generated • Context words (wc) are generated from: • Relation type-specific word distribution (showed, secondary, etc.); or • General word distribution (she, patient, hospital) • Relation type-specific semantic classes for the arguments are generated • e.g. a problem-causes-problem relation would be unlikely to generate a test or a treatment class • Argument words (w1, w2) are generated from argument class-specific word distributions • “pneumonia”, “anemia”, “neuropathy” from a problem class
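The generative story can be summarized in code; the sketch below uses random placeholder distributions and simplifies the plate structure (e.g., a fixed 50/50 switch between relation-specific and general context words), so it illustrates the steps rather than the exact model.

```python
# Simplified sketch of the RDM generative story with numpy; all distributions
# are random placeholders, not learned parameters.
import numpy as np

rng = np.random.default_rng(0)
R, C, V = 8, 6, 1000         # relation types, argument classes, vocabulary size

theta_r   = rng.dirichlet(np.ones(R))                 # relation type proportions
phi_ctx   = rng.dirichlet(np.ones(V), size=R)         # per-relation context words
phi_gen   = rng.dirichlet(np.ones(V))                 # general (background) words
psi_arg   = rng.dirichlet(np.ones(C), size=(R, 2))    # per-relation argument classes
phi_class = rng.dirichlet(np.ones(V), size=C)         # per-class argument words

def generate(n_ctx=3, n_arg=2):
    t = rng.choice(R, p=theta_r)                                 # 1. relation type
    wc = [rng.choice(V, p=phi_ctx[t]) if rng.random() < 0.5
          else rng.choice(V, p=phi_gen) for _ in range(n_ctx)]   # 2. context words
    c1, c2 = (rng.choice(C, p=psi_arg[t, i]) for i in range(2))  # 3. argument classes
    w1 = [rng.choice(V, p=phi_class[c1]) for _ in range(n_arg)]  # 4. argument words
    w2 = [rng.choice(V, p=phi_class[c2]) for _ in range(n_arg)]
    return t, wc, (c1, w1), (c2, w2)

print(generate())
```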
Relation Discovery Model (RDM) • Graphical model:
Experimental Setup • Dataset • 349 medical records from 4 hospitals • Annotated with: • Entities: problems, treatments, tests • Relations: Used to evaluate our unsupervised approach • Treatment-Addresses-Problem • Treatment-Causes-Problem • Treatment-Improves-Problem • Treatment-Worsens-Problem • Treatment-Not-Administered-Due-To-Problem • Test-Reveals-Problem • Test-Conducted-For-Problem • Problem-Indicates-Problem
Results • Trigger word clusters formed by the RDM:
Results • Instances of “connected problems” • (The last example is actually a Treatment-Administered-For-Problem relation)
Results • Instances of “Test showed”
Results • Instances of “prescription”
Results • Instances of “prescription 2”
Results • Discovered Argument Classes
Evaluation • Two versions of the data: • DS1: Consecutive pairs of entities which have a manually identified relation between them • DS2: All consecutive pairs of entities • Train/Test sets: • Train: 349 records, with 5,264 manually annotated relations • Test: 477 records, with 9,069 manually annotated relations
Evaluation • Evaluation metrics • NMI: Normalized Mutual Information • An information-theoretic measure of how well two clusterings match • F measure: • Computed from cluster precision and cluster recall • Each discovered cluster is paired with the gold cluster that maximizes its F score
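A sketch of the two metrics: NMI via scikit-learn, and a best-match cluster F measure in which each discovered cluster is scored against the gold cluster that maximizes its F score (the exact pairing and weighting used in the thesis evaluation may differ).

```python
# Sketch of the clustering evaluation: NMI plus a size-weighted best-match
# cluster F measure. Labels below are toy data for illustration.
from sklearn.metrics import normalized_mutual_info_score

def cluster_f(gold, predicted):
    total = 0.0
    for p in set(predicted):
        p_items = {i for i, c in enumerate(predicted) if c == p}
        best = 0.0
        for g in set(gold):
            g_items = {i for i, c in enumerate(gold) if c == g}
            overlap = len(p_items & g_items)
            if overlap:
                prec, rec = overlap / len(p_items), overlap / len(g_items)
                best = max(best, 2 * prec * rec / (prec + rec))
        total += best * len(p_items)        # weight each cluster by its size
    return total / len(predicted)

gold      = ["Test-Reveals", "Test-Reveals", "Treat-Addresses", "Prob-Indicates"]
predicted = [0, 0, 1, 1]
print(normalized_mutual_info_score(gold, predicted), cluster_f(gold, predicted))
```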