Adapting Text instead of the Model: An Open Domain Approach
Gourab Kundu, Dan Roth
University of Illinois at Urbana-Champaign
Motivating Example #1
Original Sentence: Scotty gazed at ugly gray slums . (predicate: gazed; predicted semantic role: AM-LOC. Wrong)
Transformed Sentence: Scotty looked at ugly gray slums . (predicted semantic role: A1. Correct!)
Motivating Example #2
Original Sentence (argument of the predicate mislabeled. Wrong): He was discharged from the hospital after a two-day checkup and he and his parents had what Mr. Mckinley described as a “celebration lunch” in the campus.
Transformed Sentence (argument labeled AM-TMP. Correct!): He was discharged from the hospital after a two-day examination and he and his parents had what Mr. Mckinley described as a “celebration lunch” in the campus.
Research Question
Can text be perturbed automatically to yield better NLP analysis? We study this question in the context of semantic role labeling (SRL), focusing on improving the performance of SRL on a new domain.
Outline Overview of Domain Adaptation Overview of Adaptation Using Transformations (ADUT) Transformation Functions Combination Strategy Experimental Results Conclusion
Domain Adaptation
• Models trained on one domain perform significantly worse on another domain
• Semantic Role Labeling: WSJ domain (76%), Fiction domain (65%)
• An important problem for wide-scale NLP:
• Adaptation is a problem for many NLP tasks
• Natural language varies across many different domains
• Labeling is expensive and time consuming
Current Approaches to Domain Adaptation
• Labeled adaptation: uses labeled data from the new domain
• Unlabeled adaptation: uses unlabeled data from the new domain
• Combined adaptation: combines labeled and unlabeled data
Representative work:
• ChelbaAc04, Adaptation of a maximum entropy capitalizer: Little data can help a lot
• Daume07, Frustratingly easy domain adaptation
• FinkelMa09, Hierarchical Bayesian domain adaptation
• BlitzerMcPe06, Domain adaptation with structural correspondence learning
• HuangYa09, Distributional representations for handling sparsity in supervised sequence labeling
• JiangZh07, Instance weighting for domain adaptation in NLP
• ChangCoRo10, The necessity of combining adaptation methods
Limitations (Retraining Takes Time)
[Diagram: target-domain unlabeled data is used to retrain the model shared by NLP Tool 1 … NLP Tool N]
• Need to retrain the model, which can take a long time (retraining SRL: 20 hours)
Limitations (Some Tools Are Hard to Retrain)
[Diagram: the retraining path is unavailable for third-party NLP Tools 1 … N]
• Need to retrain other people’s tools, which may require reimplementation
Limitations (Insufficient Unlabeled Data)
[Diagram: the available target-domain unlabeled data may not be sufficient]
• Need significant unlabeled data, which may not be available (e.g. a single website)
Outline Overview of Domain Adaptation Overview of Adaptation Using Transformations (ADUT) Transformation Functions Combination Strategy Experimental Results Conclusion
ADaptation Using Transformations (ADUT)
[Diagram: input sentence s → Transformation Module → transformed sentences t1, t2, …, tk → old model → outputs o1, o2, …, ok → Combination Module → output o]
Traditional approach: adapt the model for the new text. Our approach: adapt the text for the old model.
Transformation Functions
• Definition: a function that maps an instance to a set of instances
• Example: replacing a word with synonyms that are common in the training data
• Properties:
• Label (semantic role) preserving
• Output examples are more likely than the input example to appear in the source domain
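As a concrete illustration, a minimal synonym-replacement transformation function can be sketched as below. The synonym table here is a made-up stand-in; the actual system derives replacements from resources such as WordNet and word clusters.

```python
# Hypothetical table mapping rare target-domain words to synonyms
# that are frequent in the source (training) domain.
SYNONYMS = {
    "gazed": ["looked", "stared"],
    "checkup": ["examination"],
}

def transform(sentence):
    """Map one sentence to a set of transformed sentences by replacing
    each rare word with its frequent-synonym alternatives."""
    tokens = sentence.split()
    outputs = []
    for i, tok in enumerate(tokens):
        for syn in SYNONYMS.get(tok, []):
            outputs.append(" ".join(tokens[:i] + [syn] + tokens[i + 1:]))
    return outputs

# transform("Scotty gazed at ugly gray slums .")
# → ["Scotty looked at ugly gray slums .", "Scotty stared at ugly gray slums ."]
```

The function is label preserving in the intended sense: swapping "gazed" for "looked" should not change the semantic role of "at ugly gray slums".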
Categorization of Transformation Functions
• Resource-based transformations: use resources and prior knowledge
• Learned transformations: learned from training data
Resource Based Transformation Replacement of Infrequent Predicate Replacement/Removal of Quoted String Replacement of Unknown Word (Word Cluster, WordNet) Sentence Simplification
Replacement of Infrequent Predicate (VerbNet) Input Sentence Scotty gazed at ugly gray slums . Transformed Sentence Scotty looked at ugly gray slums . Intuition: the model makes better predictions on frequent predicates.
Replacement/Removal of Quoted String Input Sentence “We just sit quiet” , he said . Transformed Sentences We just sit quiet. He said, “We just sit quiet”. He said, “This is good”. Intuition: Parser works better on simplified quoted sentences.
Replacing Unknown Words (Word Cluster, WordNet) Input Sentence He was released after a two-day checkup. Transformed Sentence He was released after a two-day examination. Intuition: the parser and model work better on known words.
Sentence Simplification (1)
Input Sentence: The science teacher and the students discussed the issue at the classroom .
Transformation: delete the PP
Transformed Sentence: The science teacher and the students discussed the issue .
Intuition: the parser and model work better on simplified sentences.
Sentence Simplification (2)
Input Sentence: The science teacher and the students discussed the issue .
Transformation: simplify the NP
Transformed Sentence: The teacher discussed the issue .
Learned Transformation Rules
Input Sentence: Mr. Mckinley was entitled to a discount . (gold role of "Mr. Mckinley": A2)
Positions relative to the predicate: -2 NP (Mr. Mckinley), -1 AUX (was), 0 predicate (entitled), 1 (to); pattern sp=[-2,NP,][-1,AUX,][1,,to]
• Motivation:
• Identify a specific context in the input sentence
• Transfer the candidate argument to a simpler context in which the SRL is more robust
Context Component of Rules
Input Sentence: Mr. Mckinley was entitled to a discount . (positions -2 … 2 relative to the predicate)
• Rule:
• predicate p = entitle
• pattern sp = [-2,NP,][-1,AUX,][1,,to]
• location of source phrase ns = -2
• replacement sentence st = “But he did not sing .”
• location of replacement phrase nt = -3
• label correspondence function f = {(A0,A2), (Ai,Ai) for i ≠ 0}
Replacement Component of Rules
Replacement Sentence: But he did not sing . (positions -4 … 1 relative to the predicate)
• Rule:
• predicate p = entitle
• pattern sp = [-2,NP,][-1,AUX,][1,,to]
• location of source phrase ns = -2
• replacement sentence st = “But he did not sing .”
• location of replacement phrase nt = -3
• label correspondence function f = {(A0,A2), (Ai,Ai) for i ≠ 0}
Semantic Role Mapping Component of Rules
Input Sentence: Mr. Mckinley was entitled to a discount . (gold annotation: A2)
Transformed Sentence (source phrase inserted into the replacement sentence “But he did not sing .”): apply the SRL system, which predicts A0; the label correspondence function maps A0 back to A2.
• Rule:
• predicate p = entitle
• pattern sp = [-2,NP,][-1,AUX,][1,,to]
• location of source phrase ns = -2
• replacement sentence st = “But he did not sing .”
• location of replacement phrase nt = -3
• label correspondence function f = {(A0,A2), (Ai,Ai) for i ≠ 0}
Transforming a Sentence by Using Rules
R is the set of rules learned from training data.
for each phrase p in input sentence s:
    for each rule τ ∈ R:
        if τ applies to p:
            sentence t = transform(τ, p)
            r = semantic role of p in t using the SRL model
            semantic role of p in s = map(τ, r)
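The loop above can be sketched in runnable form as follows. The `Rule` class, its `applies`/`transform` helpers, and the SRL stub are illustrative assumptions, not the authors' implementation; real rules match the structured patterns shown on the previous slides.

```python
class Rule:
    """Toy transformation rule: a context pattern, a simpler
    replacement sentence, and a label-correspondence map."""
    def __init__(self, pattern, replacement, label_map):
        self.pattern = pattern          # context pattern to match in s
        self.replacement = replacement  # simpler sentence with slot "X"
        self.label_map = label_map      # maps roles in t back to roles in s

    def applies(self, phrase, sentence):
        return self.pattern in sentence

    def transform(self, phrase, sentence):
        return self.replacement.replace("X", phrase)

def label_phrases(sentence, phrases, rules, srl_model):
    """Assign a role to each phrase by moving it into a simpler
    context, labeling it there, and mapping the label back."""
    roles = {}
    for p in phrases:
        for rule in rules:
            if rule.applies(p, sentence):
                t = rule.transform(p, sentence)
                r = srl_model(t, p)                  # role of p in t
                roles[p] = rule.label_map.get(r, r)  # map(τ, r)
                break
    return roles
```

For instance, with a rule keyed on the "was entitled to" context and a stub SRL model that labels the subject of the replacement sentence A0, "Mr. Mckinley" is mapped back to A2.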
Learning Transformation Rules
• Input: predicate p, semantic role r
• R ← Get Initial Rules (p, r)
• repeat:
• S ← Expand Rules (R)
• Sort R ∪ S based on accuracy
• R ← top rules in R ∪ S
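This greedy search can be sketched generically; the `expand` and `accuracy` callbacks stand in for the subroutines described on the following slides, and the beam width and iteration count are hypothetical parameters.

```python
def learn_rules(seed_rules, expand, accuracy, beam=10, iters=5):
    """Greedy beam search over rules: start from seed rules,
    repeatedly generate neighbor rules, and keep the top-`beam`
    rules by accuracy on training data. Rules must be hashable."""
    R = list(seed_rules)
    for _ in range(iters):
        S = [n for r in R for n in expand(r)]   # Expand Rules (R)
        pool = sorted(set(R + S), key=accuracy, reverse=True)
        R = pool[:beam]                         # top rules in R ∪ S
    return R
```

With toy integer "rules" where expansion increments a rule and accuracy equals its value, three iterations from seed 0 with beam 2 yield the rules [3, 2].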
Get Initial Rules (entitle, A2)
Input Sentence: Mr. Mckinley was entitled to a discount . (gold role: A2)
Candidate replacement sentences: “But he did not sing .”, “I asked the man .”
Expand Rule (τ)
• Rule τ:
• st = “But he did not sing .”
• nt = -3
• p = ask
• sp = [-1,NP,I][0,VBD,asked][1,NP,man]
• ns = 1
• f = {(A0,A2), (Ai,Ai) for i ≠ 0}
→ does not apply to “He asked the man”
• Neighbor rule of τ (the lexical constraint “I” is dropped from the pattern):
• st = τ.st, nt = τ.nt, p = τ.p, ns = τ.ns, f = τ.f
• sp = [-1,NP,][0,VBD,asked][1,NP,man]
→ applies to “He asked the man”
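The expansion step, generating neighbor rules by dropping one lexical constraint from the pattern, can be sketched as below; the tuple encoding of pattern elements is an illustrative assumption.

```python
def expand(pattern):
    """pattern: list of (offset, POS, word) constraints.
    Each neighbor generalizes the pattern by blanking the word
    of one constraint, so the rule matches more contexts."""
    neighbors = []
    for i, (off, pos, word) in enumerate(pattern):
        if word:  # only lexicalized constraints can be relaxed
            neighbors.append(pattern[:i] + [(off, pos, "")] + pattern[i + 1:])
    return neighbors

# Dropping "I" from [-1,NP,I][0,VBD,asked][1,NP,man] produces
# [-1,NP,][0,VBD,asked][1,NP,man], which now matches "He asked the man".
```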
Calculate Accuracy (τ): example where a rule is correct
Input Sentence: Mr. Mckinley was entitled to a discount . (gold annotation: A2)
Transformed Sentence (source phrase inserted into “But he did not sing .”): apply the SRL system, which predicts A0; f(A0) = A2 matches the gold annotation. Correct!
Calculate Accuracy (τ): example where a rule makes a mistake
Input Sentence: The movie was entitled a big success . (gold annotation: A1)
Transformed Sentence (source phrase inserted into “But he did not sing .”): apply the SRL system, which predicts A0; f(A0) = A2 does not match the gold annotation A1. Wrong.
Outline Overview of Domain Adaptation Overview of Adaptation Using Transformations (ADUT) Transformation Functions Combination Strategy Experimental Results Conclusion
Combination Using Integer Linear Programming
Transformed Sentence 1: Scotty gazed at ugly gray slums .
Transformed Sentence 2: Scotty looked at ugly gray slums .
• Step 1: compute a distribution of scores over labels for each argument candidate
• Our SRL system classifies each phrase as a semantic role
• The system assigns a probability distribution over semantic roles for each argument
• For the same argument in different sentences, compute the average
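The averaging step can be sketched as follows; the probability values are made-up illustrations, not the system's actual scores.

```python
from collections import defaultdict

def average_distributions(dists):
    """dists: list of {role: probability} dicts for one argument,
    one dict per transformed sentence. Returns the mean distribution."""
    summed = defaultdict(float)
    for d in dists:
        for role, p in d.items():
            summed[role] += p
    n = len(dists)
    return {role: s / n for role, s in summed.items()}

avg = average_distributions([
    {"A1": 0.3, "AM-LOC": 0.7},  # e.g. from "Scotty gazed at ..."
    {"A1": 0.9, "AM-LOC": 0.1},  # e.g. from "Scotty looked at ..."
])
# avg ≈ {"A1": 0.6, "AM-LOC": 0.4}
```

Averaging lets confident predictions on well-handled transformed sentences outvote errors on the harder original wording.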
Inference via Integer Linear Programming
Input Sentence: Scotty gazed at ugly gray slums .
Example of a constraint: two arguments cannot overlap.
• Goal: find the most likely semantic role assignment to all arguments without violating the constraints
• Solve an ILP
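A toy version of this constrained inference is sketched below. The real system solves an ILP; exhaustive search over assignments is used here only to make the objective and the non-overlap constraint concrete, and the span/score encoding is an illustrative assumption.

```python
from itertools import product

def overlaps(a, b):
    """Two (start, end) token spans overlap if they share any token."""
    return a[0] < b[1] and b[0] < a[1]

def best_assignment(spans, scores, roles):
    """spans: list of (start, end) candidate argument spans.
    scores[i][role]: score of giving span i that role ("O" = no role).
    Returns the highest-scoring assignment with no overlapping arguments."""
    best, best_score = None, float("-inf")
    for assign in product(roles + ["O"], repeat=len(spans)):
        args = [spans[i] for i, r in enumerate(assign) if r != "O"]
        # constraint: two arguments cannot overlap
        if any(overlaps(a, b) for i, a in enumerate(args) for b in args[i + 1:]):
            continue
        score = sum(scores[i][r] for i, r in enumerate(assign))
        if score > best_score:
            best, best_score = assign, score
    return best
```

For example, if two candidate spans overlap, the search keeps only the higher-scoring one as an argument, exactly the behavior the ILP constraint enforces.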
Outline Overview of Domain Adaptation Overview of Adaptation Using Transformations (ADUT) Transformation Functions Combination Strategy Experimental Results Conclusion
Conclusion
• Current work:
• We proposed a framework for adapting text to yield better SRL analysis
• We showed that adaptation is possible without retraining or unlabeled data
• We showed that simple transformations yield a 13% error reduction for SRL
• Future work:
• Applying the framework to other domains and tasks
• Using unlabeled data to improve transformations
Thank you.