
Starting from Scratch in Semantic Role Labeling




Presentation Transcript


  1. Starting from Scratch in Semantic Role Labeling Michael Connor, Yael Gertner, Cynthia Fisher, Dan Roth

  2. How do we acquire language? • Topid rivvo den marplox.

  3. The language-world mapping problem: mapping “the language” onto “the world” [Topid rivvo den marplox.]

  4. Observe how words are distributed across situations. Scene 1: Smur! Rivvo della frowler. Topid rivvo den marplox. Scene 3: Blert dor marplox, arno. Scene n: Marplox dorinda blicket.

  5. Structure-mapping: A proposed starting point for syntactic bootstrapping [Johanna rivvo den sheep.] • Children can learn the meanings of some nouns via cross-situational observation alone [Fisher, 1996; Gillette, Gleitman, Gleitman, & Lederer, 1999; Snedeker & Gleitman, 2005] • But how do they learn the meanings of verbs? • Sentence comprehension is grounded by the acquisition of an initial set of concrete nouns • Once identified, these nouns yield a skeletal sentence structure: candidate arguments, a cue to the sentence's semantic predicate-argument structure • Represent the sentence in an abstract form that permits generalization to new verbs

  6. Strong Predictions [Gertner & Fisher, 2006] • Test 21-month-olds on assigning arguments with novel verbs (preferential looking paradigm) • How the order of nouns influences interpretation, transitive vs. intransitive: Agent-first intransitive: The boy and the girl are daxing! Transitive: The boy is daxing the girl! Agent-last intransitive: The girl and the boy are daxing! • The error disappears by 25 months

  7. Current Project: BabySRL • A realistic computational model for syntactic bootstrapping via structure-mapping: • Verb meanings are learned via their syntactic argument-taking roles • Semantic feedback improves the syntactic and meaning representations • Develop a semantic role labeling system (BabySRL) to experiment with theories of early language acquisition • SRL as a minimal level of language understanding • Determine who does what to whom • Inputs and knowledge sources: only those we can defend children have access to

  8. BabySRL: Key Components • Representation: • Theoretically motivated representation of the input • Shallow, abstract, sentence representation consisting of • # of nouns in the sentence • Noun Patterns (1st of two nouns) • Relative position of nouns and predicates • Learning: • Guided by knowledge kids have • Classify words by part-of-speech • Identify arguments and predicates • Determine the role arguments take

  9. BabySRL: Early Results [Connor et al., CoNLL'08, '09] • Fine-grained experiments with how language is represented • Test different levels of representation • Primary focus on the noun pattern (NPattern) feature • Hypothesis: the number and order of nouns are important • Once we know some nouns, we can use them to represent structure • NPattern gives count and placement: first of two, second of three, etc. • Alternative: Verb Position • Target argument is before or after the verb • Key finding: NPattern reproduces errors seen in children • Promotes the A0-A1 interpretation in transitive, but also intransitive, sentences • Verb position does not make this error • Incorporating it recovers the correct interpretation • But: done with manually labeled data • Feedback varies

  10. BabySRL: Key Components • Representation: • Theoretically motivated representation of the input • Shallow, abstract, sentence representation consisting of • # of nouns in the sentence • Noun Patterns (1st of two nouns) • Relative position of nouns and predicates • Learning: • Guided only by knowledge kids have • Classify words by part-of-speech • Identify arguments and predicates • Determine the role arguments take

  11. This work: Minimally Supervised BabySRL • Goal: Unsupervised “parsing” for identifying arguments • Provide only limited prior knowledge & high-level semantic feedback • Defensible from psycholinguistic evidence • Overview • Unsupervised Parsing • Identifying part-of-speech states • Argument Identification • Identify argument states • Identify predicate states • Argument Role Classification • Labeled training using unsupervised arguments • Results and comparison to child experiments

  12. BabySRL Overview • Traditional Approach • Parse input • Identify Arguments • Classify Arguments • Global inference over arguments • Each stage has its own classifier/knowledge source • Finely labeled training data throughout • Example: She always has salad. → [NP She] [VP always [V has]] [NP salad] → [A0 She] … [A1 salad]

  13. BabySRL Overview • Unsupervised Approach • Unsupervised HMM • Identify argument states • Classify arguments • No global inference • Labeled training: only for the argument classifier • The rest is driven by simple background knowledge • Example: She/46 always/48 has/26 salad/74 ./2 → She = N (A0), has = V, salad = N (A1)
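To make the unsupervised pipeline concrete, here is a minimal end-to-end sketch in Python. The function name and the stand-in role rule are illustrative assumptions, not the paper's implementation; the state ids come from the slide's example.

```python
def baby_srl(words, states, arg_states, pred_states, classify_role):
    """Given an HMM state for each word, pick out argument and predicate
    positions, then assign a semantic role to each argument."""
    args = [i for i, s in enumerate(states) if s in arg_states]
    preds = [i for i, s in enumerate(states) if s in pred_states]
    pred = preds[0] if preds else None   # assume one predicate per sentence
    return {words[i]: classify_role(words, args, pred, i) for i in args}

# Toy usage mirroring the slide's example; the lambda is a stand-in
# verb-position rule, not the learned classifier.
words = ["She", "always", "has", "salad", "."]
states = [46, 48, 26, 74, 2]
print(baby_srl(words, states, arg_states={46, 74}, pred_states={26},
               classify_role=lambda w, a, p, i: "A0" if i < p else "A1"))
# -> {'She': 'A0', 'salad': 'A1'}
```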

  14. Unsupervised Parsing • We want to generate a representation that permits generalization over word forms • Incorporate distributional similarity • Context sensitive • Hidden Markov Model (HMM) • Simple model • Essentially provides part-of-speech information, but without names for the states; we need to figure those out • Train on child-directed speech • CHILDES repository • Around 1 million words, across multiple children

  15. Unsupervised Parsing (II) • Standard way to train an unsupervised HMM • Simple EM produces uniform-size clusters • Solution: include priors for sparsity • Dirichlet prior (Variational Bayes, VB) • Or replace the prior with psycholinguistically plausible knowledge: knowledge of function words • Function and content words have different statistics • Evidence that even newborns can make this distinction • We don't use prosody, but it may provide this • Technically: allocate a number of HMM states to function words and leave the rest for the remaining words (see the sketch below) • Done before parameter estimation; can be combined with EM or VB learning: EM+Func, VB+Func
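A minimal sketch of the function-word pre-clustering step, assuming a plain multinomial HMM; the state counts and random initialization are illustrative choices, not the paper's exact settings. The key idea is that zeroed emission cells stay zero under EM/VB updates, so the state split survives training.

```python
import numpy as np

def init_hmm_with_function_words(vocab, function_words, n_states=80, n_func_states=30):
    """Constrain the HMM emission matrix before parameter estimation: the
    first n_func_states states may emit only function words, the remaining
    states only content words (assumes vocab contains both kinds)."""
    rng = np.random.default_rng(0)
    emit = rng.random((n_states, len(vocab)))
    is_func = np.array([w in function_words for w in vocab])
    emit[:n_func_states, ~is_func] = 0.0   # function-word states: no content words
    emit[n_func_states:, is_func] = 0.0    # content-word states: no function words
    emit /= emit.sum(axis=1, keepdims=True)
    trans = rng.random((n_states, n_states))
    trans /= trans.sum(axis=1, keepdims=True)
    return trans, emit   # starting point for Baum-Welch (EM) or VB training
```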

  16. Unsupervised Parsing Evaluation • Test as unsupervised POS tagging on a subset of hand-corrected CHILDES data • Incorporating function-word pre-clustering allows both EM & VB to achieve the same performance with an order of magnitude fewer training sentences [Plot: variation of information (lower is better) vs. number of training sentences. EM: HMM trained with EM. VB: HMM trained with Variational Bayes & a Dirichlet prior. EM+Funct, VB+Funct: the same training methods with function-word pre-clustering.]
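The metric plotted is variation of information between the induced clustering and the gold POS tags, VI(X;Y) = H(X) + H(Y) - 2·I(X;Y), where lower is better. A self-contained sketch of the computation:

```python
from collections import Counter
from math import log2

def variation_of_information(pred_tags, gold_tags):
    """pred_tags and gold_tags are parallel lists over the same tokens.
    Returns H(X) + H(Y) - 2*I(X;Y) in bits; 0 means identical clusterings."""
    n = len(pred_tags)
    px, py = Counter(pred_tags), Counter(gold_tags)
    pxy = Counter(zip(pred_tags, gold_tags))
    hx = -sum(c / n * log2(c / n) for c in px.values())
    hy = -sum(c / n * log2(c / n) for c in py.values())
    mi = sum(c / n * log2((c / n) / ((px[x] / n) * (py[y] / n)))
             for (x, y), c in pxy.items())
    return hx + hy - 2 * mi
```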

  17. Argument Identification • Now we have a parser that gives us the state (cluster) each word belongs to • Next: identify the states that correspond to arguments & predicates • Knowledge: we provide a list of frequent nouns • As few as 10 nouns cover 60% of noun occurrences • Mostly pronouns • A lot of evidence that children know and recognize nouns early on • Algorithm: states that appear with known nouns over half the time are treated as argument states (see the sketch after the example below) • Assumes that nouns = arguments

  18. Argument Identification • Knowledge: Frequent nouns: You, it, I, what, he, me, ya, she, we, her, him, who, Ursula, Daddy, Fraser, baby, something, head, chair, lunch, … [Figure: list of words that occurred with state 46 in CHILDES]
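A sketch of the argument-state rule from slide 17, assuming the HMM-tagged corpus is available as (word, state) pairs; the helper name and threshold parameterization are illustrative:

```python
from collections import Counter

def find_argument_states(tagged_corpus, seed_nouns, threshold=0.5):
    """tagged_corpus: sentences as lists of (word, hmm_state) pairs.
    A state becomes an argument state if known seed nouns account for
    more than `threshold` of its occurrences ('over half the time')."""
    totals, noun_hits = Counter(), Counter()
    for sentence in tagged_corpus:
        for word, state in sentence:
            totals[state] += 1
            if word in seed_nouns:
                noun_hits[state] += 1
    return {s for s in totals if noun_hits[s] / totals[s] > threshold}
```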

  19. Predicate Identification • Nouns are concrete and can be identified; predicates are more difficult • Not learned easily via cross-situational observation • Structure-mapping account: sentence comprehension is grounded in the learning of an initial set of nouns • Verbs are identified based on their argument-taking behavior • Algorithm: identify predicates as those states that tend to appear with a given number of arguments (see the sketch after the next slide) • Assume one predicate per sentence

  20. Predicate Identification • State 48: P(1 argument) = 0.2, P(2 arguments) = 0.4, P(3 arguments) = 0.3, … • State 26: P(1 argument) = 0.1, P(2 arguments) = 0.6, P(3 arguments) = 0.2, …
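One plausible reading of this rule as code, under the assumptions above: estimate, for each non-argument state, the distribution over the number of arguments in sentences where it appears, and keep states whose distribution is peaked on a single count. The `peak` threshold is an illustrative assumption, not the paper's exact criterion.

```python
from collections import Counter, defaultdict

def find_predicate_states(tagged_corpus, arg_states, peak=0.5):
    """tagged_corpus: sentences as lists of (word, hmm_state) pairs.
    Keeps states concentrated on one argument count, e.g. state 26
    above with P(2 arguments) = 0.6."""
    dist = defaultdict(Counter)
    for sentence in tagged_corpus:
        n_args = sum(1 for _, s in sentence if s in arg_states)
        for state in {s for _, s in sentence if s not in arg_states}:
            dist[state][n_args] += 1
    return {s for s, c in dist.items()
            if max(c.values()) / sum(c.values()) >= peak}
```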

  21. Argument Identification Results • Test against hand-labeled argument boundaries on CHILDES child-directed speech • Vary the number of seed nouns; this has implications for feature quality • Differences between the parsing methods disappear • Argument identification: good (best with EM+Funct) • Predicate identification: bad

  22. Finally: BabySRL Experiments • Given potential arguments & predicates, train the argument classifier • Given an abstract representation of an argument and predicate, determine its role (Agent, Patient, etc.) • To train, apply true labels to the noisily identified arguments • Roles are relative to the predicate • Regularized perceptron • Abstract representations (features) considered (sketch below): • (1) Lexical: target noun and target predicate • (2) NounPattern: first of two, second of three, etc.; depends only on the number and order of nouns • In “She always has salad”: `She` is first of two, `salad` is second of two • (3) VerbPosition: `She` is before the verb, `salad` is after the verb
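A feature-extraction sketch covering the three representations; the feature-name strings are illustrative, but each feature follows the slide's description:

```python
def argument_features(words, arg_positions, pred_position, target):
    """Features for one target argument, given the positions of all
    identified arguments and of the (possibly missing) predicate."""
    i = arg_positions.index(target)
    n = len(arg_positions)
    feats = {f"word={words[target]}",        # (1) lexical: target noun
             f"npattern={i + 1}_of_{n}"}     # (2) NounPattern: e.g. first of two
    if pred_position is not None:
        feats.add(f"pred={words[pred_position]}")        # (1) lexical: predicate
        feats.add("before_verb" if target < pred_position
                  else "after_verb")                     # (3) VerbPosition
    return feats

# "She always has salad": `She` is first of two and before the verb.
print(sorted(argument_features(["She", "always", "has", "salad"], [0, 3], 2, 0)))
# -> ['before_verb', 'npattern=1_of_2', 'pred=has', 'word=She']
```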

  23. BabySRL Experiments: Test Data • Unlike the previous experiments (CHILDES), here we compare to psycholinguistic data • Evaluate on constructed two-noun sentences with novel verbs (sketch below) • Test two-noun transitive vs. intransitive sentences: “A krads B” vs. “A and B krads.” • A and B are filled with known nouns • Test generalization to an unknown verb • Reproduces experiments on young children • At 21 months of age, children make a mistake: interpreting A as agent and B as patient in both cases • Hypothesis: children (at this stage) represent sentences in terms of the number and order of nouns, not the position of the verb
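A small sketch of how the test items and the %A0A1 measure could be constructed; the templates come from the slide, while the function names and exact item enumeration are assumptions:

```python
def make_test_items(nouns, novel_verb="krads"):
    """Build the two test conditions: transitive 'A krads B' vs.
    intransitive 'A and B krads', with A, B drawn from known nouns."""
    items = []
    for a in nouns:
        for b in nouns:
            if a != b:
                items.append((f"{a} {novel_verb} {b} .", "transitive"))
                items.append((f"{a} and {b} {novel_verb} .", "intransitive"))
    return items

def percent_a0a1(predicted_roles):
    """predicted_roles: list of (role_of_first_noun, role_of_second_noun).
    %A0A1 = share of items given the agent-first interpretation."""
    hits = sum(1 for pair in predicted_roles if pair == ("A0", "A1"))
    return 100.0 * hits / len(predicted_roles)
```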

  24. BabySRL Experiments: Learns Agent First, Patient Second • NounPat features promote the Agent-Patient (A0-A1) interpretation for both transitive (correct) and intransitive (incorrect) sentences • VerbPos pushes the intransitive case in the other direction; this works with gold training • With the noisy representation, the model reproduces the error on intransitives and does not recover even when VerbPos is available [Plots: %A0A1 vs. parsing algorithm for transitive and intransitive sentences, 10 seed nouns]

  25. Summary • BabySRL: a realistic computational model for verb meaning acquisition via the structure-mapping theory • Representational issues • Unsupervised learning driven by minimal plausible knowledge sources • Even with noisy parsing and argument identification, the model learns the abstract rule: agent first, patient second • The difficulty of identifying predicates harms the usefulness of the superior representation (VerbPos) • Reproduces errors seen in children • Next steps: • Use correct high-level semantic feedback to improve earlier identification and parsing decisions, and thereby the VerbPos feature • Relax the correctness of the semantic feedback • Thank You

  26. How do we acquire language? [Figure: “The Dog kradz the balls,” with the known nouns “Dog” and “Balls” identified]

  27. BabySRL Experiments • Even with VerbPosition available, with a noisy parse the model makes errors on intransitive sentences [Plots: transitive vs. intransitive results, with 10 seed nouns and 365 seed nouns]
