
Semantic Parsing

Explore parsing through Lambda DCS and semantic functions to tackle the search problem for efficient question answering. Utilize fixed-order parsing and imitation learning approaches.


Presentation Transcript


1. Semantic Parsing via Imitation Learning

  2. Outline • Goal - Question Answering • Parsing via Lambda DCS and Semantic Functions • The Search Problem • Fixed Order Parsing • Imitation learning of agenda-based parsers

  3. Goal - Question Answering • Given a question in natural language, return the right answer. • Question: “What city was Abraham Lincoln born in?” • Answer: “Hodgenville” • We are given as inputs: • A) Grammar • B) Knowledge Base • C) Training Set

4. Lambda DCS and Semantic Functions • ROOT: Type.City ⊓ PlaceOfBirth.AbeLincoln • Rules: ENTITY → [LEX], BINARY → [LEX], SET → ENTITY BINARY [JOIN], SET → SET SET [INTERSECTION] • Derivation for "What city abraham lincoln born in?": LEX maps "city" to SET: Type.City, "abraham lincoln" to ENTITY: AbeLincoln, and "born" to BINARY: PlaceOfBirth; JOIN combines the entity and the binary into SET: PlaceOfBirth.AbeLincoln; INTERSECTION combines that with SET: Type.City into the ROOT.

5. We will use the framework of simple λ-DCS to express our logical forms. • Unary predicates: e.g., the unary predicate Cities can be defined as Type.City = {New York, San Francisco, Tveria, …} • Binary predicates: e.g., the binary predicate Couples can be defined as Couples = {<Dana, Yossi>, <Amnon, Shoshi>, …} • We will mostly use two kinds of unary predicates: • Entity: a unary predicate which is a singleton, e.g., AbeLincoln = {Abraham Lincoln} • Set: a unary predicate, e.g., Type.City
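To make these definitions concrete, here is a minimal Python sketch; the toy entities and the tiny PlaceOfBirth relation below are illustrative assumptions, not actual Freebase contents:

```python
# A minimal sketch: unary predicates are sets of entities,
# binary predicates are sets of pairs. Toy data, not the real KB.

# Unary predicate Type.City: the set of all cities in the KB.
TYPE_CITY = {"NewYork", "SanFrancisco", "Tveria", "Hodgenville"}

# An Entity is a singleton unary predicate.
ABE_LINCOLN = {"AbrahamLincoln"}

# Binary predicate PlaceOfBirth: pairs of (place, person).
PLACE_OF_BIRTH = {
    ("Hodgenville", "AbrahamLincoln"),
    ("NewYork", "TheodoreRoosevelt"),
}
```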

6. We can also define a set of operators, such as Join and Intersection. • Join: given a binary (e.g., PlaceOfBirth) and an entity (e.g., AbeLincoln), we can perform a join operation. • The result will be PlaceOfBirth.AbeLincoln, the set of places where Abraham Lincoln was born. • The lambda calculus equivalent: λx. PlaceOfBirth(x, AbrahamLincoln)

7. We can also define a set of operators, such as Join and Intersection. • Intersection: intersection of sets. • E.g., Type.City ⊓ PlaceOfBirth.AbeLincoln • The result will be the city of birth of Abraham Lincoln. • This is equivalent to: λx. Type(x, City) ∧ PlaceOfBirth(x, AbrahamLincoln)
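Continuing the toy sketch above, JOIN and INTERSECTION reduce to simple set operations; this is a hedged illustration of the λ-DCS semantics, not the paper's implementation:

```python
def join(binary, unary):
    """JOIN: b.u = {x : exists y with (x, y) in b and y in u}."""
    return {x for (x, y) in binary if y in unary}

def intersection(u1, u2):
    """INTERSECTION of two unary predicates: plain set intersection."""
    return u1 & u2

# PlaceOfBirth.AbeLincoln -> {"Hodgenville"}
birthplace = join(PLACE_OF_BIRTH, ABE_LINCOLN)

# Type.City ⊓ PlaceOfBirth.AbeLincoln -> {"Hodgenville"}
answer = intersection(TYPE_CITY, birthplace)
```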

8. Inputs - Grammar • Σ - terminals (words), e.g., Σ = {"place", "birth", …} • C - categories, e.g., C = {BINARY, ENTITY, SET, ROOT} • R - a set of rules, e.g.: • SET → ENTITY BINARY [JOIN] • ENTITY → [LEX]
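One possible encoding of such a grammar, as a sketch; the Rule class and its field names are assumptions for illustration:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Rule:
    """A grammar rule: lhs category, rhs categories (empty for lexical
    rules), and the semantic function applied when the rule fires."""
    lhs: str          # e.g. "SET"
    rhs: tuple        # e.g. ("ENTITY", "BINARY"); () for [LEX] rules
    semantics: str    # "JOIN", "INTERSECTION", or "LEX"

GRAMMAR = [
    Rule("ENTITY", (), "LEX"),
    Rule("BINARY", (), "LEX"),
    Rule("SET", ("ENTITY", "BINARY"), "JOIN"),
    Rule("SET", ("SET", "SET"), "INTERSECTION"),
]
```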

9. Inputs – cont. • Training Set: e.g., {("What city was Abraham Lincoln born in?", "Hodgenville"), …} • Knowledge Base: e.g., Freebase. We will use the resulting logical form to query it.

10. Outline • Goal - Question Answering • Parsing via Lambda DCS and Semantic Functions • The Search Problem • Fixed Order Parsing • Paper approach – Imitation learning of agenda-based parsers

11. The Search Problem • LEX(city) = {SET: Type.City, SET: Type.Loc, …}, |LEX(city)| = 362 • LEX(lincoln) = {ENTITY: AbeLincoln, ENTITY: USSLincoln, ENTITY: LincolnTown, …}, |LEX(lincoln)| = 20 • |LEX(abraham)| = 20 • |LEX(in)| = 508 • |LEX(born)| = 391 • Question: "What city abraham lincoln born in?"

12. The Search Problem • Even for a simple question like "What city was Abraham Lincoln born in?", the number of root derivations (i.e., the number of successfully constructed parse trees) might be > 1M. • Can we efficiently search the space of ROOT derivations for the best parse tree?
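A quick sanity check of that claim, multiplying the lexicon sizes from the previous slide; this is a loose upper bound on the purely lexical choices alone, using the numbers shown on slide 11:

```python
# Per-word lexicon sizes from slide 11; their product bounds the number
# of lexical choices, before any JOIN/INTERSECTION combinations.
lex_sizes = {"city": 362, "abraham": 20, "lincoln": 20, "in": 508, "born": 391}

bound = 1
for n in lex_sizes.values():
    bound *= n
print(f"{bound:,}")  # 28,761,334,400 -- so ">1M root derivations" is plausible
```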

  13. Outline • Goal - Question Answering • Parsing via Lambda DCS and Semantic Functions • The Search Problem • Fixed Order Parsing • Imitation learning of agenda-based parsers

14. Tackling the Search Problem • We will present two approaches to tackle the search problem: • Fixed-order parsing using beam search • Imitation learning of an agenda-based parser

  15. Fixed Order Parsing – High level For each sample : • Find ROOT derivation candidates using Beam Search • Update the weights of the model according to the label

16. Parsing using Beam Search, for a beam of size 2. Given a score function S: S(SET: Type.Loc) = 3, S(SET: Type.City) = 5, S(SET: Type.CitiesInCalifornia) = 1. Rules: ENTITY → [LEX], SET → ENTITY BINARY [JOIN], SET → SET SET [INTERSECTION]. Candidates for "city": SET: Type.City, SET: Type.CitiesInCalifornia, SET: Type.Loc; the beam keeps the two highest scoring, SET: Type.City and SET: Type.Loc. Question: "What city abraham lincoln born in?"

17. Parsing using Beam Search, for a beam of size 2. Given a score function S. Rules: ENTITY → [LEX], SET → ENTITY BINARY [JOIN], SET → SET SET [INTERSECTION]. Candidate SET derivations now include: SET: Type.City, SET: PlacesAbrahamVisited, SET: Type.Loc. Question: "What city abraham lincoln born in?"

18. Parsing using Beam Search, for a beam of size 2. Given a score function S: S(ENTITY: AbeLincoln) = 3, S(ENTITY: LincolnTown) = 5, S(ENTITY: USSLincoln) = 1. Rules: ENTITY → [LEX], SET → ENTITY BINARY [JOIN], SET → SET SET [INTERSECTION]. Candidates for "lincoln": ENTITY: AbeLincoln, ENTITY: LincolnTown, ENTITY: USSLincoln; the beam keeps ENTITY: LincolnTown and ENTITY: AbeLincoln. Question: "What city abraham lincoln born in?"

19. Parsing using Beam Search, for a beam of size 2. Given a score function S. Rules: ENTITY → [LEX], SET → ENTITY BINARY [JOIN], SET → SET SET [INTERSECTION]. One surviving combination: SET: PlacesAbrahamVisited. Question: "What city abraham lincoln born in?"

20. Parsing using Beam Search, for a beam of size 2: the surviving ENTITY and SET derivations are combined over the question "What city abraham lincoln born in?".
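The following sketch shows the pruning idea behind slides 16-20 as a CKY-style beam parser. The actual parser also allows more flexible ("floating") derivations; here lexicon, score, apply_rule, and the .category attribute are assumed helpers:

```python
import heapq
import itertools

BEAM_SIZE = 2

def beam_parse(words, lexicon, grammar, score, apply_rule):
    """CKY-style beam search sketch. Assumed helpers: lexicon(word) yields
    candidate derivations for a word; apply_rule(rule, left, right) builds
    the combined derivation; score(d) returns a float; d.category is a
    derivation's category. chart[(i, j)] keeps <= BEAM_SIZE derivations."""
    n = len(words)
    chart = {}
    for i, w in enumerate(words):   # lexical derivations, pruned to the beam
        chart[(i, i + 1)] = heapq.nlargest(BEAM_SIZE, lexicon(w), key=score)
    for length in range(2, n + 1):  # longer spans from adjacent sub-spans
        for i in range(n - length + 1):
            j, cands = i + length, []
            for k in range(i + 1, j):
                for l, r in itertools.product(chart[(i, k)], chart[(k, j)]):
                    for rule in grammar:
                        if rule.rhs == (l.category, r.category):
                            cands.append(apply_rule(rule, l, r))
            chart[(i, j)] = heapq.nlargest(BEAM_SIZE, cands, key=score)
    return chart[(0, n)]            # the surviving full-span derivations
```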

21. Fixed Order Parsing - High level For each sample: • Find ROOT derivation candidates using Beam Search • Update the weights of the model according to the label

22. Fixed Order Parsing Model: denote by d a derivation of utterance x. We want to learn the linear scoring function S(x, d) = θ · φ(x, d) • φ(x, d) - feature vector, extracted in some deterministic way • θ - vector of weights we want to learn

23. Fixed Order Parsing Training: we train the model online. For each training example (xᵢ, yᵢ), denote the model's distribution over root derivations p_θ(d | x) ∝ exp(θ · φ(x, d)), where R(d) indicates whether executing d returns the correct answer yᵢ.

24. Fixed Order Parsing Training: we train the model online. For each training example (xᵢ, yᵢ), denote p_θ(d | x) ∝ exp(θ · φ(x, d)) and maximize the following objective: Σᵢ log Σ_d p_θ(d | xᵢ) R(d). Problem: we only have our beam search result. Define D̃(xᵢ) to be the set of root derivations returned by the beam search.

25. Fixed Order Parsing Training: we train the model online. For each training example (xᵢ, yᵢ), we want to maximize the following objective, restricted to the beam: Σᵢ log Σ_{d ∈ D̃(xᵢ)} p_θ(d | xᵢ) R(d). We can set R(d) to be a continuous value in [0, 1]: R(d) = 1 when executing d returns the correct answer, R(d) = 0 otherwise.
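A sketch of one such online update under the assumptions above, where features holds φ(xᵢ, d) for each beam derivation and rewards holds R(d). The gradient of log Σ_d p_θ(d|x) R(d) is E_q[φ] - E_p[φ], with q(d) ∝ p_θ(d|x) R(d):

```python
import numpy as np

def log_likelihood_update(theta, features, rewards, lr=0.1):
    """One online update for the fixed-order parser (a sketch).
    features: array (K, F), row k is phi(x, d_k) for the k-th beam item.
    rewards: array (K,), R(d_k) in [0, 1]."""
    scores = features @ theta
    p = np.exp(scores - scores.max())
    p /= p.sum()                  # p_theta(d | x) over the beam
    q = p * rewards
    if q.sum() == 0.0:            # no correct derivation on the beam: skip
        return theta
    q /= q.sum()                  # p_theta(d | x, correct answer)
    grad = (q - p) @ features     # E_q[phi] - E_p[phi]
    return theta + lr * grad
```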

  26. Fixed Order Parsing – High level For each sample : • Find ROOT derivation candidates using Beam Search • Update the weights of the model according to the label

  27. Outline • Goal - Question Answering • Parsing via Lambda DCS and Semantic Functions • The Search Problem • Fixed Order Parsing • Imitation learning of agenda-based parsers

28. Agenda Based Parsing Assuming we have an agenda, which prioritizes all possible actions in the derivation process, we will not need to compute every cell in the chart H.

29. Agenda Based Parsing - Initialization Applying LEX to each word of "What city abraham lincoln born in?" produces candidate derivations: • "city" → SET: Type.City, SET: Type.Loc, ENTITY: CityOfBoston, … • "abraham" → ENTITY: AbeLincoln, ENTITY: Abraham, … • "lincoln" → ENTITY: AbeLincoln, ENTITY: LincolnHigh, ENTITY: USSLincoln, … • "born" → BINARY: PlaceOfBirth, ENTITY: YearOfBirth, ENTITY: BirthOfANation, … All of them are pushed onto the agenda: Q = {SET: Type.City, …, ENTITY: AbeLincoln, …, ENTITY: LincolnHigh, …, BINARY: PlaceOfBirth, …}

30. Agenda Based Parsing – Cont. We repeatedly pop the highest-priority derivation from the agenda Q into the chart and push any new combinations back onto Q: • Initially Q = {SET: Type.City, …, ENTITY: AbeLincoln, …, BINARY: PlaceOfBirth, …} • Popping ENTITY: AbeLincoln and BINARY: PlaceOfBirth lets the JOIN rule add SET: PlaceOfBirth.AbeLincoln, so Q = {…, SET: PlaceOfBirth.AbeLincoln, …} • Popping SET: PlaceOfBirth.AbeLincoln and SET: Type.City lets the INTERSECTION rule add SET: PlaceOfBirth.AbeLincoln ⊓ Type.City, so Q = {…, SET: PlaceOfBirth.AbeLincoln ⊓ Type.City, …}

31. Algorithm Once finished, we can just return the highest-scoring ROOT derivation as the parsing result. But how do we learn an agenda?
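A minimal sketch of the agenda loop with a priority queue; combine, priority, and the .category attribute are assumed helpers, and the stopping criterion is simplified:

```python
import heapq
import itertools

def agenda_parse(initial_derivs, combine, priority, max_pops=1000):
    """Agenda-based parsing sketch. Assumed helpers: initial_derivs are the
    LEX derivations for each word; combine(d, chart) yields new derivations
    from combining d with chart items via JOIN/INTERSECTION; priority(d)
    scores agenda items. heapq is a min-heap, so priorities are negated."""
    counter = itertools.count()        # tie-breaker for equal priorities
    agenda = [(-priority(d), next(counter), d) for d in initial_derivs]
    heapq.heapify(agenda)
    chart, roots = [], []
    for _ in range(max_pops):
        if not agenda:
            break
        _, _, d = heapq.heappop(agenda)    # best action on the agenda
        chart.append(d)
        if d.category == "ROOT":
            roots.append(d)
        for new_d in combine(d, chart):    # push the new combinations
            heapq.heappush(agenda, (-priority(new_d), next(counter), new_d))
    return roots
```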

32. Disadvantages of Fixed Order Parsing A disadvantage is that in order to get K root derivations, we have to compute derivations for all spans and categories! In addition, we learn the scoring function on root derivations and apply it to partial derivations!

33. Reinforcement Learning • State: the current chart and agenda. • Action: popping a derivation from the agenda (and pushing the resulting combinations). • History: the sequence of states and actions taken. • Reward: R(h), e.g., whether the resulting root derivation executes to the correct answer.

34. Reinforcement Learning • Our policy π_θ(a|s) is a distribution over the available actions. • Then in state s we choose an action according to our policy; e.g., with action probabilities 1/4, 1/2, 1/4, the middle action is chosen with 50% chance.

35. Reinforcement Learning • Some definitions: • A history is simply a sequence of the states and actions taken: h = (s₁, a₁, s₂, a₂, …, s_T) • We define a distribution over histories: p_θ(h) = Π_t π_θ(a_t | s_t)

36. Reinforcement Learning • Some definitions: • A history is simply a sequence of the states and actions taken: h = (s₁, a₁, s₂, a₂, …, s_T) • We define a distribution over histories: p_θ(h) = Π_t π_θ(a_t | s_t) • And maximize our objective: max_θ Σ_h p_θ(h) R(h)

37. Reinforcement Learning • We define a distribution over histories: p_θ(h) = Π_t π_θ(a_t | s_t) • And maximize our objective: max_θ Σ_h p_θ(h) R(h) • The number of histories is exponentially large, leading to slow convergence → use online learning.
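A sketch of the softmax policy over action features and a single rollout; the env wrapper and its methods are assumptions standing in for the agenda-based parser:

```python
import numpy as np

def softmax_policy(theta):
    """Returns pi_theta(.|s) as a function of the available actions'
    feature matrix (shape (num_actions, num_features))."""
    def pi(action_features):
        scores = action_features @ theta
        p = np.exp(scores - scores.max())
        return p / p.sum()
    return pi

def sample_history(pi, env, rng):
    """Roll out one history h = (s1, a1, ..., sT) under policy pi.
    `env` is an assumed wrapper around the agenda parser exposing
    initial_state(), available_actions(s) -> feature matrix,
    step(s, a), is_terminal(s), and reward(s)."""
    s, history = env.initial_state(), []
    while not env.is_terminal(s):
        feats = env.available_actions(s)
        a = rng.choice(len(feats), p=pi(feats))
        history.append((s, a))
        s = env.step(s, a)
    return history, env.reward(s)
```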

38. Imitation Learning • We maximize our objective: max_θ Σ_h p_θ(h) R(h) • If we find a "good" target history h*, we can perform the following online update: θ ← θ + η R(h*) ∇_θ log p_θ(h*) • η - learning rate • R(h*) - a reward.
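A sketch of that update for the softmax policy above, using the standard log-softmax gradient ∇ log π(a|s) = φ(s, a) - E_π[φ(s, ·)]:

```python
def imitation_update(theta, history, reward, env, lr=0.1):
    """One online update: theta <- theta + lr * R(h*) * grad log p_theta(h*).
    history is a list of (state, action-index) pairs; env is the same
    assumed wrapper used in sample_history."""
    pi = softmax_policy(theta)
    grad = np.zeros_like(theta)
    for s, a in history:
        feats = env.available_actions(s)
        grad += feats[a] - pi(feats) @ feats   # phi(s, a) - E_pi[phi(s, .)]
    return theta + lr * reward * grad
```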

39. Imitation Learning – finding h* • We will use our agenda-based parsing algorithm to generate K root derivations using the policy π_θ. • We denote: d* - the root derivation with the highest reward out of the K. • We will use d* to generate h*.

40. Imitation Learning – local reweighting • d* - the root derivation with the highest reward out of the K. • g(a) – indicates whether action a is a sub-derivation of d*. • The locally reweighted policy: π̃(a|s) ∝ π_θ(a|s) · c^{g(a)}, for some c > 1. • And the probability of a history: p̃(h) = Π_t π̃(a_t | s_t)
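One way to realize this reweighting for a single step's action distribution; the constant c is an illustrative choice, not a value from the paper:

```python
def reweight_step(p, g, c=10.0):
    """Locally reweight one step's action distribution p: multiply the
    probability of each action a with g[a] = 1 (a sub-derivation of d*)
    by a constant c > 1, then renormalize. g is a 0/1 array over actions."""
    p = p * np.power(c, g)
    return p / p.sum()
```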

  41. Imitation Learning – history compression

42. Imitation Learning – history compression • d* - the root derivation with the highest reward out of the K. • For every history h we define (i₁ < … < i_m) as the sequence of indices such that g(a_{i_j}) = 1 for every j. • The compressed history is c(h) = (a_{i₁}, …, a_{i_m}).

43. Imitation Learning - combining • d* - the root derivation with the highest reward out of the K. • The reweighted distribution: p̃(h) = Π_t π̃(a_t | s_t) • We will use our agenda-based parsing algorithm to generate K root derivations using the reweighted policy π̃. • And return the compressed history c(h′) that maximizes the reward.
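Putting the pieces together as a sketch; compress is an assumed helper implementing c(h), and in the full method the rollouts would be drawn under the locally reweighted policy rather than the plain one:

```python
def find_target_history(theta, env, compress, K=10, seed=0):
    """Combined sketch: roll out K histories, keep the one with the
    highest reward, and return its compressed form c(h') as the target
    history for imitation_update. compress(h) keeps only the actions
    a_t with g(a_t) = 1 (assumed helper)."""
    rng = np.random.default_rng(seed)
    pi = softmax_policy(theta)
    best_h, best_r = None, float("-inf")
    for _ in range(K):
        h, r = sample_history(pi, env, rng)
        if r > best_r:
            best_h, best_r = h, r
    return compress(best_h), best_r
```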

  44. Imitation Learning - combining

45.–48. Imitation Learning - Results (result figures from the original slides; their content is not preserved in this transcript)
