Dialogue Modelling
Milica Gašić
Dialogue Systems Group
Dialogue as a Partially Observable Markov Decision Process (POMDP)
[Figure: POMDP graphical model with states s_t, s_{t+1}, system action a_t, observations o_t, o_{t+1}, and reward r_t]
• The state is unobservable and depends on the previous state and action: P(s_{t+1} | s_t, a_t) – the transition probability
• The state is linked to a noisy observation: P(s_t | o_t) – the observation probability
• Action selection (the policy) is based on the distribution over all states at every time step t – the belief state b(s_t)
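As a hedged reference point (a standard textbook form, not taken from these slides), the belief state b_t(s_t) is the posterior over states given everything observed so far, and it can be updated recursively from the two models above. Writing the observation model in the conventional P(o | s) direction, the update is:

```latex
b_{t+1}(s_{t+1}) = P(s_{t+1} \mid o_{1:t+1}, a_{1:t})
\;\propto\; P(o_{t+1} \mid s_{t+1}) \sum_{s_t} P(s_{t+1} \mid s_t, a_t)\, b_t(s_t)
```

The summation over s_t in this update is what the belief tracking slides below refer to.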
Belief propagation
• Probabilities are conditioned on the observations
• We are interested in the marginal probabilities p(x|D), where the evidence splits as D = {Da, Db}
[Figure: node x with evidence Da on one side and Db on the other]
Belief propagation
• Split Db further into Dc and Dd
[Figure: node x with evidence Da, and Db split into Dc and Dd]
Belief propagation
[Figure: nodes a, b, c with associated evidence sets Da, Db, Dc]
Belief propagation
[Figure: nodes a and b with evidence sets Da and Db]
How to track the belief state?
[Figure: the POMDP graphical model again – states s_t, s_{t+1}, system action a_t, observations o_t, o_{t+1}, and reward r_t]
Belief state tracking
[Figure: the POMDP graphical model with the belief update annotated]
• The exact update requires a summation over all possible dialogue states at every dialogue turn – intractable!
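To make the cost concrete, here is a minimal Python sketch of that exact update over an enumerated state space (the state list and the model dictionaries `transition` and `observation` are hypothetical placeholders, not part of any real system); the nested loop is exactly the summation that becomes intractable for realistic dialogue state spaces:

```python
def exact_belief_update(belief, action, obs, states, transition, observation):
    """Exact POMDP belief update: b'(s') ∝ P(obs | s') * sum_s P(s' | s, action) * b(s)."""
    new_belief = {}
    for s_next in states:
        total = 0.0
        for s in states:  # summation over every dialogue state, every turn
            total += transition.get((s, action, s_next), 0.0) * belief.get(s, 0.0)
        new_belief[s_next] = observation.get((s_next, obs), 0.0) * total
    norm = sum(new_belief.values())
    return {s: p / norm for s, p in new_belief.items()} if norm > 0 else new_belief
```

Each turn costs on the order of |S|² operations, which is why the factorisations and approximations below are needed.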
Dialogue state factorisation
• Decompose the state into conditionally independent elements: the user goal g_t, the user action u_t, and the dialogue history d_t
[Figure: dynamic Bayesian network with nodes g_t, g_{t+1} (user goal), u_t, u_{t+1} (user action), d_t, d_{t+1} (dialogue history), system action a_t, observations o_t, o_{t+1}, and reward r_t]
Belief update
[Figure: the factorised network with the update annotated]
• Requires a summation over all possible goals – intractable!
• Requires a summation over all possible histories and user actions – intractable!
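A hedged sketch of the update implied by this factorisation (the exact conditional-independence assumptions vary between formulations, so the conditioning sets here are illustrative):

```latex
b_{t+1}(g_{t+1}, u_{t+1}, d_{t+1}) \;\propto\;
P(o_{t+1} \mid u_{t+1})\, P(u_{t+1} \mid g_{t+1}, a_t)
\sum_{g_t} P(g_{t+1} \mid g_t, a_t)
\sum_{d_t,\, u_t} P(d_{t+1} \mid d_t, u_{t+1}, a_t)\, b_t(g_t, u_t, d_t)
```

The sum over g_t and the sum over (d_t, u_t) are the two intractable summations noted above.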
Hidden Information State system – dialogue acts
"Is there um maybe a cheap place in the centre of town please?"
inform(pricerange=cheap, area=centre)
• Dialogue act type, e.g. inform, request, confirm, …
• Semantics: slots and values, e.g. type=restaurant, food=Chinese, …
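As an illustration, a dialogue act like the one above could be represented with a small data structure (a minimal sketch; the class and field names are made up and are not the HIS system's actual code):

```python
from dataclasses import dataclass, field

@dataclass
class DialogueAct:
    """A dialogue act: an act type plus slot-value semantics."""
    act_type: str                              # e.g. "inform", "request", "confirm"
    slots: dict = field(default_factory=dict)  # e.g. {"pricerange": "cheap"}

    def __str__(self):
        args = ", ".join(f"{s}={v}" for s, v in self.slots.items())
        return f"{self.act_type}({args})"

# "Is there um maybe a cheap place in the centre of town please?"
act = DialogueAct("inform", {"pricerange": "cheap", "area": "centre"})
print(act)  # inform(pricerange=cheap, area=centre)
```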
Hidden Information State system – belief update
• Only the user acts from the N-best list are considered
• Dialogue histories take a small number of values
• Goals are grouped into partitions
• All probabilities are handcrafted
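With these approximations, the update takes roughly the following shape, with partitions p standing in for individual goals and the sum over user acts restricted to the N-best list (a hedged sketch in the spirit of the published HIS formulation, not copied from the slides):

```latex
b_{t+1}(p_{t+1}, u_{t+1}, d_{t+1}) \;\propto\;
P(o_{t+1} \mid u_{t+1})\, P(u_{t+1} \mid p_{t+1}, a_t)\, P(p_{t+1} \mid p_t)
\sum_{d_t} P(d_{t+1} \mid d_t, p_{t+1}, u_{t+1}, a_t)\, b_t(p_t, d_t)
```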
Dialogue history in the HIS system
• The dialogue history ideally represents everything that has happened in the dialogue so far
• History states: system informed, user informed, user requested, system requested – one set for each concept in the dialogue
• Each flag is either 1 or 0, and its updates are defined by a finite state automaton (see the sketch below)
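A minimal Python sketch of this kind of history tracking (the flag names mirror the bullets above; the update rules are illustrative assumptions rather than the HIS system's exact automaton):

```python
def update_history(history, speaker, act_type, slots):
    """Set binary history flags for each concept mentioned in a dialogue act.

    history: dict concept -> dict of 0/1 flags
    speaker: "system" or "user"
    """
    for concept in slots:
        flags = history.setdefault(concept, {
            "system_informed": 0, "user_informed": 0,
            "system_requested": 0, "user_requested": 0,
        })
        if act_type == "inform":
            flags[speaker + "_informed"] = 1
        elif act_type == "request":
            flags[speaker + "_requested"] = 1
    return history

history = update_history({}, "user", "inform", {"area": "centre"})
print(history["area"]["user_informed"])  # 1
```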
HIS partitions
• Represent groups of (the most probable) goals
• Dynamically built during the dialogue
• The goal transition probability P(g_{t+1} | g_t, a_t) is set to a high value if g_{t+1} is consistent with g_t and a_t, and otherwise to a small value
HIS partitions – example
System: How may I help you?  →  request(task)
User: I'd like a restaurant in the centre.  →  inform(entity=venue, type=restaurant, area=centre)
[Figure: partition tree – the entity node is split on entity=venue vs !venue, then on type=restaurant vs !restaurant, then on area=central vs !central, giving partitions such as (venue, restaurant, central) and (venue, restaurant, !central)]
Pruning
[Figure: the same partition tree with probabilities attached to the splits, e.g. entity=venue 0.9, type=restaurant 0.2, area=central 0.5]
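A hedged Python sketch of the pruning idea – keep only the most probable partitions and renormalise (the real system's pruning and recombination is more involved; this is only illustrative):

```python
def prune_partitions(partition_beliefs, max_partitions=20):
    """Keep the most probable partitions and renormalise their beliefs."""
    ranked = sorted(partition_beliefs.items(), key=lambda kv: kv[1], reverse=True)
    kept = dict(ranked[:max_partitions])
    norm = sum(kept.values())
    return {p: b / norm for p, b in kept.items()} if norm > 0 else kept

beliefs = {"venue, restaurant, central": 0.5, "venue, restaurant, !central": 0.2,
           "venue, !restaurant": 0.2, "!venue": 0.1}
print(prune_partitions(beliefs, max_partitions=2))
```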
Hidden Information State systems
Any limitations?
Bayesian network model for dialogue
[Figure: the factorised dynamic Bayesian network with the goal, history, and user-action nodes split per slot, e.g. g_t^food, g_t^area, d_t^food, d_t^area, u_t^food, u_t^area, together with a_t, o_t, o_{t+1}, and r_t]
Belief tracking
• For each node x:
• Start from one side and keep computing p(x|Da)
• Then start from the other end and keep computing p(Db|x)
• To get the marginal, simply multiply these two and normalise (a small numerical sketch follows below)
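A hedged numerical sketch of this two-pass message passing on a simple chain (a generic HMM-style chain rather than the dialogue network itself; all numbers are invented for illustration):

```python
import numpy as np

def chain_marginals(prior, trans, likes):
    """Exact marginals p(x_t | all evidence) on a chain via two message passes.

    prior: (K,) distribution over the first node
    trans: (K, K) transition matrix, trans[i, j] = p(x_{t+1}=j | x_t=i)
    likes: (T, K) evidence likelihoods, likes[t, k] = p(D_t | x_t=k)
    """
    T, K = likes.shape
    fwd = np.zeros((T, K))  # messages from one side: p(x_t, evidence up to t)
    bwd = np.ones((T, K))   # messages from the other end: p(later evidence | x_t)
    fwd[0] = prior * likes[0]
    for t in range(1, T):
        fwd[t] = likes[t] * (fwd[t - 1] @ trans)
    for t in range(T - 2, -1, -1):
        bwd[t] = trans @ (likes[t + 1] * bwd[t + 1])
    marg = fwd * bwd                               # multiply the two messages ...
    return marg / marg.sum(axis=1, keepdims=True)  # ... and normalise

# Toy chain of three binary nodes with evidence attached to each
prior = np.array([0.5, 0.5])
trans = np.array([[0.9, 0.1], [0.2, 0.8]])
likes = np.array([[0.8, 0.2], [0.5, 0.5], [0.1, 0.9]])
print(chain_marginals(prior, trans, likes))
```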
Bayesian network model for dialogue
[Figure: the same per-slot network, now with its conditional probabilities governed by parameters θ]
Training the policy using different parameters
• The policy is trained using reinforcement learning (explained in the next lecture)
• It is evaluated at different error rates in the user input, measured by average reward