FSA and HMM LING 572 Fei Xia 1/5/06
Outline • FSA • HMM • Relation between FSA and HMM
Definition of FSA
An FSA is a tuple (Q, Σ, I, F, δ):
• Q: a finite set of states
• Σ: a finite set of input symbols
• I: the set of initial states
• F: the set of final states
• δ ⊆ Q × (Σ ∪ {ε}) × Q: the transition relation between states.
An example of FSA
[Figure: an FSA with two states, q0 and q1, and arcs labeled a and b]
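To make the definition concrete, here is a minimal sketch of string acceptance for an FSA without ε-arcs. The arc layout (q0 goes to q1 on a, q1 loops on b) is an assumption, since only the state and arc labels of the example figure are recoverable; the names `accepts`, `DELTA`, `INITIAL`, and `FINAL` are illustrative.

```python
from typing import Dict, Set, Tuple

# Transition relation as a map from (state, symbol) to the set of next states.
Transitions = Dict[Tuple[str, str], Set[str]]


def accepts(delta: Transitions, initial: Set[str], final: Set[str], s: str) -> bool:
    """Return True if some path from an initial state to a final state reads s.

    Assumes the automaton has no epsilon arcs.
    """
    current = set(initial)
    for symbol in s:
        # Follow every arc labeled `symbol` out of every currently active state.
        current = {nxt for q in current for nxt in delta.get((q, symbol), set())}
        if not current:
            return False
    return bool(current & final)


# Assumed layout: q0 --a--> q1, q1 --b--> q1, with q0 initial and q1 final.
DELTA: Transitions = {("q0", "a"): {"q1"}, ("q1", "b"): {"q1"}}
INITIAL = {"q0"}
FINAL = {"q1"}

if __name__ == "__main__":
    for word in ["a", "abb", "", "ba"]:
        print(repr(word), accepts(DELTA, INITIAL, FINAL, word))
```

Tracking a set of active states handles nondeterminism directly, so no determinization step is needed for this check.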
Definition of FST
An FST is a tuple (Q, Σ, Γ, I, F, δ):
• Q: a finite set of states
• Σ: a finite set of input symbols
• Γ: a finite set of output symbols
• I: the set of initial states
• F: the set of final states
• δ ⊆ Q × (Σ ∪ {ε}) × (Γ ∪ {ε}) × Q: the transition relation between states.
An FSA can be seen as a special case of an FST.
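The extended transition relation $\delta^*$ is the smallest set such that
• $(q, \epsilon, \epsilon, q) \in \delta^*$ for every $q \in Q$
• if $(q_1, x, y, q_2) \in \delta^*$ and $(q_2, a, b, q_3) \in \delta$, then $(q_1, xa, yb, q_3) \in \delta^*$
T transduces a string x into a string y if there exists a path from an initial state to a final state whose input is x and whose output is y:
$x[T]y \iff \exists\, i \in I,\ f \in F:\ (i, x, y, f) \in \delta^*$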
An example of FST
[Figure: an FST with two states, q0 and q1, and arcs labeled a:x and b:y]
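Operations on FSTs
• Union: $T_1 \cup T_2$ transduces x into y if $T_1$ or $T_2$ transduces x into y.
• Concatenation: $T_1 \cdot T_2$ transduces $x_1 x_2$ into $y_1 y_2$ if $T_1$ transduces $x_1$ into $y_1$ and $T_2$ transduces $x_2$ into $y_2$.
• Composition: $T_1 \circ T_2$ transduces x into z if there is a string y such that $T_1$ transduces x into y and $T_2$ transduces y into z.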
An example of composition operation
[Figure: an FST with states q0 and q1 and arcs a:x and b:y, composed with a one-state FST (state q0) with arcs x:ε and y:z]
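The effect of composition on a single input can be sketched by feeding the outputs of the first transducer into the second. This does not build the composed machine; it only applies the two machines in sequence. The arc layouts below are assumptions based on the arc labels in the example (first FST: q0 goes to q1 on a:x, q1 loops on b:y; second FST: q0 loops on x:ε and y:z), and the code assumes no arc has an ε input.

```python
from typing import List, Set, Tuple

EPS = ""  # epsilon output, represented as the empty string

# An arc is (source state, input symbol, output string, destination state).
Arc = Tuple[str, str, str, str]
# An FST is (arcs, initial states, final states).
FST = Tuple[List[Arc], Set[str], Set[str]]


def transduce(arcs: List[Arc], initial: Set[str], final: Set[str], x: str) -> Set[str]:
    """Return every output string the FST can produce for input x.

    Assumes no arc has an epsilon input, so each input symbol takes one arc.
    """
    configs = {(q, "") for q in initial}  # (current state, output so far)
    for symbol in x:
        configs = {
            (dst, out + o)
            for (q, out) in configs
            for (src, i, o, dst) in arcs
            if src == q and i == symbol
        }
    return {out for (q, out) in configs if q in final}


def compose_apply(t1: FST, t2: FST, x: str) -> Set[str]:
    """Outputs z such that t1 transduces x into some y and t2 transduces y into z."""
    return {z for y in transduce(*t1, x) for z in transduce(*t2, y)}


# Assumed layouts for the two machines in the example.
T1: FST = ([("q0", "a", "x", "q1"), ("q1", "b", "y", "q1")], {"q0"}, {"q1"})
T2: FST = ([("q0", "x", EPS, "q0"), ("q0", "y", "z", "q0")], {"q0"}, {"q0"})

if __name__ == "__main__":
    print(compose_apply(T1, T2, "abb"))
```

Under these assumptions the first machine rewrites a b^n as x y^n, and the second deletes x and rewrites each y as z.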
Probabilistic finite-state automata (PFA)
• Informally, in a PFA, each arc is associated with a probability.
• The probability of a path is the product of the probabilities of the arcs on the path.
• The probability of a string x is the sum of the probabilities of all the paths for x.
• Tasks:
  • Given a string x, find the best path for x.
  • Given a string x, find the probability of x in a PFA.
  • Find the string with the highest probability in a PFA.
  • …
Formal definition of PFA
A PFA is a tuple (Q, Σ, I, F, δ, P):
• Q: a finite set of N states
• Σ: a finite set of input symbols
• I: Q → R⁺ (initial-state probabilities)
• F: Q → R⁺ (final-state probabilities)
• δ ⊆ Q × Σ × Q: the transition relation between states.
• P: δ → R⁺ (transition probabilities)
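Constraints on the functions:
• $\sum_{q \in Q} I(q) = 1$
• $\forall q \in Q:\ F(q) + \sum_{a \in \Sigma,\ q' \in Q} P(q, a, q') = 1$
Probability of a string:
$P(x \mid A) = \sum_{\theta \in \Theta_A(x)} P(\theta)$, where $\Theta_A(x)$ is the set of paths for x in A, and for a path $\theta = (r_0, x_1, r_1, \dots, x_k, r_k)$, $P(\theta) = I(r_0) \cdot \prod_{j=1}^{k} P(r_{j-1}, x_j, r_j) \cdot F(r_k)$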
Consistency of a PFA
Let A be a PFA.
• Def: P(x | A) = the sum of the probabilities of all the valid paths for x in A.
• Def: a valid path in A is a path for some string x with probability greater than 0.
• Def: A is called consistent if $\sum_{x \in \Sigma^*} P(x \mid A) = 1$.
• Def: a state of a PFA is useful if it appears in at least one valid path.
• Proposition: a PFA is consistent if all its states are useful.
Q1 of Hw1
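An example of PFA
[Figure: a PFA with states q0 (final probability 0) and q1 (final probability 0.2), and arcs a:1 and b:0.8]
I(q0)=1.0, I(q1)=0.0
$P(ab^n) = 0.2 \times 0.8^n$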
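The string probability in this example can be checked with a short sketch that sums over paths one symbol at a time. The arc layout (q0 goes to q1 on a with probability 1, q1 loops on b with probability 0.8) is an assumption, chosen because it is the layout that yields P(ab^n) = 0.2 × 0.8^n; the function and variable names are illustrative.

```python
from typing import Dict, Tuple

# Arcs map (source state, symbol, destination state) to a transition probability.
Arcs = Dict[Tuple[str, str, str], float]


def string_probability(init: Dict[str, float], final: Dict[str, float],
                       arcs: Arcs, x: str) -> float:
    """Sum of the probabilities of all paths for x (initial * arcs * final)."""
    # alpha[q]: total probability of reading the prefix so far and being in q.
    alpha = dict(init)
    for symbol in x:
        nxt: Dict[str, float] = {}
        for (src, a, dst), p in arcs.items():
            if a == symbol and src in alpha:
                nxt[dst] = nxt.get(dst, 0.0) + alpha[src] * p
        alpha = nxt
    return sum(p * final.get(q, 0.0) for q, p in alpha.items())


# Values from the example; the arc endpoints are assumed.
INIT = {"q0": 1.0, "q1": 0.0}
FINAL = {"q0": 0.0, "q1": 0.2}
ARCS: Arcs = {("q0", "a", "q1"): 1.0, ("q1", "b", "q1"): 0.8}

if __name__ == "__main__":
    for n in range(4):
        word = "a" + "b" * n
        print(word, string_probability(INIT, FINAL, ARCS, word), 0.2 * 0.8 ** n)
```

Summing the probabilities of a, ab, abb, … gives a geometric series that converges to 1, which is what consistency requires for this automaton.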
Weighted finite-state automata (WFA)
• Each arc is associated with a weight.
• “Sum” and “multiplication” can be given other meanings (e.g., max in place of sum).
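One way to see what "other meanings" buys is to parameterize the string-weight computation by the two operations. The sketch below compares ordinary sum/product with max/product on a small hypothetical WFA; the `Semiring` type, the automaton, and its weights are all illustrative assumptions, not part of the slides.

```python
from typing import Callable, Dict, NamedTuple, Tuple


class Semiring(NamedTuple):
    """The two operations of a WFA together with their identity elements."""
    plus: Callable[[float, float], float]
    times: Callable[[float, float], float]
    zero: float
    one: float


# Ordinary probabilities: add over paths, multiply along a path.
PROB = Semiring(lambda a, b: a + b, lambda a, b: a * b, 0.0, 1.0)
# Best-path weights: take the max over paths, multiply along a path.
BEST = Semiring(max, lambda a, b: a * b, 0.0, 1.0)

Arcs = Dict[Tuple[str, str, str], float]


def string_weight(sr: Semiring, init: Dict[str, float], final: Dict[str, float],
                  arcs: Arcs, x: str) -> float:
    """Combine the weights of all paths for x using the operations in sr."""
    alpha = dict(init)
    for symbol in x:
        nxt: Dict[str, float] = {}
        for (src, a, dst), w in arcs.items():
            if a == symbol and src in alpha:
                nxt[dst] = sr.plus(nxt.get(dst, sr.zero), sr.times(alpha[src], w))
        alpha = nxt
    total = sr.zero
    for q, w in alpha.items():
        total = sr.plus(total, sr.times(w, final.get(q, sr.zero)))
    return total


# Hypothetical WFA with two distinct paths for the string "a".
INIT = {"q0": 1.0}
FINAL = {"q1": 0.5, "q2": 1.0}
ARCS: Arcs = {("q0", "a", "q1"): 0.6, ("q0", "a", "q2"): 0.4}

if __name__ == "__main__":
    print(string_weight(PROB, INIT, FINAL, ARCS, "a"))  # sum over both paths
    print(string_weight(BEST, INIT, FINAL, ARCS, "a"))  # weight of the best path
```

The same loop computes a string probability or a best-path score depending only on which pair of operations is passed in.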
Two types of HMMs
• State-emission HMM (Moore machine):
  • The emission probability depends only on the state (from-state or to-state).
• Arc-emission HMM (Mealy machine):
  • The emission probability depends on the (from-state, to-state) pair.
State-emission HMM
[Figure: states s1, s2, …, sN, each emitting output symbols such as w1, w3, w4, w5]
• Two kinds of parameters:
  • Transition probability: $P(s_j \mid s_i)$
  • Output (emission) probability: $P(w_k \mid s_i)$
• # of parameters: $O(NM + N^2)$
Arc-emission HMM
[Figure: states s1, s2, …, sN, with output symbols such as w1, …, w5 on the arcs between states]
• Same kinds of parameters, but the emission probabilities depend on both states: $P(w_k, s_j \mid s_i)$
• # of parameters: $O(N^2 M + N^2)$
Are the two types of HMMs equivalent?
• For each state-emission $\mathrm{HMM}_1$, there is an arc-emission $\mathrm{HMM}_2$ such that for any sequence O, $P(O \mid \mathrm{HMM}_1) = P(O \mid \mathrm{HMM}_2)$.
• The reverse is also true.
Q3 and Q4 of hw1.
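Definition of arc-emission HMM
• An HMM is a tuple $(S, \Sigma, \Pi, A, B)$:
  • A set of states $S = \{s_1, s_2, \dots, s_N\}$.
  • A set of output symbols $\Sigma = \{w_1, \dots, w_M\}$.
  • Initial state probabilities: $\Pi = \{\pi_i\}$.
  • State transition prob: $A = \{a_{ij}\}$, where $a_{ij}$ is the probability of moving from $s_i$ to $s_j$.
  • Symbol emission prob: $B = \{b_{ijk}\}$, where $b_{ijk}$ is the probability of emitting $w_k$ on the arc from $s_i$ to $s_j$.
• State sequence: $X_{1,n}$
• Output sequence: $O_{1,n}$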
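Constraints
• $\sum_{i=1}^{N} \pi_i = 1$
• $\sum_{j=1}^{N} a_{ij} = 1$ for every i
• $\sum_{k=1}^{M} b_{ijk} = 1$ for every i and j
• For any integer n and any HMM: $\sum_{|O_{1,n}| = n} P(O_{1,n} \mid \mathrm{HMM}) = 1$
Q2 of hw1.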
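The property that the probabilities of all output sequences of a fixed length n sum to 1 can be checked numerically (this is a sanity check, not the proof the homework asks for). The sketch below enumerates every output sequence and every state sequence for a small arc-emission HMM; the parameter values are hypothetical and chosen only so that each distribution sums to 1.

```python
from itertools import product
from typing import List, Sequence

# Hypothetical arc-emission HMM with N = 2 states and M = 2 output symbols.
PI = [0.6, 0.4]                                   # PI[i]: initial probability of s_i
A = [[0.7, 0.3], [0.2, 0.8]]                      # A[i][j]: transition s_i -> s_j
B = [[[0.5, 0.5], [0.1, 0.9]],                    # B[i][j][k]: emit w_k on arc i -> j
     [[0.3, 0.7], [0.6, 0.4]]]


def joint_probability(pi: List[float], a, b, states: Sequence[int],
                      obs: Sequence[int]) -> float:
    """P(X, O) for n outputs emitted on the arcs of an (n+1)-state path."""
    p = pi[states[0]]
    for t, o in enumerate(obs):
        i, j = states[t], states[t + 1]
        p *= a[i][j] * b[i][j][o]
    return p


def total_mass(pi: List[float], a, b, num_symbols: int, n: int) -> float:
    """Sum of P(O) over every output sequence O of length n."""
    num_states = len(pi)
    return sum(
        joint_probability(pi, a, b, states, obs)
        for obs in product(range(num_symbols), repeat=n)
        for states in product(range(num_states), repeat=n + 1)
    )


if __name__ == "__main__":
    for n in range(5):
        print(n, total_mass(PI, A, B, 2, n))
```

The enumeration is exponential in n, so it is only practical for very small models; it is meant to illustrate the constraint rather than to compute anything useful.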
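Properties of HMM
• Limited horizon: $P(X_{t+1} = s_k \mid X_1, \dots, X_t) = P(X_{t+1} = s_k \mid X_t)$
• Time invariance: the probabilities do not change over time: $P(X_{t+1} = s_k \mid X_t = s_i) = P(X_2 = s_k \mid X_1 = s_i)$
• The states are hidden because we know the structure of the machine (i.e., S and Σ), but we don’t know which state sequences generate a particular output.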
Applications of HMM
• N-gram POS tagging
  • Bigram tagger: $o_i$ is a word, and $s_i$ is a POS tag.
  • Trigram tagger: $o_i$ is a word, and $s_i$ is ??
• Other tagging problems:
  • Word segmentation
  • Chunking
  • NE tagging
  • Punctuation prediction
  • …
• Other applications: ASR, ….
Three fundamental questions for HMMs • Finding the probability of an observation • Finding the best state sequence • Training: estimating parameters
(1) Finding the probability of the observation
Forward probability: the probability of producing $O_{1,t-1}$ while ending up in state $s_i$:
$\alpha_i(t) = P(O_{1,t-1}, X_t = s_i)$
Calculating forward probability
Initialization: $\alpha_i(1) = \pi_i$
Induction: $\alpha_j(t+1) = \sum_{i=1}^{N} \alpha_i(t)\, a_{ij}\, b_{ij o_t}$
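The forward recursion for an arc-emission HMM can be sketched as follows. The table is 0-based, so `alpha[t][i]` holds the probability of producing the first t outputs and ending up in state i; the observation probability is the sum of the last row. The parameter values are hypothetical, and the function names are illustrative.

```python
from typing import List, Sequence

# Hypothetical arc-emission HMM with 2 states and 2 output symbols.
PI = [0.6, 0.4]                                   # PI[i]: initial probability of s_i
A = [[0.7, 0.3], [0.2, 0.8]]                      # A[i][j]: transition s_i -> s_j
B = [[[0.5, 0.5], [0.1, 0.9]],                    # B[i][j][k]: emit w_k on arc i -> j
     [[0.3, 0.7], [0.6, 0.4]]]


def forward(pi: List[float], a, b, obs: Sequence[int]) -> List[List[float]]:
    """Return the forward table; alpha[t][i] = P(first t outputs, state i)."""
    n = len(pi)
    alpha = [list(pi)]                            # initialization: no output yet
    for o in obs:
        prev = alpha[-1]
        # Induction: sum over the previous state i of alpha_i * a_ij * b_ijo.
        alpha.append([
            sum(prev[i] * a[i][j] * b[i][j][o] for i in range(n))
            for j in range(n)
        ])
    return alpha


def observation_probability(pi: List[float], a, b, obs: Sequence[int]) -> float:
    """P(O | HMM): sum of the forward probabilities after the last output."""
    return sum(forward(pi, a, b, obs)[-1])


if __name__ == "__main__":
    print(observation_probability(PI, A, B, [0, 1, 1]))
```

The table has one more row than there are outputs, and the cost is proportional to N²T rather than to the number of state sequences.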
(2) Finding the best state sequence
[Figure: state sequence X1, X2, …, XT, XT+1 with outputs o1, o2, …, oT]
• Given the observation $O_{1,T} = o_1 \dots o_T$, find the state sequence $X_{1,T+1} = X_1 \dots X_{T+1}$ that maximizes $P(X_{1,T+1} \mid O_{1,T})$.
• Viterbi algorithm
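Viterbi algorithm
The probability of the best path that produces $O_{1,t-1}$ while ending up in state $s_i$:
$\delta_i(t) = \max_{X_{1,t-1}} P(X_{1,t-1}, O_{1,t-1}, X_t = s_i)$
Initialization: $\delta_i(1) = \pi_i$
Induction: $\delta_j(t+1) = \max_{i} \delta_i(t)\, a_{ij}\, b_{ij o_t}$
Modify it to allow epsilon emission: Q5 of hw1.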
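A minimal sketch of Viterbi decoding for an arc-emission HMM without epsilon emission is shown below. It keeps one backpointer per state and time step and maximizes the joint probability P(X, O), which picks the same state sequence as maximizing P(X | O) for a fixed observation. The parameter values are hypothetical.

```python
from typing import List, Sequence, Tuple

# Hypothetical arc-emission HMM with 2 states and 2 output symbols.
PI = [0.6, 0.4]                                   # PI[i]: initial probability of s_i
A = [[0.7, 0.3], [0.2, 0.8]]                      # A[i][j]: transition s_i -> s_j
B = [[[0.5, 0.5], [0.1, 0.9]],                    # B[i][j][k]: emit w_k on arc i -> j
     [[0.3, 0.7], [0.6, 0.4]]]


def viterbi(pi: List[float], a, b, obs: Sequence[int]) -> Tuple[List[int], float]:
    """Return (best state sequence of length len(obs) + 1, its joint probability)."""
    n = len(pi)
    delta = list(pi)                              # initialization
    backpointers: List[List[int]] = []
    for o in obs:
        new_delta, pointers = [], []
        for j in range(n):
            # Best predecessor i for reaching state j while emitting o.
            best_i = max(range(n), key=lambda i: delta[i] * a[i][j] * b[i][j][o])
            pointers.append(best_i)
            new_delta.append(delta[best_i] * a[best_i][j] * b[best_i][j][o])
        delta = new_delta
        backpointers.append(pointers)
    # Trace back from the best final state.
    last = max(range(n), key=lambda i: delta[i])
    path = [last]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path, delta[last]


if __name__ == "__main__":
    print(viterbi(PI, A, B, [0, 1, 1]))
```

For long sequences the products underflow, so a practical decoder would typically work with log probabilities and replace multiplication with addition.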
Summary of HMM
• Two types of HMMs: state-emission and arc-emission HMM
• Properties: Markov assumption
• Applications: POS-tagging, etc.
• Finding the probability of an observation: forward probability
• Decoding: Viterbi decoding
Relation between WFA and HMM • HMM can be seen as a special type of WFA. • Given an HMM, how to build an equivalent WFA?
Converting HMM into WFA
Given an HMM, build a WFA such that for any input sequence O, P(O|HMM)=P(O|WFA).
• Build a WFA: add a final state and arcs to it
• Show that there is a one-to-one mapping between the paths in HMM and the paths in WFA
• Prove that the probabilities in HMM and in WFA are identical.
HMM → WFA
Need to create a new state (the final state) and add edges to it.
The WFA is not a PFA.
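One possible construction, sketched under assumptions the slides do not spell out: keep the HMM states, give the arc from i to j labeled with symbol k the weight a_ij × b_ijk, use the initial probabilities as initial weights, and add an ε-arc of weight 1 from every state to a new final state. The weight of 1 on the ε-arcs and all function names are assumptions; the parameter values are hypothetical.

```python
from typing import Dict, List, Sequence, Set, Tuple

# Hypothetical arc-emission HMM with 2 states and 2 output symbols.
PI = [0.6, 0.4]
A = [[0.7, 0.3], [0.2, 0.8]]
B = [[[0.5, 0.5], [0.1, 0.9]], [[0.3, 0.7], [0.6, 0.4]]]

WFA = Tuple[Dict[int, float], Dict[Tuple[int, int, int], float],
            Dict[Tuple[int, int], float], Set[int]]


def hmm_probability(pi: List[float], a, b, obs: Sequence[int]) -> float:
    """P(O | HMM) by the forward recursion."""
    n = len(pi)
    alpha = list(pi)
    for o in obs:
        alpha = [sum(alpha[i] * a[i][j] * b[i][j][o] for i in range(n)) for j in range(n)]
    return sum(alpha)


def hmm_to_wfa(pi: List[float], a, b) -> WFA:
    """Build (initial weights, symbol arcs, epsilon arcs, final states)."""
    n, m = len(pi), len(b[0][0])
    qf = n                                        # index of the new final state
    init = {i: pi[i] for i in range(n)}
    arcs = {(i, k, j): a[i][j] * b[i][j][k]
            for i in range(n) for j in range(n) for k in range(m)}
    eps_arcs = {(i, qf): 1.0 for i in range(n)}   # assumed weight 1 into qf
    return init, arcs, eps_arcs, {qf}


def wfa_weight(wfa: WFA, obs: Sequence[int]) -> float:
    """Total weight of all accepting paths for obs.

    Epsilon arcs only lead into the final state, which has no outgoing arcs,
    so it is enough to follow them once after the last input symbol.
    """
    init, arcs, eps_arcs, finals = wfa
    alpha = dict(init)
    for o in obs:
        nxt: Dict[int, float] = {}
        for (src, sym, dst), w in arcs.items():
            if sym == o and src in alpha:
                nxt[dst] = nxt.get(dst, 0.0) + alpha[src] * w
        alpha = nxt
    return sum(alpha[src] * w for (src, dst), w in eps_arcs.items()
               if src in alpha and dst in finals)


def outgoing_weight(wfa: WFA, state: int) -> float:
    """Sum of the weights on all arcs leaving a state (symbol and epsilon arcs)."""
    _, arcs, eps_arcs, _ = wfa
    return (sum(w for (src, _, _), w in arcs.items() if src == state)
            + sum(w for (src, _), w in eps_arcs.items() if src == state))


if __name__ == "__main__":
    wfa = hmm_to_wfa(PI, A, B)
    print(hmm_probability(PI, A, B, [0, 1, 1]), wfa_weight(wfa, [0, 1, 1]))
    print(outgoing_weight(wfa, 0))
```

In this sketch the weights leaving each original state add up to 2 rather than 1, which is one way to see why the resulting WFA is not a PFA.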
A slightly different definition of HMM
• An HMM is a tuple $(S, \Sigma, \Pi, A, B, q_f)$:
  • A set of states $S = \{s_1, s_2, \dots, s_N\}$.
  • A set of output symbols $\Sigma = \{w_1, \dots, w_M\}$.
  • Initial state probabilities: $\Pi = \{\pi_i\}$.
  • State transition prob: $A = \{a_{ij}\}$.
  • Symbol emission prob: $B = \{b_{ijk}\}$.
  • $q_f$ is the final state: there are no outgoing edges from $q_f$.
Constraints
For any HMM (under this new definition): $\sum_{O \in \Sigma^*} P(O \mid \mathrm{HMM}) = 1$
PFA → HMM
Need to add a new final state and edges to it.
Project: Part 1
• Learn to use Carmel (a WFST package)
• Use Carmel as an HMM Viterbi decoder for a trigram POS tagger.
• The instructions will be handed out on 1/12, and the project is due on 1/19.
Summary • FSA • HMM • Relation between FSA and HMM • HMM (the common def) is a special case of WFA • HMM (a different def) is equivalent to PFA.