Conditional Random Fields

Conditional Random Fields: a form of discriminative modelling that has been used successfully in various domains, such as part-of-speech tagging and other Natural Language Processing tasks. A CRF processes evidence bottom-up and combines multiple features of the data.

nitza

Presentation Transcript


  1. Conditional Random Fields • A form of discriminative modelling • Has been used successfully in various domains such as part-of-speech tagging and other Natural Language Processing tasks • Processes evidence bottom-up • Combines multiple features of the data • Builds the probability P(sequence | data)

  2. Conditional Random Fields • CRFs are based on the idea of Markov Random Fields • Modelled as an undirected graph connecting labels with observations • Observations in a CRF are not modelled as random variables • [Figure: chain of labels /k/ /k/ /iy/ /iy/ /iy/, each connected to an observation X] • State functions help determine the identity of the state • Transition functions add associations between transitions from one label to another

  3. Conditional Random Fields • Hammersley-Clifford Theorem states that a random field is an MRF iff it can be described in the above form (the equation image is not preserved in this transcript) • The exponent is the sum of the clique potentials of the undirected graph • State Feature Function f([x is stop], /t/): one possible state feature function for our attributes and labels • State Feature Weight λ=10: one possible weight value for this state feature (strong) • Transition Feature Function g(x, /iy/, /k/): one possible transition feature function; indicates /k/ followed by /iy/ • Transition Feature Weight μ=4: one possible weight value for this transition feature
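
The form that slide 3 refers to did not survive in the transcript. As an assumption, the standard linear-chain CRF form that matches the slide's symbols (state feature functions f with weights λ, transition feature functions g with weights μ) can be written as follows. The indices i, j, k and the normaliser Z(x) are notation introduced here, not taken from the slides; g takes the current label before the previous one, following the slide's g(x, /iy/, /k/).

```latex
P(y \mid x) = \frac{1}{Z(x)} \exp\!\left( \sum_{i} \left( \sum_{j} \lambda_j \, f_j(x, y_i) + \sum_{k} \mu_k \, g_k(x, y_i, y_{i-1}) \right) \right),
\qquad
Z(x) = \sum_{y'} \exp\!\left( \sum_{i} \left( \sum_{j} \lambda_j \, f_j(x, y'_i) + \sum_{k} \mu_k \, g_k(x, y'_i, y'_{i-1}) \right) \right)
```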

  4. Conditional Random Fields • Conceptual Overview • Each attribute of the data we are trying to model fits into a feature function that associates the attribute and a possible label • A positive value if the attribute appears in the data • A zero value if the attribute is not in the data • Each feature function carries a weight that gives the strength of that feature function for the proposed label • High positive weights indicate a good association between the feature and the proposed label • High negative weights indicate a negative association between the feature and the proposed label • Weights close to zero indicate the feature has little or no impact on the identity of the label
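
The conceptual overview on slide 4 can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not the system used in the experiments: the label set, the attribute names and most of the weights are hypothetical. Only λ=10 for ([x is stop], /t/) and μ=4 for /k/ followed by /iy/ come from slide 3. Normalisation is done by brute-force enumeration, which is only feasible for tiny examples.

```python
import itertools
import math

# Hypothetical toy label set.
LABELS = ["/k/", "/iy/", "/t/"]

# (attribute, label) -> weight. Only the 10.0 entry appears on the slides;
# the other two values are made up for illustration.
STATE_WEIGHTS = {
    ("stop", "/t/"): 10.0,
    ("stop", "/k/"): 8.0,
    ("high_vowel", "/iy/"): 9.0,
}

# (previous label, current label) -> weight; mu = 4 is taken from slide 3.
TRANSITION_WEIGHTS = {("/k/", "/iy/"): 4.0}


def state_feature(frame_attributes, attribute, label, proposed_label):
    """Binary feature: 1.0 if the attribute is present and the label matches."""
    return 1.0 if attribute in frame_attributes and proposed_label == label else 0.0


def sequence_score(frames, labels):
    """Weighted sum of state and transition features (the exponent of the CRF)."""
    score = 0.0
    for i, (attributes, y) in enumerate(zip(frames, labels)):
        for (attribute, label), weight in STATE_WEIGHTS.items():
            score += weight * state_feature(attributes, attribute, label, y)
        if i > 0:
            score += TRANSITION_WEIGHTS.get((labels[i - 1], y), 0.0)
    return score


def sequence_probability(frames, labels):
    """P(sequence | data), normalised over every possible label sequence."""
    z = sum(
        math.exp(sequence_score(frames, candidate))
        for candidate in itertools.product(LABELS, repeat=len(frames))
    )
    return math.exp(sequence_score(frames, labels)) / z


# Two frames: the first looks like a stop, the second like a high vowel.
frames = [{"stop"}, {"high_vowel"}]
best = max(
    itertools.product(LABELS, repeat=len(frames)),
    key=lambda candidate: sequence_score(frames, candidate),
)
```

With these toy weights the state features alone would favour /t/ for the first frame, but the transition weight for /k/ followed by /iy/ tips the best sequence to /k/ /iy/, which is the kind of interaction the weights are meant to capture.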

  5. Experimental Setup • Attribute Detectors • ICSI QuickNet Neural Networks • Two different types of attributes • Phonological feature detectors • Place, Manner, Voicing, Vowel Height, Backness, etc. • Features are grouped into eight classes, with each class having a variable number of possible values based on the IPA phonetic chart • Phone detectors • Neural networks output based on the phone labels – one output per label • Classifiers were applied to 2960 utterances from the TIMIT training set

  6. Experimental Setup • Outputs from the neural nets are themselves treated as feature functions for the observed sequence; each attribute/label combination gives us a value for one feature function • Note that this makes the feature functions non-binary (real-valued) features
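
One way to picture slide 6 is a factory that builds one real-valued state feature function per attribute/label combination, returning the detector output instead of 0 or 1. The following is a minimal sketch; the function name, the attribute names and the example outputs are hypothetical and do not come from the QuickNet setup.

```python
def make_state_features(attribute_names, labels):
    """Build one feature function per (attribute, label) combination.

    Each function returns the detector output for its attribute when the
    proposed label matches, and 0.0 otherwise, so features are real-valued.
    """
    def make(attribute_index, label):
        def feature(frame_outputs, proposed_label):
            if proposed_label == label:
                return frame_outputs[attribute_index]
            return 0.0
        return feature

    return {
        (attribute, label): make(index, label)
        for index, attribute in enumerate(attribute_names)
        for label in labels
    }


# Hypothetical detector outputs for a single frame, in attribute order.
attributes = ["stop", "voiced", "high_vowel"]
features = make_state_features(attributes, ["/k/", "/iy/", "/t/"])
frame_outputs = [0.91, 0.12, 0.03]
stop_given_t = features[("stop", "/t/")](frame_outputs, "/t/")
```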

  7. Experiment 1 • Goal: Implement a Conditional Random Field Model on ASAT-style phonological feature data • Perform phone recognition • Compare results to those obtained via a Tandem HMM system

  8. Experiment 1 - Results • CRF system trained on monophones with these features achieves accuracy superior to HMM on monophones • CRF comes close to achieving HMM triphone accuracy

  9. Experiment 2 • Goals: • Apply CRF model to phone classifier data • Apply CRF model to combined phonological feature classifier data and phone classifier data • Perform phone recognition • Compare results to those obtained via a Tandem HMM system

  10. Experiment 2 - Results • Note that the Tandem HMM result is the best result, obtained with only the top 39 features following a principal components analysis

  11. Experiment 3 • Goal: • Previous CRF experiments used phone posteriors for the CRF, and linear outputs transformed via a Karhunen-Loeve (KL) transform for the HMM system • This transformation is needed to improve the HMM performance through decorrelation of inputs • Using the same linear outputs as the HMM system, do our results change?
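
The decorrelation step mentioned on slide 11 can be sketched with NumPy. This is a generic KL (PCA-style) transform estimated from the data's covariance matrix, offered as an assumption about what such a step looks like; the slides do not describe how the transform was computed in the Tandem system, and `kl_transform` is a hypothetical name.

```python
import numpy as np


def kl_transform(frames, n_components=None):
    """Project frames onto the eigenvectors of their covariance matrix.

    frames: array of shape (n_frames, n_features).
    n_components: keep only the leading components (None keeps all).
    """
    frames = np.asarray(frames, dtype=float)
    centered = frames - frames.mean(axis=0)
    covariance = np.cov(centered, rowvar=False)
    # eigh returns eigenvalues in ascending order for symmetric matrices.
    eigenvalues, eigenvectors = np.linalg.eigh(covariance)
    order = np.argsort(eigenvalues)[::-1]
    basis = eigenvectors[:, order[:n_components]]
    return centered @ basis


# Synthetic, deliberately correlated inputs standing in for linear net outputs.
rng = np.random.default_rng(0)
raw = rng.normal(size=(500, 3))
mixing = np.array([[1.0, 0.8, 0.2], [0.0, 1.0, 0.5], [0.0, 0.0, 1.0]])
correlated = raw @ mixing
decorrelated = kl_transform(correlated)
```

After the transform the off-diagonal entries of the covariance matrix are numerically zero, which is the property a diagonal-covariance Gaussian HMM front end benefits from. Passing `n_components` truncates to the leading components, as in the top-39-feature setting mentioned on slide 10.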

  12. Experiment 3 - Results Also shown – Adding both feature sets together and giving the system supposedly redundant information leads to a gain in accuracy

  13. Experiment 4 • Goal: • Previous CRF experiments did not allow for realignment of the training labels • Boundaries for labels provided by TIMIT hand transcribers used throughout training • HMM systems allowed to shift boundaries during EM learning • If we allow for realignment in our training process, can we improve the CRF results?
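
The slides do not say how realignment was carried out. One plausible reading is a Viterbi-style forced alignment: keep the transcribed phone order, but let the current model choose where each boundary falls, then retrain on the new boundaries. The sketch below shows only the alignment step, with hypothetical per-frame scores and no transition features, so it should be read as an illustration of the idea and not as the procedure used in Experiment 4.

```python
def force_align(frame_scores, phone_sequence):
    """Assign each frame a phone, keeping the transcript order.

    frame_scores: list of dicts mapping phone label -> score for that frame.
    phone_sequence: ordered phones from the transcript; each must cover
    at least one frame. Returns the per-frame labels with maximal total score.
    """
    n_frames, n_phones = len(frame_scores), len(phone_sequence)
    if n_phones == 0 or n_frames < n_phones:
        raise ValueError("need at least one frame per phone")

    neg_inf = float("-inf")
    best = [[neg_inf] * n_phones for _ in range(n_frames)]
    advanced = [[False] * n_phones for _ in range(n_frames)]
    best[0][0] = frame_scores[0][phone_sequence[0]]

    for t in range(1, n_frames):
        for n in range(n_phones):
            stay = best[t - 1][n]
            move = best[t - 1][n - 1] if n > 0 else neg_inf
            previous = max(stay, move)
            if previous > neg_inf:
                best[t][n] = previous + frame_scores[t][phone_sequence[n]]
                advanced[t][n] = move > stay

    # Trace back from the last phone at the last frame.
    labels = [None] * n_frames
    n = n_phones - 1
    for t in range(n_frames - 1, -1, -1):
        labels[t] = phone_sequence[n]
        if advanced[t][n]:
            n -= 1
    return labels


# Hypothetical scores for four frames of a /k/ /iy/ utterance.
scores = [
    {"/k/": 2.0, "/iy/": 0.0},
    {"/k/": 1.5, "/iy/": 0.5},
    {"/k/": 0.2, "/iy/": 1.0},
    {"/k/": 0.0, "/iy/": 2.0},
]
alignment = force_align(scores, ["/k/", "/iy/"])
```

In a training loop, an alignment like this would replace the fixed hand-transcribed boundaries between passes, which is the freedom slide 13 says the HMM systems already had during EM learning.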

  14. Experiment 4 - Results Allowing realignment gives accuracy results for a monophone trained CRF that are superior to a triphone trained HMM, with fewer parameters
