CISC 889: Statistical Approaches to NLP, Spring 2003
Dialogue Act Tagging Using TBL
Sachin Kamboj
Outline • Dialogue Acts • Transformation Based Learning (TBL) • Introduction • Training Phase • Example • Motivation for use of TBL in Dialogue • Limitations of TBL • Monte Carlo Version of TBL • Other Extensions • References
Dialogue Acts • For a computer system to understand (and participate in) a dialogue, it must recognize the intentions of the speaker. • A dialogue act is defined as a concise abstraction of the intentional function of a speaker.
Transformation-based Learning • Uses machine learning to generate a sequence of rules for tagging data. • The first rules are general, making sweeping generalizations and introducing many errors. • Subsequent rules are more specific and usually correct specific errors made by earlier rules.
TBL: Training Phase • START: label each utterance with an initial tag. • For each incorrect tag, generate ALL candidate rules that would fix it. • Compute a score for each rule. • Find the highest-scoring rule and apply it to the corpus. • Repeat from the rule-generation step until no rule is worth applying, then END.
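The training loop above can be sketched in a few lines of Python. The single word-based rule template ("if the utterance contains word w, change tag X to Y"), the tag names, and the toy corpus are invented for illustration; they are not the feature set or data used in the cited papers.

```python
# Minimal, runnable sketch of the TBL training loop (toy template only).

def tbl_train(corpus, gold, default_tag="STATEMENT"):
    # START: label every utterance with a default initial tag.
    tags = [default_tag] * len(corpus)
    rules = []  # learned sequence of (trigger_word, from_tag, to_tag)

    def score(rule):
        # Score = errors fixed minus correct tags broken.
        w, frm, to = rule
        return sum((1 if to == g else -1)
                   for u, t, g in zip(corpus, tags, gold)
                   if t == frm and w in u.split())

    while True:
        # Generate ALL rules that would fix some incorrectly tagged utterance.
        candidates = {(w, t, g)
                      for u, t, g in zip(corpus, tags, gold)
                      if t != g for w in u.split()}
        best = max(candidates, key=score, default=None)
        if best is None or score(best) <= 0:
            break  # no rule yields a net improvement: stop.
        # Apply the highest-scoring rule to the whole corpus, then repeat.
        w, frm, to = best
        tags = [to if t == frm and w in u.split() else t
                for u, t in zip(corpus, tags)]
        rules.append(best)
    return rules, tags
```

On a toy corpus such as `["can you pass the salt", "i will pass the salt", "do you like it"]` with gold tags `["REQUEST", "STATEMENT", "REQUEST"]`, the loop learns a single rule retagging utterances containing "you" as REQUEST, which fixes both errors without breaking the correct tag.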
TBL: Motivation for use in Dialogue • Computing Dialogue Acts is similar to computing POS tags: • POS tags depend on surrounding words • Dialogue Acts depend on surrounding utterances. • Advantages over other ML methods used for Dialogue Act Tagging: • Generates intuitive rules: • Allows researchers to gain insights into discourse theory • Leverage learning: • Can use tags that have already been computed. • Does not overtrain
Limitations • TBL does not assign a confidence measure to the tags. • TBL is highly dependent on the rule templates: • Rule templates need to be manually selected in advance. • However, it is difficult to find only and all the relevant templates. • Omissions would handicap the system. • Solution: allow the system to learn which templates are useful: • Allow an overly general set of templates. • TBL is capable of discarding irrelevant rules…
Limitations (cont.) • Problem with the solution: • The system becomes intractable with more than about 30 templates. • The system must generate and evaluate O( i·p·((v + 1)(2d + 1))^f ) rules, where: i = no. of instances, p = no. of passes, f = no. of features, d = max. feature distance, v = avg. no. of distinct values per feature.
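Plugging toy numbers into that bound makes the blow-up concrete. This assumes the bound is read as i·p·((v + 1)(2d + 1))^f; the parameter values below are invented for illustration, not taken from the papers.

```python
# Rough illustration of the rule-space blow-up, assuming the bound
# i * p * ((v + 1) * (2d + 1)) ** f; all numbers below are invented.

def rule_bound(i, p, v, d, f):
    """Upper bound on the number of rules generated and evaluated."""
    return i * p * ((v + 1) * (2 * d + 1)) ** f

# 1000 instances, 10 passes, 5 values per feature, max distance 2:
print(rule_bound(1000, 10, 5, 2, 2))  # 2 features -> 9,000,000 rules
print(rule_bound(1000, 10, 5, 2, 4))  # 4 features -> 8,100,000,000 rules
```

Because f sits in the exponent, adding features multiplies the rule space rather than merely extending it, which is why an overly general template set quickly becomes intractable.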
A Monte Carlo Version of TBL • Allows the system to consider a huge number of templates while maintaining tractability. • The system does not perform an exhaustive search through the space of possible rules. • Instead, only R randomly selected templates are instantiated on each pass. • Hence, the system considers only O( i·p·R ) rules. • The system still finds the best rules, because it has many opportunities (one per pass) to draw them.
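The sampling step can be sketched as follows; the template names, pool size, and value of R are placeholders, not the authors' actual template set.

```python
import random

# Sketch of the Monte Carlo step: on each pass, instantiate only R
# randomly chosen templates instead of all of them.

def sample_templates(pool, R, rng=None):
    """Draw R distinct templates (without replacement) for one pass."""
    rng = rng or random.Random(0)  # fixed seed for reproducibility
    return rng.sample(pool, min(R, len(pool)))

pool = [f"template-{n}" for n in range(10_000)]
chosen = sample_templates(pool, 100)
print(len(chosen))  # 100 of 10,000 templates examined this pass
```

Because templates are redrawn on every pass, a useful template skipped on one pass can still be selected on a later one, which is how the system retains a good chance of finding the best rules despite never searching exhaustively.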
Other Extensions • Committee-Based Sampling method to assign confidence measures • Automatic generation of “Cue Phrases”
Reference Sources • Samuel, Ken and Carberry, Sandra and Vijay-Shanker, K. 1998. Dialogue Act Tagging with Transformation-Based Learning. In Proceedings of the 17th International Conference on Computational Linguistics and the 36th Annual Meeting of the Association for Computational Linguistics. Montreal, Quebec, Canada. 1150-1156. • Samuel, Ken and Carberry, Sandra and Vijay-Shanker, K. 1998. An Investigation of Transformation-Based Learning in Discourse. In Machine Learning: Proceedings of the Fifteenth International Conference. Madison, Wisconsin. 497-505. • Samuel, Ken and Carberry, Sandra and Vijay-Shanker, K. 1999. Automatically Selecting Useful Phrases for Dialogue Act Tagging. In Proceedings of the Fourth Conference of the Pacific Association for Computational Linguistics. Waterloo, Ontario, Canada. • Samuel, Ken and Carberry, Sandra and Vijay-Shanker, K. 1998. Computing Dialogue Acts from Features with Transformation-Based Learning. In Applying Machine Learning to Discourse Processing: Papers from the 1998 AAAI Spring Symposium. Stanford, California. 90-97.