Explore the concept of Two-Stage P-R Maximization and Error Correction in the context of MinErr Classifier for identifying new words or compounds. Understand the decision rules and advantages of this approach in achieving precision and recall targets.
Two-Stage P-R Maximization (Errata- Lexicon+MaxPR -10 @2017/04/26) • Why two stages? • No simple analytical decision rule is capable of achieving an arbitrary user-specified criterion function of precision and recall • … • Error: • If minimum error => • Correction: • If minimum error => • or Jing-Shin Chang, EE, National Tsing-Hua University
MinErr Classifier: Two-Class Classifier for Identifying New Words or Compound Words • Input: n-grams (n-word compounds, n-character words) in the text corpus • Output: a class label ("word" or "non-word") assigned to each n-gram • Classifier: a log-likelihood ratio (LLR) tester (minimum error classifier) • Decision Rules: • Advantage: ensures minimum classification error (with the threshold set to 0) if the distributions are known.
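The decision rule itself did not survive on the slide, so the following is only a minimal sketch of a generic LLR tester, not the author's formula. It assumes a single scalar feature per n-gram (for example an association score), Gaussian class-conditional densities with known parameters, and equal priors and error costs, so that a threshold of 0 gives the minimum-error decision. The names `gaussian_log_pdf`, `llr`, `classify`, and the parameter values are hypothetical.

```python
import math


def gaussian_log_pdf(x, mean, std):
    """Log density of a 1-D Gaussian (assumed form of the class distributions)."""
    return -0.5 * math.log(2 * math.pi * std ** 2) - (x - mean) ** 2 / (2 * std ** 2)


def llr(x, word_params, nonword_params):
    """Log-likelihood ratio: log f(x | word) - log f(x | non-word)."""
    return gaussian_log_pdf(x, *word_params) - gaussian_log_pdf(x, *nonword_params)


def classify(x, word_params, nonword_params, threshold=0.0):
    """Label an n-gram as "word" when its LLR reaches the threshold."""
    if llr(x, word_params, nonword_params) >= threshold:
        return "word"
    return "non-word"


# Hypothetical (mean, std) for the feature in each class.
WORD = (5.0, 1.0)
NONWORD = (1.0, 1.0)

for feature in (4.0, 2.0):
    print(feature, classify(feature, WORD, NONWORD))
```

Raising or lowering `threshold` trades recall for precision, which is why a fixed threshold of 0 minimizes error but does not by itself meet an arbitrary precision-recall target.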
Filter: Two-Class Classifier (Log-Likelihood Ratio Ranking Module) • Input: n-grams in the unsegmented text corpus • Output: a class label ("word" or "non-word") assigned to each n-gram • Classifier: a log-likelihood ratio (LLR) tester (minimum error classifier) • Decision Rules: • Advantage: ensures minimum classification error (with the threshold set to 0) if the distributions are known.
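To illustrate how a minimum-error cutoff differs from a precision-recall criterion, here is a rough sketch that measures precision and recall of an LLR filter at a given cutoff and then searches the observed scores for the cutoff that maximizes a user-supplied criterion. This brute-force sweep is an assumption made for illustration; it is not the second stage described in the slides, whose details are not recoverable here. The scores, labels, and function names are all hypothetical.

```python
def precision_recall(scored, threshold):
    """scored: list of (llr_score, is_word) pairs; accept scores >= threshold."""
    accepted = [is_word for score, is_word in scored if score >= threshold]
    true_pos = sum(accepted)
    total_words = sum(is_word for _, is_word in scored)
    precision = true_pos / len(accepted) if accepted else 0.0
    recall = true_pos / total_words if total_words else 0.0
    return precision, recall


def best_threshold(scored, criterion):
    """Try every observed score as a cutoff; keep the one maximizing criterion(p, r)."""
    candidates = sorted({score for score, _ in scored}, reverse=True)
    return max(candidates, key=lambda t: criterion(*precision_recall(scored, t)))


def f1(precision, recall):
    """One possible criterion function of precision and recall."""
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0


# Hypothetical LLR scores with gold labels (True = real word).
SCORED = [(4.0, True), (2.5, True), (1.0, False),
          (0.5, True), (-1.0, False), (-3.0, False)]

print(precision_recall(SCORED, 0.0))
print(best_threshold(SCORED, f1))
```

Any other function of precision and recall can be passed in place of `f1`, for example one that rewards recall only after precision passes a target.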