A High Performance Semi-Supervised Learning Method for Text Chunking Authors: Rie Kubota Ando, Tong Zhang
Structural learning. • Idea: • Learn "what good classifiers look like" from thousands of automatically generated auxiliary classification problems on unlabeled data. By doing so, the common predictive structure shared by the multiple classification problems can be discovered • Performance is better than previous results
Bootstrapping Methods: • Co-Training • Expectation Maximization • Goal: • Create a general learning framework
Contributions of the paper: • Design of a novel, robust semi-supervised method • Reporting higher performance than previous results
Standard Linear Prediction Model: • f(x) = wᵀx • w: weight vector • K-way classification: • Winner-takes-all • One predictor per class
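The winner-takes-all scheme above can be sketched in a few lines: each class gets its own weight vector, and the predicted class is the one with the highest linear score. The weights and feature vector below are illustrative values, not from the paper.

```python
# Standard linear model f(x) = w^T x with winner-takes-all
# K-way classification: one weight vector per class.

def dot(w, x):
    return sum(wi * xi for wi, xi in zip(w, x))

def predict(W, x):
    """Return the index of the class whose score w_k^T x is highest."""
    scores = [dot(w, x) for w in W]
    return max(range(len(W)), key=lambda k: scores[k])

W = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]  # one predictor per class (K = 3)
x = [0.2, 0.9]
print(predict(W, x))  # -> 1 (class 1 scores 0.9, beating 0.2 and 0.55)
```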
Linear model for structural learning: f_ℓ(Θ, x) = w_ℓᵀx + v_ℓᵀΘx, with ΘΘᵀ = I • Θ: shared projection (structure) matrix • I: identity matrix
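A minimal sketch of evaluating this joint model: the score is a problem-specific part w_ℓᵀx plus a shared part v_ℓᵀ(Θx), where Θ has orthonormal rows (ΘΘᵀ = I). All numbers below are illustrative.

```python
# Joint linear model f_l(Theta, x) = w_l^T x + v_l^T Theta x.

def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

def matvec(M, x):
    return [dot(row, x) for row in M]

def f(w, v, Theta, x):
    """Problem-specific score w^T x plus shared score v^T (Theta x)."""
    return dot(w, x) + dot(v, matvec(Theta, x))

Theta = [[1.0, 0.0, 0.0],
         [0.0, 1.0, 0.0]]   # 2 x 3 projection with orthonormal rows
w = [0.1, 0.2, 0.3]         # problem-specific weights
v = [0.5, -0.5]             # weights on the projected (shared) features
x = [1.0, 2.0, 3.0]
print(f(w, v, Theta, x))    # approximately 0.9 (1.4 from w^T x, -0.5 from v^T Theta x)
```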
Alternating Structure Optimization (ASO): • Fix (Θ, {v_ℓ}) and find the m predictors {u_ℓ} • Fix the m predictors {u_ℓ} and find (Θ, {v_ℓ}) • Iterate until a convergence criterion is met
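The structure-finding step of SVD-ASO takes Θ from the top left singular vectors of the matrix whose columns are the auxiliary predictor weights u_1..u_m. The hedged sketch below recovers just the top direction (h = 1) by power iteration on UUᵀ; the matrix and the fixed iteration count are illustrative simplifications, not the paper's algorithm in full.

```python
# Sketch of the SVD step in SVD-ASO: Theta's rows come from the top
# left singular vectors of U = [u_1 ... u_m] (p features x m predictors).
# Power iteration on U U^T recovers the single dominant direction.

def top_left_singular_vector(U, iters=200):
    p = len(U)       # feature dimension
    m = len(U[0])    # number of auxiliary predictors
    v = [1.0] * p
    for _ in range(iters):
        # w = U^T v (length m), then v = U w (length p), normalized
        w = [sum(U[i][l] * v[i] for i in range(p)) for l in range(m)]
        v = [sum(U[i][l] * w[l] for l in range(m)) for i in range(p)]
        norm = sum(vi * vi for vi in v) ** 0.5
        v = [vi / norm for vi in v]
    return v

# Three auxiliary predictors that mostly agree on the first feature
U = [[1.0, 0.9, 1.1],
     [0.1, 0.0, -0.1],
     [0.0, 0.1, 0.0]]
theta_row = top_left_singular_vector(U)
print(theta_row)  # dominated by the first coordinate
```

In the full algorithm this step alternates with retraining the m predictors, as the slide describes.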
Properties of Auxiliary Problems: • Automatic labeling • Relevance to the target task
Semi-Supervised Learning Procedure: • Create training data Z̃_ℓ for each auxiliary problem ℓ • Compute Θ from the training data through SVD-ASO • Minimize the empirical risk on the labeled data
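The final step can be sketched as follows: once Θ is learned from the auxiliary problems, each labeled example is mapped to the augmented features [x; Θx] and a classifier is trained on them. A simple perceptron stands in here for the paper's empirical risk minimizer; Θ and the data are toy values.

```python
# Final supervised step: minimize empirical risk on labeled data
# using the augmented representation [x; Theta x].

def matvec(M, x):
    return [sum(m * xi for m, xi in zip(row, x)) for row in M]

def augment(x, Theta):
    return x + matvec(Theta, x)   # concatenation [x; Theta x]

def perceptron(data, Theta, epochs=20):
    """Toy stand-in for the paper's ERM solver (labels y in {-1, +1})."""
    dim = len(augment(data[0][0], Theta))
    w = [0.0] * dim
    for _ in range(epochs):
        for x, y in data:
            z = augment(x, Theta)
            if y * sum(wi * zi for wi, zi in zip(w, z)) <= 0:
                w = [wi + y * zi for wi, zi in zip(w, z)]
    return w

Theta = [[0.6, 0.8]]                          # one shared direction (h = 1)
data = [([1.0, 1.0], 1), ([-1.0, -1.0], -1)]  # toy labeled examples
w = perceptron(data, Theta)
```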
Auxiliary Problem Creation: • Unsupervised strategy • Predict words • Partially-supervised strategy • Predict the top-k choices of the classifier
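The unsupervised "predict words" strategy turns unlabeled text into labeled auxiliary data for free: each token position becomes an example whose label is the word itself and whose features come from the surrounding words. The window size and the "L:"/"R:" feature encoding below are illustrative choices, not the paper's exact feature set.

```python
# Auxiliary examples from unlabeled text: label = current word,
# features = neighboring words within a small window.

def make_auxiliary_examples(tokens, window=1):
    examples = []
    for i, word in enumerate(tokens):
        left = tokens[max(0, i - window):i]
        right = tokens[i + 1:i + 1 + window]
        features = {"L:" + w for w in left} | {"R:" + w for w in right}
        examples.append((features, word))   # (context features, label)
    return examples

ex = make_auxiliary_examples("the dog chased the cat".split())
print(ex[1])  # -> ({'L:the', 'R:chased'}, 'dog')
```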
Extensions of the SVD-ASO Algorithm: • NLP applications have natural feature groupings • Perform localized optimization per group • Use a submatrix of the structure matrix Θ for each group • Regularize the non-negative components
Baseline Algorithms: • Supervised classifier • Co-Training • Self-Training
Results: • Refer to page 6 of the paper • Refer to page 7 of the paper
Syntactic Chunking Experiment: • Refer to page 7 of the paper • Refer to page 8 of the paper
Conclusion: • Presented a novel semi-supervised method • Learns a predictive low-dimensional feature projection from unlabeled data • The key is to create auxiliary problems automatically • Risk is low and the potential gain is large
Created a framework for exploring new ideas by designing a variety of auxiliary problems More on SVD: http://en.wikipedia.org/wiki/Singular_value_decomposition More on ERM: http://en.wikipedia.org/wiki/Empirical_risk_minimization