Q/A System, First Stage: Classification • Project by: Abdullah Alotayq, Dong Wang, Ed Pham
Query Processing • Classification package: Mallet • Classifiers: MaxEnt, DecisionTree, C45, NaiveBayes, AdaBoost, Winnow, BalancedWinnow, Bagging, etc.
Features • Semantic • Morphological • Neighboring (Syntactic)
Stemming • nltk stemmer
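As a rough illustration of the stemming step, the sketch below reduces question tokens to their stems with NLTK's `PorterStemmer`. The slides only say "nltk stemmer", so the choice of the Porter algorithm here is an assumption (it is listed among the software packages), and the helper name `stem_tokens` and the sample tokens are hypothetical.

```python
from nltk.stem import PorterStemmer


def stem_tokens(tokens):
    """Map each token to its Porter stem (assumed stemmer, see lead-in)."""
    stemmer = PorterStemmer()
    return [stemmer.stem(tok) for tok in tokens]


# Hypothetical sample tokens, not taken from the project data.
sample = ["questions", "running"]
stems = stem_tokens(sample)
print(stems)
```

Stemming collapses inflected forms onto one feature, which reduces the number of distinct features the classifier sees.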
N-grams • Bigrams:
Trigrams: • Poor classification results • 0.48 • 0.478 • Not a good strategy.
NER (Named Entity Recognition) • NLTK NER • Uses a pre-trained model for this task • 6 types of NE
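A minimal sketch of how bigram and trigram features might be extracted from a tokenized question. The slides do not show the project's own extraction code, so the function names, the underscore-joined feature format, and the sample question are all assumptions made for illustration.

```python
from typing import List, Tuple


def ngrams(tokens: List[str], n: int) -> List[Tuple[str, ...]]:
    """Return all contiguous n-token windows, in order of appearance."""
    if n < 1:
        raise ValueError("n must be >= 1")
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def ngram_features(tokens: List[str], n: int) -> List[str]:
    """Join each n-gram into one feature string, e.g. 'what_is'."""
    return ["_".join(gram) for gram in ngrams(tokens, n)]


# Hypothetical sample question, not taken from the project data.
question = "what is the capital of france".split()
bigrams = ngram_features(question, 2)
trigrams = ngram_features(question, 3)
print(bigrams)
print(trigrams)
```

Longer n-grams recur far less often across short questions than shorter ones, which is one plausible reason trigram features could underperform.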
Frequencies Training Data:
No named entity detected • Training data: 3533 (64.8%) • Test data: 353 (70.6%) -> data sparseness problem
NER Results & Future Work • Test data accuracy = 0.802 • We might try other NE tools that provide more NE types and cover a larger share of the training and test data.
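The coverage figures above can be reproduced with a small helper that computes the share of items for which the tagger found no entity. This is only a sketch: `no_entity_share` and the toy input are hypothetical, and the dataset totals are not stated on the slides; they are back-calculated here from the reported counts and percentages.

```python
def no_entity_share(entities_per_item):
    """Fraction of items whose list of detected entities is empty."""
    if not entities_per_item:
        raise ValueError("need at least one item")
    empty = sum(1 for ents in entities_per_item if not ents)
    return empty / len(entities_per_item)


# Toy input (hypothetical): one entity list per question.
toy = [["PERSON"], [], [], []]
toy_share = no_entity_share(toy)
print(toy_share)

# Totals implied by the slide's numbers (an inference, not a stated fact).
implied_train_total = round(3533 / 0.648)
implied_test_total = round(353 / 0.706)
print(implied_train_total, implied_test_total)
```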
Binary and Real Values • Testing for potential improvement. • Best-performing classifiers: • Binary: BalancedWinnow, test data accuracy = 0.804; MaxEnt, test accuracy mean = 0.78 • Real values: BalancedWinnow, test data accuracy = 0.784; MaxEnt, test data accuracy = 0.758
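The difference between the two encodings can be sketched as follows. The slides do not define "real values", so this example assumes they are raw occurrence counts; the function names and the sample question are hypothetical.

```python
from collections import Counter


def real_valued_features(tokens):
    """Feature value = number of times the token occurs (assumed meaning)."""
    return dict(Counter(tokens))


def binary_features(tokens):
    """Feature value = 1 if the token occurs at all, regardless of count."""
    return {tok: 1 for tok in set(tokens)}


# Hypothetical sample question, not taken from the project data.
tokens = "how many people live in how many cities".split()
real = real_valued_features(tokens)
binary = binary_features(tokens)
print(real)
print(binary)
```

Both encodings produce the same feature set; only the values differ, so the comparison isolates the effect of count information on the classifier.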
Proposed future improvements • WordNet senses • Class-specific related words
Issues • Poor performance on some refinements. • Low accuracy scores: 0.42, 0.54 • Memory-intensive classifiers. • Classifiers produced some error messages.
Successes • Made progress in creating the system. • Gained hands-on experience with classifiers and NLP packages. • Learned ways to improve classification results.
Readings that helped • Employing Two Question Answering Systems in TREC-2005, Sanda Harabagiu et al.
Software packages used • Mallet • NLTK • Porter stemmer • Self-written code • Stanford Parser, Berkeley Parser