Semi-automatic Building Method for a Multidimensional Affect Dictionary for a New Language

Guillaume Pitel, Gregory Grefenstette • LREC 2008

Manually Built Resources • Defining Semantic Dimensions of Affect

Manually Built Resources • Creating seed words • L1: For each dimension, 2 to 4 words were selected, for a total of 229 seed words. • L2: L1 extended to an average of 10 words per class, for a total of 881 seed words.

Manually Built Resources • Creating gold standard • L3: Built using a synonym dictionary(*), with some words then manually deleted by a human annotator. • Total 4980 word-to-class relations (3513 distinct words; a word can belong to more than one class). • L2 is included in L3, which leaves 2632 words for evaluation.

Classifying affect words along their dimensions • SL-dLSA+SVM • Semantic Likeliness from diversified LSA and SVM. • δ ∈ {1..10, 15, 20, 25, 30}: window size (14 values). • Three window shapes were considered: [0, +δ], [−δ, +δ], [−δ, 0]. • For each word, each window yields a 300-dimensional LSA vector. • Total 12600 dimensions (14 sizes × 3 shapes × 300). • Raw co-occurrence matrices would have totaled some 5.3 million dimensions. • A 44-class SVM classifier was trained.
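The dimension count on the slide above can be checked with a short sketch. It only reproduces the arithmetic and the layout of the concatenated feature vector; the names `DELTAS`, `SHAPES`, `LSA_DIMS` and `feature_layout` are hypothetical, and the ordering of the blocks is an assumption, since the slides do not say how the vectors were concatenated.

```python
# Window sizes listed on the slide: 1..10, then 15, 20, 25, 30 (14 values).
DELTAS = list(range(1, 11)) + [15, 20, 25, 30]
# Three window shapes per size: right-only, symmetric, left-only.
SHAPES = ["[0,+d]", "[-d,+d]", "[-d,0]"]
# Each (shape, size) pair contributes one 300-dimensional LSA vector.
LSA_DIMS = 300


def feature_layout():
    """Return (shape, delta, start, end) slices of the concatenated vector.

    The block ordering is an assumption made for illustration only.
    """
    layout = []
    offset = 0
    for shape in SHAPES:
        for delta in DELTAS:
            layout.append((shape, delta, offset, offset + LSA_DIMS))
            offset += LSA_DIMS
    return layout


layout = feature_layout()
total_dims = layout[-1][3]  # end offset of the last block
print(len(layout), total_dims)  # 42 12600
```

Each of the 42 blocks would hold the LSA vector for one window configuration, and the concatenation is what the SVM sees as input.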

Scores of the SL-dLSA+SVM 44-class classifier • [Figure panels: trained on L1; trained on L2]

Scores of the SL-dLSA+SVM 44-class classifier • [Figure: classification of the word “désagrément” (annoyance, unpleasantness) using SL-dLSA+SVM with L2] • [Figure: classification of the word “disgrâce” (disgrace, disfavour) using SL-dLSA+SVM with L2]

Classifying with SL-PMI measure • Semantic Orientation Pointwise Mutual Information (Turney and Littman, 2002) • The SO-PMI measure is intended to evaluate the positiveness/negativeness of a given word. • The authors adapt SO-PMI into a likeliness measure.
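As background for the slide above, here is a minimal sketch of the original SO-PMI idea: sum the PMI of a word with a set of positive seed words and subtract the sum of its PMI with a set of negative seed words. The counts are toy values invented for illustration, the function names are hypothetical, and returning 0.0 for zero counts is a simplification rather than the smoothing used in the original work.

```python
import math


def pmi(cooc, count_a, count_b, total):
    """PMI in bits from raw counts: log2(p(a, b) / (p(a) * p(b))).

    Returns 0.0 when any count is zero (a simplification for this sketch).
    """
    if cooc == 0 or count_a == 0 or count_b == 0:
        return 0.0
    return math.log2((cooc * total) / (count_a * count_b))


def so_pmi(word, positive, negative, cooc, counts, total):
    """Sum of PMI with positive seeds minus sum of PMI with negative seeds."""
    pos = sum(pmi(cooc.get((word, p), 0), counts[word], counts[p], total)
              for p in positive)
    neg = sum(pmi(cooc.get((word, n), 0), counts[word], counts[n], total)
              for n in negative)
    return pos - neg


# Toy counts (hypothetical, not taken from the paper).
total = 1000
counts = {"joli": 10, "bon": 20, "mauvais": 20}
cooc = {("joli", "bon"): 4, ("joli", "mauvais"): 1}

score = so_pmi("joli", ["bon"], ["mauvais"], cooc, counts, total)
print(score)  # about 2.0, i.e. log2(20) - log2(5): a positive orientation
```

A positive score suggests the word leans toward the positive seeds; the likeliness variant on the slides replaces the two poles with one seed set per class.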

Classifying with SL-PMI measure • SL-PMI_C (Semantic Likeliness Pointwise Mutual Information from Information Retrieval for class C) • H_δ(w1, w2) is the number of co-occurrences of words w1 and w2 in a δ-word window.
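The count H_δ(w1, w2) defined on the slide above can be sketched as follows. This assumes a symmetric window (two tokens co-occur when they are at most δ positions apart) and ignores a word co-occurring with itself; the slide does not specify either choice, and the function names are hypothetical.

```python
from collections import Counter


def cooccurrence_counts(tokens, delta):
    """Count unordered word pairs that occur within `delta` tokens of each other.

    Each pair of positions is visited once, by looking only to the right.
    """
    counts = Counter()
    for i, left in enumerate(tokens):
        for right in tokens[i + 1 : i + 1 + delta]:
            if left != right:
                counts[tuple(sorted((left, right)))] += 1
    return counts


def h(counts, w1, w2):
    """H_delta(w1, w2): symmetric lookup in the pair counts."""
    return counts[tuple(sorted((w1, w2)))]


tokens = "la joie et le plaisir de la joie".split()
counts_3 = cooccurrence_counts(tokens, 3)
counts_2 = cooccurrence_counts(tokens, 2)
print(h(counts_3, "joie", "plaisir"))  # 2
print(h(counts_2, "joie", "plaisir"))  # 0
```

Widening δ from 2 to 3 is enough here to bring both occurrences of "joie" within reach of "plaisir", which illustrates why the choice of window size changes the resulting measure.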

Scores of the SL-PMI 44-class classifier • [Figure panels: trained on L1; trained on L2]

Classifying with SL-LSA measure • As with SO-PMI, the original SO-LSA measure is intended to evaluate the positiveness/negativeness of a given word. • LSAδ(w) is the vector representing word w in an LSA space built with a δ-word window.
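A likeliness measure over LSA vectors can be sketched with cosine similarity. This is only an illustration under stated assumptions: the slide's actual SL-LSA formula is not reproduced in this transcript, so scoring a class by the mean cosine between the word vector and that class's seed vectors is an assumption, and the 3-dimensional vectors and function names are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if one is null)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def class_likeliness(word_vec, seed_vecs):
    """Mean cosine between a word vector and the seed vectors of one class."""
    return sum(cosine(word_vec, s) for s in seed_vecs) / len(seed_vecs)


def best_class(word_vec, seeds_by_class):
    """Pick the class whose seeds are, on average, closest to the word."""
    return max(seeds_by_class,
               key=lambda c: class_likeliness(word_vec, seeds_by_class[c]))


# Toy LSA vectors (hypothetical; real ones would have 300 dimensions).
seeds_by_class = {
    "Pleasure": [[1.0, 0.0, 0.0], [0.9, 0.1, 0.0]],
    "Annoyance": [[0.0, 1.0, 0.0], [0.0, 0.9, 0.1]],
}
word_vec = [0.8, 0.2, 0.0]
print(best_class(word_vec, seeds_by_class))  # Pleasure
```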

F-scores for the SL-LSA 44-class classifiers • [Figure panels: trained on L1; trained on L2]

F-scores of the classification methods • [Figure panels: using L1 as the training data; using L2 as the training data]

Improvement ratios between L2 and L1 F-scores

Perspectives • The authors did not evaluate the SVM classifier on simple LSA feature spaces. • SL-LSA family of classifiers • The classifiers had similar F-scores, but their kappa agreement was very low (0.26 to 0.34). • If the correct answers were selected from SL-LSA(L2,30) and SL-LSA(L2,2), the F-score would rise from 0.13 to 0.19. • Train an SL-dLSA+SVM classifier using L3 data.
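The point about low agreement can be illustrated with a small sketch: two classifiers with the same score but low kappa make different mistakes, so an oracle that picks whichever of them is right does better than either. The sketch assumes Cohen's kappa (the slides do not say which kappa was used), uses accuracy instead of F-score for brevity, and runs on invented labels; all function names are hypothetical.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa: (observed - expected agreement) / (1 - expected)."""
    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:  # both always output the same single label
        return 1.0
    return (observed - expected) / (1 - expected)


def accuracy(labels, gold):
    return sum(p == g for p, g in zip(labels, gold)) / len(gold)


def oracle_accuracy(labels_a, labels_b, gold):
    """Share of items where at least one of the two classifiers is right."""
    return sum(g in (a, b) for a, b, g in zip(labels_a, labels_b, gold)) / len(gold)


# Toy outputs of two classifiers and the gold labels (hypothetical).
gold = ["P", "A", "A", "A"]
out_a = ["P", "P", "A", "A"]
out_b = ["P", "A", "A", "P"]

print(accuracy(out_a, gold), accuracy(out_b, gold))  # 0.75 0.75
print(cohen_kappa(out_a, out_b))                     # 0.0
print(oracle_accuracy(out_a, out_b, gold))           # 1.0
```

In practice the oracle is not available, so the gain is an upper bound on what a combination scheme could recover.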

Perspectives • Some classes partially overlap: • Advantage and Facilitation • Comfort and Pleasure • Admiration and Praise • See page 7
