
Predicting the Semantic Orientation of Adjectives

Predicting the Semantic Orientation of Adjectives. Vasileios Hatzivassiloglou and Kathleen R. McKeown. Presented by Yash Satsangi. Aim: to validate that conjunctions put constraints on conjoined adjectives and that this information can be used to detect their semantic orientation.


Presentation Transcript


  1. Predicting the Semantic Orientation of Adjectives • Vasileios Hatzivassiloglou and Kathleen R. McKeown • Presented by Yash Satsangi

  2. Aim • To validate that conjunctions put constraints on conjoined adjectives and that this information can be used to detect their semantic orientation • Based on this information, cluster adjectives into two groups, one with positive and one with negative orientation.

  3. Constraints on Conjoined Adjectives • Validate the constraints that conjunctions place on the positive/negative semantic orientation of adjectives • Honest ‘and’ peaceful: same orientation • Talented ‘but’ irresponsible: opposite orientation • Thus conjunctions constrain the semantic orientation of the adjectives they join • Synonyms may have the same semantic orientation • Antonyms may have opposite semantic orientation (e.g. hot and cold).

  4. Approach • Extract conjunctions of adjectives from the corpus, together with morphological relations between the adjectives • A log-linear regression model predicts whether two different adjectives have the same or different orientation • A clustering algorithm separates the adjectives into two subsets, each intended to hold adjectives of the same orientation.
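The extraction step can be illustrated with a minimal sketch. It assumes the corpus is already available as (word, tag) pairs with Penn Treebank style tags ("JJ" for adjectives, "CC" for conjunctions) and only matches the flat pattern adjective, conjunction, adjective. The function name and the toy sentence are hypothetical, and the authors used a parser rather than this simple window.

```python
from collections import Counter

# Conjunctions to look for; an illustrative subset, not the authors' full list.
CONJUNCTIONS = {"and", "or", "but"}

def extract_conjoined_adjectives(tagged_tokens):
    """Count (adj1, conjunction, adj2) triples matching the pattern JJ CC JJ."""
    counts = Counter()
    windows = zip(tagged_tokens, tagged_tokens[1:], tagged_tokens[2:])
    for (w1, t1), (w2, t2), (w3, t3) in windows:
        if t1 == "JJ" and t2 == "CC" and t3 == "JJ" and w2.lower() in CONJUNCTIONS:
            a, b = w1.lower(), w3.lower()
            if a != b:  # only pairs of distinct adjectives are informative
                counts[(a, w2.lower(), b)] += 1
    return counts

# Toy input (hypothetical), already part-of-speech tagged.
sentence = [("an", "DT"), ("honest", "JJ"), ("and", "CC"), ("peaceful", "JJ"),
            ("leader", "NN"), ("is", "VBZ"), ("talented", "JJ"), ("but", "CC"),
            ("irresponsible", "JJ")]
pairs = extract_conjoined_adjectives(sentence)
```

A real extractor would also need to handle modifiers between the adjectives and record the type of modification, which this window-based sketch ignores.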

  5. Data • 21-million-word 1987 Wall Street Journal corpus annotated with part-of-speech tags • Remove adjectives occurring fewer than 20 times and those with no orientation • Manually assign an orientation to each adjective based on how the adjective is used • The labeled adjectives were validated by multiple reviewers • Final set: 1,336 adjectives (657 positive and 679 negative), with 96.97% inter-reviewer agreement.

  6. Validating the Hypothesis • Run a parser on the 21-million-word dataset to get 15,048 conjunction tokens involving 9,296 distinct adjective pairs • Each conjunction was classified by: (1) the conjunction used; (2) the type of modification; (3) the number (singular or plural) of the modified noun • Count the percentage of conjunctions in each category that join adjectives of the same or different orientation
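The counting step can be sketched as follows, assuming each conjunction token is reduced to an (adjective, conjunction, adjective) triple and that the manual labels are available as a dictionary. Only the conjunction category is tallied here; the data and names are hypothetical toy values, not figures from the paper.

```python
from collections import defaultdict

def same_orientation_rates(conjunction_tokens, orientation):
    """Fraction of tokens per conjunction whose two adjectives share a label.

    conjunction_tokens: iterable of (adj1, conjunction, adj2) triples.
    orientation: dict mapping adjective -> "+" or "-".
    Tokens containing an unlabeled adjective are skipped.
    """
    totals = defaultdict(int)
    same = defaultdict(int)
    for a, conj, b in conjunction_tokens:
        if a not in orientation or b not in orientation:
            continue
        totals[conj] += 1
        if orientation[a] == orientation[b]:
            same[conj] += 1
    return {conj: same[conj] / totals[conj] for conj in totals}

# Toy labels and tokens (hypothetical).
labels = {"honest": "+", "peaceful": "+", "corrupt": "-", "brutal": "-",
          "talented": "+", "irresponsible": "-"}
tokens = [("honest", "and", "peaceful"), ("corrupt", "and", "brutal"),
          ("talented", "but", "irresponsible"), ("honest", "and", "unlabeled")]
rates = same_orientation_rates(tokens, labels)
```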

  7. Validating Hypothesis

  8. Validating the Hypothesis • In almost all cases the p-values are low, so the results are statistically significant • There are only small differences in the behavior of conjunctions across categories • ‘and’ usually joins adjectives of the same orientation • ‘but’ behaves in the opposite way and usually joins adjectives of different orientation
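One way to obtain such a p-value is an exact two-sided binomial test of the null hypothesis that same-orientation and different-orientation pairs are equally likely (p = 0.5). The transcript does not say which test the authors used, so the choice of test and the function name below are assumptions made for illustration.

```python
from math import comb

def binomial_two_sided_p(successes, n):
    """Exact two-sided p-value for H0: p = 0.5.

    Under p = 0.5 the distribution is symmetric, so the two-sided value is
    twice the tail probability at or beyond the more extreme count.
    """
    k = max(successes, n - successes)
    tail = sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n
    return min(1.0, 2 * tail)

# 10 same-orientation pairs out of 10 would be strong evidence against p = 0.5.
p_extreme = binomial_two_sided_p(10, 10)
# An even split gives no evidence at all.
p_even = binomial_two_sided_p(5, 10)
```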

  9. Baseline Method to Predict Links • Simple baseline: labeling every link as same-orientation gives 77.84% accuracy • Adjectives conjoined by ‘but’ are mostly of opposite orientation • Morphological relationships (e.g. adequate-inadequate) carry information as well
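The three baseline rules can be combined in a small rule-based predictor. This is a sketch under assumptions: the prefix list is illustrative, the function names are hypothetical, and the order in which the rules are applied is a design choice made here, not something the slides specify.

```python
# Illustrative negating prefixes; an assumption, not the authors' list.
NEGATING_PREFIXES = ("un", "in", "im", "ir", "il", "dis", "non")

def morphologically_related(a, b):
    """True if one adjective equals the other plus a negating prefix."""
    for base, derived in ((a, b), (b, a)):
        if any(derived == prefix + base for prefix in NEGATING_PREFIXES):
            return True
    return False

def predict_link(a, b, conjunction=None):
    """Predict "same" or "different" orientation for a pair of adjectives."""
    if morphologically_related(a, b):
        return "different"   # e.g. adequate / inadequate
    if conjunction == "but":
        return "different"   # 'but' mostly joins opposite orientations
    return "same"            # default: the majority class

examples = [
    predict_link("adequate", "inadequate"),
    predict_link("honest", "peaceful", "and"),
    predict_link("talented", "irresponsible", "but"),
]
```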

  10. Better Idea: Use a Regression Model • Train a log-linear regression model • x is the vector of observed counts of the adjective pair in the various conjunction categories • To avoid overfitting, they used subsets of the data • A process of iterative stepwise refinement builds up the final model
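The model's formula is not preserved in the transcript. A minimal sketch, assuming the standard logistic form in which a weighted sum of the counts x is passed through the logistic function, looks like this. The weights, the bias term, and the function names are hypothetical; real weights would come from fitting the model to labeled pairs.

```python
import math

def same_orientation_probability(weights, counts, bias=0.0):
    """Logistic response for a linear predictor eta = bias + w . x."""
    eta = bias + sum(w * x for w, x in zip(weights, counts))
    # Numerically stable logistic function for large positive or negative eta.
    if eta >= 0:
        return 1.0 / (1.0 + math.exp(-eta))
    e = math.exp(eta)
    return e / (1.0 + e)

def dissimilarity(weights, counts, bias=0.0):
    """Dissimilarity in [0, 1]: one minus the same-orientation probability."""
    return 1.0 - same_orientation_probability(weights, counts, bias)

# Hypothetical weights: first count is 'and' conjunctions, second is 'but'.
toy_weights = [1.0, -2.0]
p_same = same_orientation_probability(toy_weights, [3, 0])
```

Taking the dissimilarity as one minus the predicted probability is also an assumption, chosen so that the score falls between 0 and 1 as the clustering step expects.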

  11. Results of Prediction • The log-linear regression model performs slightly better than the baseline • Its predictions are mainly used to group adjectives of the same orientation together

  12. Grouping Adjectives into the Same Pack • The log-linear model generates a dissimilarity score between 0 and 1 for each pair of adjectives • Same-orientation and different-orientation links between adjectives thus form a graph • An iterative optimization procedure is used to partition the graph into clusters • Minimize : • Hierarchical Clustering
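The objective being minimized is missing from the transcript. The sketch below assumes one plausible form: for each cluster, the sum of pairwise dissimilarities divided by the cluster size, added over both clusters. The local search moves one adjective at a time and keeps any move that lowers the objective. Both the objective and the search strategy are assumptions for illustration, and unobserved pairs are given a neutral score of 0.5.

```python
import itertools

def objective(clusters, d):
    """Sum over clusters of (pairwise dissimilarity total) / (cluster size)."""
    total = 0.0
    for cluster in clusters:
        if cluster:
            pairs = itertools.combinations(sorted(cluster), 2)
            total += sum(d.get(frozenset(p), 0.5) for p in pairs) / len(cluster)
    return total

def exchange_partition(items, d):
    """Greedy two-way partition: move single items while the objective drops."""
    items = list(items)
    clusters = [set(items[0::2]), set(items[1::2])]  # arbitrary starting split
    best = objective(clusters, d)
    improved = True
    while improved:
        improved = False
        for x in items:
            src = 0 if x in clusters[0] else 1
            if len(clusters[src]) == 1:
                continue  # keep both clusters non-empty
            clusters[src].remove(x)
            clusters[1 - src].add(x)
            score = objective(clusters, d)
            if score < best - 1e-12:
                best, improved = score, True
            else:  # undo the move
                clusters[1 - src].remove(x)
                clusters[src].add(x)
    return clusters, best

# Toy dissimilarities (hypothetical): low within a group, high across groups.
words = ["honest", "peaceful", "corrupt", "brutal"]
d = {frozenset(p): 0.9 for p in itertools.combinations(words, 2)}
d[frozenset(("honest", "peaceful"))] = 0.1
d[frozenset(("corrupt", "brutal"))] = 0.1
clusters, score = exchange_partition(words, d)
```

A greedy search like this can stop in a local minimum, so in practice it would typically be rerun from several starting partitions.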

  13. Labeling Clusters • The same authors showed in 1995 that, among opposed gradable adjectives, the semantically unmarked member is usually the more frequent one • Semantic markedness exhibits a strong correlation with orientation • The unmarked member almost always has positive orientation • So the group with the higher average frequency is taken to contain the positive terms.
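The labeling rule reduces to comparing the average corpus frequency of the two groups. A minimal sketch, with a hypothetical function name and made-up frequency counts:

```python
def label_clusters(cluster_a, cluster_b, frequency):
    """Label the cluster with the higher mean frequency as positive."""
    def mean_frequency(cluster):
        return sum(frequency.get(word, 0) for word in cluster) / len(cluster)

    if mean_frequency(cluster_a) >= mean_frequency(cluster_b):
        return {"positive": cluster_a, "negative": cluster_b}
    return {"positive": cluster_b, "negative": cluster_a}

# Toy corpus frequencies (hypothetical values).
freq = {"honest": 120, "peaceful": 80, "corrupt": 40, "brutal": 30}
labeled = label_clusters({"corrupt", "brutal"}, {"honest", "peaceful"}, freq)
```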

  14. Evaluating Clustering of Adjectives • Separate the adjective set A into training and testing groups by selecting a parameter α • α determines the minimum number of links each adjective must have in the selected training and test sets • A higher α yields a subset of A in which the adjectives are more densely connected to each other.
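One way to build such a subset is to repeatedly drop adjectives that have fewer than α links to the adjectives still kept, until nothing changes. The sketch below assumes this reading of α (a minimum link count within the subset); the function name and the toy links are hypothetical.

```python
def select_subset(links, alpha):
    """Largest subset in which every adjective has at least alpha links inside it."""
    neighbors = {}
    for a, b in links:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)
    kept = set(neighbors)
    changed = True
    while changed:
        changed = False
        for adjective in list(kept):
            # Count only links to adjectives that are still in the subset.
            if len(neighbors[adjective] & kept) < alpha:
                kept.remove(adjective)
                changed = True
    return kept

# Toy links (hypothetical): a triangle a-b-c plus one weakly attached node d.
toy_links = [("a", "b"), ("b", "c"), ("a", "c"), ("c", "d")]
dense = select_subset(toy_links, alpha=2)
```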

  15. Clustering Results • The highest accuracy was obtained when the largest number of links was present • In every case, the ratio of group frequencies correctly identified the positive subgroup

  16. Classification Example

  17. Performance • To measure the performance of the algorithm, a series of simulation experiments was run • Parameter P measures how well each link is predicted independently of the others (precision) • Parameter k is the number of distinct adjectives each adjective appears in conjunction with • Generate a random graph between the nodes such that each node participates in k links and P% of all links connect nodes of the same orientation, then classify the nodes
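The graph-generation part of such a simulation can be sketched as follows. Two equally sized groups stand in for the two orientations, each node draws k distinct partners, and each partner comes from the node's own group with probability P. The function name, the group sizes, and the sampling scheme are assumptions; the resulting share of same-group links is only approximately P.

```python
import random

def random_link_graph(n_per_group, k, p_same, seed=0):
    """Random graph over two groups; each node gets at least k links."""
    if k > n_per_group - 1:
        raise ValueError("k must be smaller than the group size")
    rng = random.Random(seed)
    group = {i: (0 if i < n_per_group else 1) for i in range(2 * n_per_group)}
    links = set()
    for node in group:
        same = [m for m in group if m != node and group[m] == group[node]]
        other = [m for m in group if group[m] != group[node]]
        partners = set()
        while len(partners) < k:
            # Pick a same-group partner with probability p_same.
            pool = same if rng.random() < p_same else other
            partners.add(rng.choice(pool))
        links.update(frozenset((node, m)) for m in partners)
    return group, links

group, links = random_link_graph(n_per_group=50, k=5, p_same=0.8, seed=1)
same_fraction = sum(group[a] == group[b] for a, b in map(tuple, links)) / len(links)
```

A full experiment would then run the clustering procedure on each generated graph and record how many nodes end up in the correct group.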

  18. Results

  19. Conclusion • A good ‘and’ comprehensive method for classification of semantic orientation of adjectives. • Can be used to find antonyms without accessing any semantic information • Can be extended to nouns and verbs.

  20. Thank You!
