Need-based Product Review Mining Weiwei zauri.ww@gmail.com
Outline • Introduction: • Traditional Product Review Mining • Change to “Need-based Product Review Mining” • Research Area • Technology Related • Need Recognition • Feature(explicit & implicit) Extraction • Opinion Extraction • Scoring and Ranking • Conclusion
Traditional Product Review Mining • Product-centric (product-based): • Process: • Select a product → Review mining → Structural visualization • Papers: • Liu.B[KDD04, WWW05], Dave.K[WWW03], Turney.P[ACL02], Liu[KDD08], etc. • An example: • CRO
An example: CRO • [1]Select a product(or input a product)
An example: CRO • [2]Review Mining of the corresponding product
An example: CRO • [3]Structural visualization
Change to “Need-based Mining” • Motivation: analysis of online purchasing • “Customer seek to satisfy a particular need”[Kotler03] • Vs. traditional in-store purchasing • A clerk helps you (store purchasing) • Online chat software is used to interact with customers (online purchasing) • What are they talking about? • Helping to translate the customer’s “need” into a specific product
Change to “Need-based Mining” • Motivation • Why do this? “Why not let customers do this alone?” • They don’t know what the product attributes mean • They only have a need in mind • They need a list of recommended products that satisfy their need • How to translate the need is the problem
Need-based Mining • Need-based (or user-centric) • Focus on multiple products within a product category (not a single product) • Associate a “need” with a set of product attributes • Recommend products by sentiment analysis of those attributes
Research Framework • [Figure: the traditional product review mining framework (Customer, Product Review, Product) compared with the CSS research framework (Need, Customer, Product Review, Product)]
Research Framework • [Figure: framework with online and offline parts. Labels: a need; need recognition; product scoring; aggregation function; a ranked list of products; sentiment analysis; ontology construction; feature extraction; merge similar features; opinion extraction; <feature, opinion> pairs, including implicit feature identification; review DB]
Technology Related • Need Recognition • Feature Extraction • Opinion Extraction (sentiment analysis) • Scoring and Ranking
1. Need Recognition • Need definition: • “Feeling of want that provides a basis for behavior or action” (name) • These words are implicitly related to a set of features of a product category • Each feature has a weight • Need = <n, F, W> • Some examples: • “a camera for climbing”: <“Climbing”, {size, weight, wide-angle}, {0.3, 0.5, 0.2}> • “a sun-resistant cosmetic”: <“sun-resistant”, {whitening, price}, {0.8, 0.2}>
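The triple Need = <n, F, W> can be held in a small record type. The sketch below is illustrative only: the class name `Need`, its field names, and the helper `weight_of` are assumptions, not part of the original slides.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Need:
    """Hypothetical container for Need = <n, F, W>."""
    name: str        # n: the word or phrase naming the need
    features: tuple  # F: features associated with the need
    weights: tuple   # W: one weight per feature, in the same order

    def __post_init__(self):
        if len(self.features) != len(self.weights):
            raise ValueError("each feature needs exactly one weight")

    def weight_of(self, feature):
        """Return the weight of a feature, or 0.0 if the need does not involve it."""
        return dict(zip(self.features, self.weights)).get(feature, 0.0)


# The camera example from the slide.
climbing = Need("climbing", ("size", "weight", "wide-angle"), (0.3, 0.5, 0.2))
```

Keeping F and W as parallel tuples mirrors the slide notation; a dictionary from feature to weight would work equally well.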
Need Recognition • Degree of association (DOA) calculation: PMI, LSA, etc. • [Figure: need name linked to feature clusters 1 to 4]
Introduction to PMI • Fact object and discriminator object • Based on co-occurrence of words • $PMI(f, d) = \log \frac{P(f, d)}{P(f)\,P(d)}$ • PMI = 0: independent; PMI > 0: dependent • Estimation: “PMI-IR”[21] • Constraints: NEAR, AND, etc. [23 AltaVista]
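As a worked illustration, pointwise mutual information in its standard form, log(P(f, d) / (P(f) P(d))), can be estimated from raw counts. This is a minimal sketch: the function name `pmi` and the convention of returning negative infinity for pairs that never co-occur are assumptions made here, not choices stated in the slides.

```python
import math


def pmi(count_fd, count_f, count_d, total):
    """Estimate PMI(f, d) from counts.

    count_fd: number of units (e.g. reviews) containing both f and d
    count_f, count_d: number of units containing f, respectively d
    total: total number of units
    """
    if min(count_fd, count_f, count_d) <= 0:
        # Assumption: treat a pair that never co-occurs as having no association.
        return float("-inf")
    # P(f, d) / (P(f) * P(d)) simplifies to count_fd * total / (count_f * count_d).
    return math.log(count_fd * total / (count_f * count_d))
```

With these counts, a pair that co-occurs exactly as often as independence predicts gets a PMI of 0, and a pair that co-occurs more often gets a positive value.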
DOA Calculation • How to find and quantify the association between two objects? • Ideal condition: the full set of product reviews as the corpus
DOA Calculation • Actual condition: a reviews corpus plus a PMI-IR[21]-like algorithm
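A PMI-IR-like estimate over a local reviews corpus might look like the sketch below. It follows the hit-ratio idea of PMI-IR, hits(need NEAR feature) / hits(feature), but the function names, the whitespace tokenizer, and the 10-token window are assumptions made for illustration, not the author's algorithm.

```python
def tokenize(text):
    """Naive whitespace tokenizer (assumption: no punctuation handling)."""
    return text.lower().split()


def near(tokens, a, b, window):
    """True if words a and b occur within `window` tokens of each other."""
    pos_a = [i for i, t in enumerate(tokens) if t == a]
    pos_b = [i for i, t in enumerate(tokens) if t == b]
    return any(abs(i - j) <= window for i in pos_a for j in pos_b)


def doa(reviews, need_word, feature_word, window=10):
    """Degree of association as hits(need NEAR feature) / hits(feature)."""
    docs = [tokenize(r) for r in reviews]
    hits_feature = sum(feature_word in d for d in docs)
    if hits_feature == 0:
        return 0.0
    hits_near = sum(near(d, need_word, feature_word, window) for d in docs)
    return hits_near / hits_feature


reviews = [
    "light weight camera great for climbing",
    "the weight is fine",
    "climbing trip photos",
]
score = doa(reviews, "climbing", "weight")
```

Here one of the two reviews that mention "weight" also mentions "climbing" nearby, so the score is 0.5.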
Need Recognition • Find the feature set F and the weights W • F → {F1, F3}, W → {W1, W3} • [Figure: need name linked to feature clusters 1 to 4]
Feature Set and Weight • Feature selection condition: • Degree of association (DOA) ≥ δ (threshold) • Weight calculation for the set: • Need description: • Need = <name, F, W>
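The selection step can be sketched as follows. The threshold test (DOA ≥ δ) comes from the slide; the weight formula is not given there, so normalizing the selected DOA scores to sum to 1 is an assumption made here, chosen because the example weights earlier in the deck sum to 1. The function name `build_need` is hypothetical.

```python
def build_need(name, doa_scores, threshold):
    """Return (name, F, W) from a mapping of feature -> DOA score.

    Features with DOA >= threshold are kept. Weights are the kept scores
    normalized to sum to 1 (an assumption, not the slide's formula).
    """
    selected = {f: s for f, s in doa_scores.items() if s >= threshold}
    total = sum(selected.values())
    if total == 0:
        return name, [], []
    features = sorted(selected, key=selected.get, reverse=True)
    weights = [selected[f] / total for f in features]
    return name, features, weights


name, F, W = build_need(
    "climbing",
    {"size": 0.3, "weight": 0.5, "price": 0.05, "wide-angle": 0.2},
    threshold=0.1,
)
```

In this example "price" falls below the threshold and is dropped, leaving weight, size, and wide-angle in descending order of association.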
2. Feature Extraction (Ontology Construction) • Related work: • Supervised methods • Unsupervised methods • Disadvantages: • Similar-feature clustering problem (concept relationship discovery) • Implicit feature recognition problem
Feature Extraction • What is a feature (broader than an attribute)? • Not only the product parameters (attributes) • All aspects of the product that are commented on • “Official parameter specification” + “consumer comment aspects” • Features are infinite • Explicit features and implicit features
Feature Extraction (Explicit Feature)[17] • [Flowchart. Labels: relevant product review corpus; candidate feature set; noise filter; irrelevant product review corpus; supplementary; feature set; similar feature clustering]
Similar Feature Clustering • Related work: • [14], [15], [18]: “Reinforcement Clustering heterogeneous web objects” • First problem: • How to define similar features in advance? • Synonymous features, i.e. the same aspect of the product • Experiments: • Content only to cluster similar features • Link only to cluster similar features • Content plus link to cluster similar features
Experiments • [Figure: opinions and features]
Content-based method • Similarity calculation: • Clustering • PAM • Measurement: • Entropy, Precision, Recall, F-Measure.
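The slide does not state which similarity function was used. As one common possibility, the sketch below compares two features by the cosine similarity of the words that appear around them in review sentences; the function names and the sentence-level context window are assumptions. The resulting pairwise similarities could then be handed to a clustering method such as PAM.

```python
import math
from collections import Counter


def context_vector(feature, sentences):
    """Count the words that share a sentence with `feature`."""
    vec = Counter()
    for sentence in sentences:
        tokens = sentence.lower().split()
        if feature in tokens:
            vec.update(t for t in tokens if t != feature)
    return vec


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


sentences = [
    "the screen is bright",
    "the display is bright",
    "the battery drains fast",
]
screen = context_vector("screen", sentences)
display = context_vector("display", sentences)
battery = context_vector("battery", sentences)
```

In this toy corpus "screen" and "display" share all of their context words, so they score higher with each other than either does with "battery".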
Link-based method • Similarity calculation: • SimRank[19] • Clustering: • PAM • Measurement: • Entropy, Precision, Recall, F-Measure.
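For the link-based variant, SimRank[19] scores two nodes as similar when their neighbors are similar. The sketch below runs the basic SimRank iteration on the graph formed by <feature, opinion> pairs. Building the graph this way, the decay constant 0.8, and the fixed iteration count are assumptions made for illustration.

```python
from itertools import product


def simrank(pairs, decay=0.8, iterations=5):
    """Basic SimRank on the bipartite graph built from (feature, opinion) pairs.

    Nodes are tagged ("f", name) or ("o", name) so that a word used both as a
    feature and as an opinion stays two distinct nodes.
    """
    neighbors = {}
    for feature, opinion in pairs:
        f, o = ("f", feature), ("o", opinion)
        neighbors.setdefault(f, set()).add(o)
        neighbors.setdefault(o, set()).add(f)
    nodes = list(neighbors)
    sim = {(a, b): 1.0 if a == b else 0.0 for a in nodes for b in nodes}
    for _ in range(iterations):
        new = {}
        for a, b in product(nodes, repeat=2):
            if a == b:
                new[(a, b)] = 1.0
                continue
            # Average similarity over all neighbor pairs, scaled by the decay.
            total = sum(sim[(x, y)] for x in neighbors[a] for y in neighbors[b])
            new[(a, b)] = decay * total / (len(neighbors[a]) * len(neighbors[b]))
        sim = new
    return sim


pairs = [("screen", "bright"), ("display", "bright"), ("battery", "short")]
sim = simrank(pairs)
```

In this example "screen" and "display" are linked to the same opinion word and so receive a positive score, while "screen" and "battery" share nothing and score 0.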
Content plus link method • [Figure sequence: opinions and features]
Feature Extraction (Implicit Feature) • Confidence value: Opinion(w) → F(i) • [Figure: opinions and features]
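One way to read the confidence value Opinion(w) → F(i) is as the share of explicit <feature, opinion> pairs in which opinion word w modifies feature i. The slide does not define it, so that estimate, the function names, and the 0.5 cut-off below are all assumptions.

```python
from collections import Counter, defaultdict


def opinion_to_feature_confidence(pairs):
    """Map each opinion word to {feature: share of its explicit pairs}."""
    counts = defaultdict(Counter)
    for feature, opinion in pairs:
        counts[opinion][feature] += 1
    return {
        opinion: {f: c / sum(fc.values()) for f, c in fc.items()}
        for opinion, fc in counts.items()
    }


def infer_implicit_feature(opinion, confidence, min_confidence=0.5):
    """Pick the most likely feature for an opinion word with no explicit feature."""
    candidates = confidence.get(opinion, {})
    if not candidates:
        return None
    feature, value = max(candidates.items(), key=lambda kv: kv[1])
    return feature if value >= min_confidence else None


pairs = [
    ("price", "expensive"),
    ("price", "expensive"),
    ("lens", "expensive"),
    ("weight", "heavy"),
]
confidence = opinion_to_feature_confidence(pairs)
```

With these pairs, a sentence such as "It is too expensive" would be assigned the implicit feature "price", since "expensive" modifies "price" in two of its three explicit occurrences.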
Feature Relationship Learning • Synonymy • Hypernymy/hyponymy • Part/whole
3. Opinion Extraction (a subtask of Sentiment Analysis) • Sentiment Analysis • Subjective/objective text classification, sentiment tracking, product opinion mining, etc. • Opinion Extraction • Context-based opinion polarity identification
What is an opinion? • Opinion • Words or phrases that express a semantic orientation (positive, negative, or neutral) • Context-independent opinions (“good”, “bad”, etc.) • Context-dependent opinions (“big”, “small”, etc.) • Identifying the semantic orientation of an opinion • Context-independent opinions • Context-dependent opinions
Related Works • Context independent opinion • WordNet-based method [1, 2, 5] • Seed list, Incremental • PMI-SO method [Turney 24] • Seed list(“excellent”, “awful”, etc) • Context dependent opinion • Syntactic rules(conjunction, disjunction, etc) • [Ding 20] • Semantic Clustering based • [Liu 5], [Yang “Study of Structurizing Chinese Product Review”]
Problems • Find the context of the opinion word • 1. Word level • E.g. “good”, “bad”, etc. (context-independent opinions) • 2. <feature, opinion> pair level • E.g. “The camera is too heavy”: <camera, heavy>, negative • 3. Sentence level • E.g. “The camera is very shiny but I don’t like it.” • Almost no existing research considers this problem • Splitting at “but” gives <camera, shiny>, positive (the sentence is actually negative)
Future work • Try to tackle these problems, especially the third (sentence level).
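The first two levels can be illustrated with a lookup that prefers a <feature, opinion> pair entry and falls back to a word-level entry. The lexicon contents below are hypothetical, apart from <camera, heavy>, which is the slide's own example. The sentence-level problem is deliberately not handled.

```python
# Context-independent opinions: polarity depends on the word alone.
WORD_POLARITY = {"good": 1, "bad": -1}

# Context-dependent opinions: polarity depends on the feature being described.
# Hypothetical entries, except ("camera", "heavy") from the slide.
PAIR_POLARITY = {
    ("camera", "heavy"): -1,
    ("battery life", "long"): 1,
    ("startup time", "long"): -1,
}


def polarity(feature, opinion):
    """Return 1 (positive), -1 (negative), or 0 (unknown/neutral)."""
    if (feature, opinion) in PAIR_POLARITY:
        return PAIR_POLARITY[(feature, opinion)]
    return WORD_POLARITY.get(opinion, 0)
```

The same word "long" is positive for battery life and negative for startup time, which is why a word-level lexicon alone is not enough.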
4. Product Scoring and Ranking • Related work: • Product recommendation based on reviews • [9], [12], etc. • Problems: • Only one feature is considered at a time [12 Red Opal] • A need always involves several features • All reviews are treated as equal [all] • Different reviews express different needs • Only numerical scores are considered (always total scores) [3, 4, 12] • A review may be negative on feature fa and positive on feature fb, yet the reviewer gives an overall score of 3 stars
Product Scoring and Ranking • Need-based product recommendation • Focus on multiple features at a time • Weight each review by how well it satisfies the given need • Topic-based opinion extraction • Need = <n, F, W> • n: a word or phrase revealing the consumer’s need • F: feature set • W: weight of each feature
Product Scoring and Ranking • Product Scoring • Product Ranking • Product scores, NA(need association), etc.
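The scoring formula itself is not shown on the slide. As one plausible reading, a product's score for a need could be the weight-sum of its average opinion polarity on each feature in F. The sketch below implements that assumed aggregation; the function names and the sample data are hypothetical, and weighting individual reviews by need association is left out.

```python
def product_score(need_weights, feature_polarities):
    """Weighted sum of mean polarity per feature.

    need_weights: {feature: weight}, the F and W of a need
    feature_polarities: {feature: [polarity, ...]} extracted from reviews
    Features with no extracted opinions contribute nothing.
    """
    score = 0.0
    for feature, weight in need_weights.items():
        values = feature_polarities.get(feature, [])
        if values:
            score += weight * sum(values) / len(values)
    return score


def rank_products(need_weights, products):
    """Return (product, score) pairs sorted from best to worst."""
    scores = {
        name: product_score(need_weights, polarities)
        for name, polarities in products.items()
    }
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


need = {"size": 0.3, "weight": 0.5, "wide-angle": 0.2}
products = {
    "camera A": {"weight": [1, 1], "size": [-1]},
    "camera B": {"weight": [-1], "size": [1], "wide-angle": [1]},
}
ranking = rank_products(need, products)
```

Because "weight" carries the largest weight for this need, camera A ranks above camera B even though B is praised on more features.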
Reference
[1] Liu.B. Opinion Observer: Analyzing and Comparing Opinions on the Web. WWW05
[2] Liu.B. Mining and Summarizing Customer Reviews. KDD04
[3] Turney.P. Thumbs Up or Thumbs Down?: Semantic Orientation Applied to Unsupervised Classification of Reviews. ACL02
[4] Dave.K. Mining the Peanut Gallery: Opinion Extraction and Semantic Classification of Product Reviews. WWW03
[5] Liu. CRO: a system for online review structurization. KDD08
[6] Kotler.P. Marketing Management. Prentice Hall 2003
[7] Orman.L. Consumer Support System. Communications of ACM 2007
[8] Lee.T. Need-based Analysis of Online Customer Reviews. ICEC07
[9] Lee.T. Needs-Centric Searching and Ranking Based on Customer Reviews. ICEC08
[10] Lee.T. Use-centric mining of customer reviews. WITS04
[11] Lee.T. Constraint-based Ontology Induction from Online Customer Reviews. Group Decision and Negotiation
[12] Scaffidi.C. Red Opal: Product-Feature Scoring from Reviews. ACM-EC 07
[13] Scaffidi.C. Application of a Probability-based Algorithm to Extracting of Product Features from Online Reviews. CMU Technical Report 06
[14] H.J Zeng. A unified framework for clustering heterogeneous web objects. ICWISE 02
[15] Q.Su. Hidden Sentiment Association in Chinese Web Opinion Mining. WWW 08
[16] X.Y Du. A Survey on Ontology Learning Research. Journal of Software 06
[17] W.Wei. Extracting Feature and Opinion Words Effectively from Chinese Product Review. FSKD 08
[18] J.D Wang. ReCoM: Reinforcement Clustering of Multi-Type Interrelated Data Objects. SIGIR 03
[19] G.J. SimRank: A measure of Structural-Context Similarity. SIGKDD 02
[20] X.W Ding. A Holistic Lexicon-Based Approach to Opinion Mining. WSDM 08
[21] P.Turney. Mining the Web for Synonyms: PMI-IR Versus LSA on TOEFL. ECML 01
[22] Manning.C. Foundations of Statistical Natural Language Processing. MIT Press 1999
[23] AltaVista: AltaVista Advanced Search Cheat Sheet. Alta Vista Company 01
[24] A.Maria. Extracting Product Features and Opinions from Reviews. EMNLP 05
End. Any questions?