An Efficient Concept-Based Mining Model for Enhancing Text Clustering
Presenter: JHOU, YU-LIANG
Authors: Shady Shehata, Fakhri Karray, Mohamed S. Kamel, Fellow, IEEE (2012)
Outline
• Motivation
• Objectives
• Methodology
• Evaluation
• Conclusions
• Comments
Motivation
• In text mining, term frequency is computed to measure the importance of a term in a document.
• However, two terms can have the same frequency in a document while one of them contributes more to the meaning of its sentences than the other.
Objectives
• Use a concept-based mining model for text clustering in order to improve clustering quality.
Methodology: Concept-Based Mining Model
Example: a concept c appears in two sentences of document d, the first and the second.
• c appears five times in the verb argument structures of the first sentence s1.
• c appears three times in the verb argument structures of the second sentence s2.
Answer: the ctf value of c in d is (5 + 3) / 2 = 4, the average of the per-sentence counts over the two sentences that contain c.
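The averaging in the example above can be sketched in Python. This is a minimal sketch, assuming the document-level ctf is the mean of the per-sentence counts taken over the sentences that contain the concept, as the worked example suggests. The function name `document_ctf` and the list-of-counts input are hypothetical and are not taken from the slides.

```python
def document_ctf(per_sentence_counts):
    """Average the per-sentence ctf counts of one concept in one document.

    per_sentence_counts: how many times the concept occurs in the verb
    argument structures of each sentence (0 if the sentence lacks it).
    Assumption: only sentences that contain the concept enter the average.
    """
    present = [n for n in per_sentence_counts if n > 0]
    if not present:
        return 0.0  # the concept does not occur in the document
    return sum(present) / len(present)


# Slide example: 5 occurrences in s1 and 3 in s2.
print(document_ctf([5, 3]))  # 4.0
```

Sentences in which the concept does not occur are skipped, so `document_ctf([5, 0, 3])` also gives 4.0 under this assumption.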
Methodology: Example of Conceptual Term Frequency
The example sentence, labeled once for each of its three verbs:
• [ARG0 Texas and Australia researchers] have [TARGET created] [ARG1 industry-ready sheets of materials made from nanotubes that could lead to the development of artificial muscles].
• [ARG1 materials] [TARGET made] [ARG2 from nanotubes that could lead to the development of artificial muscles].
• [ARG1 nanotubes] [R-ARG1 that] [ARGM-MOD could] [TARGET lead] [ARG2 to the development of artificial muscles].
Methodology: Example of Conceptual Term Frequency
1. First verb argument structure, for the verb created:
• [ARG0 Texas and Australia researchers]
• [TARGET created]
• [ARG1 industry-ready sheets of materials made from nanotubes that could lead to the development of artificial muscles]
2. Second verb argument structure, for the verb made:
• [ARG1 materials]
• [TARGET made]
• [ARG2 from nanotubes that could lead to the development of artificial muscles]
3. Third verb argument structure, for the verb lead:
• [ARG1 nanotubes]
• [R-ARG1 that]
• [ARGM-MOD could]
• [TARGET lead]
• [ARG2 to the development of artificial muscles]
Methodology: Example of Conceptual Term Frequency
1. Concepts in the first verb argument structure, for the verb created:
• Texas Australia researchers
• created
• industry-ready sheets materials nanotubes lead development artificial muscles
2. Concepts in the second verb argument structure, for the verb made:
• materials
• nanotubes lead development artificial muscles
3. Concepts in the third verb argument structure, for the verb lead:
• nanotubes
• lead
• development artificial muscles
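One way to turn the concept lists above into sentence-level counts is sketched below in Python. It rests on an assumption that the slides do not state explicitly: a concept is counted once for each verb argument structure in which it occurs as a contiguous word sequence inside a single argument. The names `contains` and `sentence_ctf` are hypothetical.

```python
def contains(haystack, needle):
    """True if the word list `needle` occurs contiguously in `haystack`."""
    n = len(needle)
    return any(haystack[i:i + n] == needle
               for i in range(len(haystack) - n + 1))


def sentence_ctf(structures):
    """Count, per concept, the verb argument structures that contain it.

    structures: one list of concept strings per verb argument structure.
    Assumption: a match must fall inside a single argument of a structure.
    """
    tokenized = [[arg.split() for arg in struct] for struct in structures]
    counts = {}
    for struct in structures:
        for concept in struct:
            if concept in counts:
                continue  # already counted when first seen
            words = concept.split()
            counts[concept] = sum(
                any(contains(arg, words) for arg in args)
                for args in tokenized
            )
    return counts


# Concepts of the three verb argument structures listed on the slide.
structures = [
    ["Texas Australia researchers", "created",
     "industry-ready sheets materials nanotubes lead development "
     "artificial muscles"],
    ["materials", "nanotubes lead development artificial muscles"],
    ["nanotubes", "lead", "development artificial muscles"],
]

for concept, ctf in sentence_ctf(structures).items():
    print(ctf, concept)
```

Under this assumption, "nanotubes", "lead", and "development artificial muscles" each get a count of 3, "materials" and "nanotubes lead development artificial muscles" get 2, and the remaining concepts get 1.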
Conclusions
• The new approach enhances text clustering quality.
Comments
• Advantages: improves text clustering quality.
• Applications:
  - Concept-based mining model
  - Conceptual term frequency