This paper presents methods for processing Chinese commonsense data, including context finding, conceptual analogy, dirty words filtering, and link prediction. It explores techniques like spam filtering, machine learning, and social network link prediction, with application in question-answering systems. References provided include works on commonsense reasoning and link prediction in social networks.
Chinese Commonsense Processing Presented by Yen-Ling Kuo 2009/3/30
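Remember how to build ConceptNet 2…?
• The data we collected (sources shown: Rapport Game, Pet Game): subject, subject, relation, and (frequency, good rank, bad rank)
• Example: 聖誕節 (Christmas), 吃大餐 (have a feast), "__ 的時候,你會 __" ("When __, you will __"), (2, 1, 0)
• Processing components [figure]: Context Finding, Conceptual Analogy, Similarity, Dirty Words Filter, Swapping List, Link Prediction, Add K-Line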
Dirty Words Filter
• Idea: treat it like a spam filter (machine learning, matching/fuzzy hashing, black list)
• Attribute selection, following 近朱者赤,近墨者黑 ("near vermilion one turns red, near ink one turns black", i.e. a subject is judged by its neighbors): node degree, ratio of bad rank, # rank, # neighbors in black list, distance to black list, confidence of users
• Pipeline [figure]: Subjects + Black list + White list → Attribute Selection → Classification → Bad subjects / Good subjects
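The attribute list above can be sketched as a small feature extractor. This is only an illustration under stated assumptions: the graph is a dict mapping each subject to a set of neighboring subjects, `ranks` maps a subject to a `(good, bad)` pair, and the names `subject_features` and `blacklist_distance` are hypothetical, not from the slides. The "confidence of users" attribute and the classifier itself are left out.

```python
from collections import deque


def blacklist_distance(graph, start, blacklist):
    """Hop count from `start` to the nearest blacklisted subject (None if unreachable)."""
    if start in blacklist:
        return 0
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, dist = queue.popleft()
        for neighbor in graph.get(node, ()):
            if neighbor in seen:
                continue
            if neighbor in blacklist:
                return dist + 1
            seen.add(neighbor)
            queue.append((neighbor, dist + 1))
    return None


def subject_features(graph, ranks, subject, blacklist):
    """Build the attribute vector for one subject (assumed data layout, see lead-in)."""
    neighbors = graph.get(subject, set())
    good, bad = ranks.get(subject, (0, 0))
    total = good + bad
    return {
        "degree": len(neighbors),
        "num_ranks": total,
        "bad_ratio": bad / total if total else 0.0,
        "blacklisted_neighbors": len(neighbors & blacklist),
        "blacklist_distance": blacklist_distance(graph, subject, blacklist),
    }
```

The resulting dictionaries could then be fed to any off-the-shelf classifier, with black-list and white-list subjects serving as labeled training examples.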
Link Prediction
• Idea: social network link prediction
• Application of social network link prediction in a Question-Answering Bulletin Board (QABB):
1. Recommend potential answers based on previous communications
2. Predict future hot questions
Link Prediction Methods
• Node attribute: not always available
• Structural property: node-based topological pattern, path-based topological pattern
Link Prediction in Chinese Commonsense?
• Use both node attributes and structural properties → modeled as a supervised learning problem
• Give weights to links according to frequency and good/bad ranks.
• Node attributes: 詞類 (part of speech), relation types
• Structural properties: distance, weighted common neighbors, weighted Adamic/Adar, types of neighbors, Katz
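Two of the structural measures named above can be sketched as follows. The weighted common neighbors and weighted Adamic/Adar scores follow the general form used for weighted proximity measures (average the two link weights through each shared neighbor), but the details here are assumptions: the graph is a symmetric dict of dicts, `log(1 + strength)` is used instead of a plain logarithm so that small weights do not produce a zero or negative denominator, and `link_weight` is a purely hypothetical way to combine frequency with good/bad ranks, since the slides give no formula.

```python
import math


def link_weight(frequency, good, bad):
    """Hypothetical weighting: frequency scaled by a smoothed share of good ranks."""
    return frequency * (good + 1) / (good + bad + 2)


def strength(graph, node):
    """Sum of the weights of all links touching `node`."""
    return sum(graph[node].values())


def weighted_common_neighbors(graph, x, y):
    """Sum, over shared neighbors z, of the mean of w(x, z) and w(y, z)."""
    common = graph[x].keys() & graph[y].keys()
    return sum((graph[x][z] + graph[y][z]) / 2 for z in common)


def weighted_adamic_adar(graph, x, y):
    """Like weighted common neighbors, but discounting heavily connected neighbors."""
    common = graph[x].keys() & graph[y].keys()
    return sum(
        (graph[x][z] + graph[y][z]) / 2 / math.log(1 + strength(graph, z))
        for z in common
    )
```

In a supervised setup, scores like these would become features for each candidate pair of concepts, alongside the node attributes.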
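Context Finding
• Determining the context around a concept is useful for building applications.
• Context finding is similar to memory search → use spreading activation from a source node to get the contextual neighborhood.
• Different relations carry different weights.
• Example [figure]: asking about the source node 睡覺 (sleep), activation 1, spreads to 枕頭 (pillow), 刷牙 (brush teeth), and 浴室 (bathroom); values shown in the figure: 0.1, 0.1, 0.2, 0.6, 0.6, 0.12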
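A minimal sketch of spreading activation with relation-dependent weights is shown below. It assumes the graph is a dict mapping a node to a list of `(relation, neighbor)` pairs; the function name, the `threshold` and `max_hops` parameters, and the relation names and weights in the usage example are all illustrative assumptions rather than values taken from the slides.

```python
def spread_activation(graph, relation_weights, source, threshold=0.05, max_hops=3):
    """Propagate activation outward from `source`, decaying by relation weight.

    Each node keeps the strongest activation that reaches it; activation
    weaker than `threshold` is not propagated further.
    """
    activation = {source: 1.0}
    frontier = [source]
    for _ in range(max_hops):
        next_frontier = []
        for node in frontier:
            for relation, neighbor in graph.get(node, ()):
                energy = activation[node] * relation_weights.get(relation, 0.0)
                if energy >= threshold and energy > activation.get(neighbor, 0.0):
                    activation[neighbor] = energy
                    next_frontier.append(neighbor)
        frontier = next_frontier
    return activation


# Hypothetical toy graph; relation names and weights are made up for illustration.
toy_graph = {
    "sleep": [("UsedFor", "pillow"), ("HasSubevent", "brush teeth")],
    "brush teeth": [("AtLocation", "bathroom")],
}
toy_weights = {"UsedFor": 0.1, "HasSubevent": 0.6, "AtLocation": 0.2}
context = spread_activation(toy_graph, toy_weights, "sleep")
```

Here `context` maps every reached concept to its activation, so sorting it by value gives the contextual neighborhood of the source, with two-hop concepts scored by the product of the weights along the path.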
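Conceptual Analogy
• Employ structure-mapping methods to get a list of structurally analogous concepts given a source concept.
• Structural analogy is not just a measure of semantic distance, e.g. "wedding" and "bride" are semantically close but not structurally analogous.
• Example [figure]: 貓 (cat) and 狗 (dog) fill the same slots: 貓 有 __ / 狗 有 __ (cat has __ / dog has __), 貓 喜歡 __ / 狗 喜歡 __ (cat likes __ / dog likes __), 貓 會 __ / 狗 會 __ (cat can __ / dog can __). 貓 and 狗 are therefore conceptually analogous concepts.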
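One simple way to approximate the idea is to count how many (relation, object) slots two concepts share. The sketch below is a simplified stand-in for a full structure-mapping method, not the method used in the slides; it assumes assertions are stored as `(subject, relation, object)` triples, and the function name is hypothetical.

```python
from collections import Counter


def analogous_concepts(assertions, source, top_n=5):
    """Rank concepts by the number of (relation, object) pairs shared with `source`."""
    features = {}
    for subject, relation, obj in assertions:
        features.setdefault(subject, set()).add((relation, obj))

    source_features = features.get(source, set())
    scores = Counter()
    for concept, concept_features in features.items():
        if concept == source:
            continue
        shared = len(source_features & concept_features)
        if shared:
            scores[concept] = shared
    return scores.most_common(top_n)
```

With triples such as ("cat", "has", "tail") and ("dog", "has", "tail"), "dog" scores one point for "cat" per shared slot, whereas a merely related concept with no shared slots does not appear in the result at all.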
References
• Hugo Liu and Push Singh. Commonsense Reasoning in and over Natural Language. Lecture Notes in Computer Science, 2004.
• David Liben-Nowell and Jon Kleinberg. The Link Prediction Problem for Social Networks. Proceedings of CIKM, 2003.
• Tsuyoshi Murata and Sakiko Moriyasu. Link Prediction of Social Networks Based on Weighted Proximity Measures. International Conference on Web Intelligence, 2007.