Data Mining Reading and Sample Application Xietao Sept. 2013
Outline • Basic Info • Paper Glance
Basic Info • SIGMOD, VLDB, ICDE, PODS --- Databases • KDD --- Data Mining • SNA-KDD --- workshop on social networks (SNS) • ICML --- Machine Learning • SIGIR --- Information Retrieval
Courses & Tools • Andrew Moore • Coursera (Bio-Datamining) • OCW-MIT • Weka --- Waikato (New Zealand) • RapidMiner --- formerly YALE • IlliMine --- UIUC • AlphaMiner --- HKU • Potter’s Wheel A-B-C --- UCB
Paper Glance • ICML: • Neural Network \ PCA \ SVM \ Framework • “Data-driven Web Design” • KDD: • Algorithm \ Classifier \ SNS \ Cluster \ Singular • “Learning from Crowds in the Presence of Schools of Thought” • SNA-KDD: • Twitter \ Facebook \ Weibo \ Influence \ Rumor • “Language-independent Bayesian sentiment mining on Twitter”
Data-Driven Web Design • Conf: ICML 2012 • Authors: • Ranjitha Kumar, Stanford University • Jerry O. Talton, Intel Corporation • Salman Ahmad, MIT • Scott R. Klemmer, Stanford University
Abstract • Applying machine learning methods to web design problems • Structured prediction • Deep learning • Probabilistic program induction • Enable useful interactions for designers
Detail • Structured prediction : Rapid retargeting • Deep learning : Design-based Search • Probabilistic program induction : Operationalizing design patterns
Learning from Crowds in the Presence of Schools of Thought • Conf: KDD 2012 • Authors: • Yuandong Tian, CMU • Jun Zhu, THU
Abstract • Crowdsourcing: effective way to collect large-scale experimental data from distributed workers • Target: Identify reliable workers as well as unambiguous tasks
Detail • Gold standard: task is objective with one correct answer • Schools of thought: each task may have multiple valid answers
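To make the contrast concrete, here is a minimal sketch of the gold-standard view only: a plain majority-vote baseline, not the method from the paper. The function name `majority_vote_baseline` and the `(worker, task, answer)` triple format are assumptions made for illustration. It treats the most common answer as the single correct one, scores each task by how strongly workers agree, and scores each worker by how often they match the consensus.

```python
from collections import Counter, defaultdict


def majority_vote_baseline(labels):
    """Gold-standard style baseline (hypothetical helper, not the paper's model).

    labels: iterable of (worker, task, answer) triples.
    Returns (consensus, agreement, reliability) dictionaries.
    """
    labels = list(labels)

    # Group the submitted answers by task.
    by_task = defaultdict(list)
    for worker, task, answer in labels:
        by_task[task].append(answer)

    consensus = {}   # task -> most common answer
    agreement = {}   # task -> fraction of workers who gave that answer
    for task, answers in by_task.items():
        top_answer, top_count = Counter(answers).most_common(1)[0]
        consensus[task] = top_answer
        agreement[task] = top_count / len(answers)

    # A worker's reliability is the share of their answers matching the consensus.
    hits = defaultdict(int)
    totals = defaultdict(int)
    for worker, task, answer in labels:
        totals[worker] += 1
        hits[worker] += int(answer == consensus[task])
    reliability = {w: hits[w] / totals[w] for w in totals}

    return consensus, agreement, reliability
```

Under the schools-of-thought assumption, low agreement on a task may mean several valid answers rather than careless workers. This baseline cannot tell the two cases apart, which is the gap the slide points at.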
Language-independent Bayesian sentiment mining on Twitter • Conf: SNA-KDD 2011 • Authors: • Alex Davies, University of Cambridge • Zoubin Ghahramani, University of Cambridge
Abstract • New language-independent model for sentiment analysis of short social-network statuses • Machine learning \ Bayesian classifier
Detail • Tweets are short, so sentiment icons (emoticons) carry a lot of signal • Asymmetric Dirichlet distribution for word probability given sentiment • Iteratively update the distribution and compute the probability
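The loop described above can be sketched roughly as follows. This is an illustrative simplification, not the authors' model: it uses a symmetric Dirichlet prior (a single `alpha`) where the slide mentions an asymmetric one, and the emoticon sets, function names and parameters are all assumptions. Tweets containing an emoticon act as seed labels; the remaining tweets get soft labels that are re-estimated on each pass.

```python
import math
from collections import Counter

# Hypothetical emoticon seeds; the slide only says sentiment icons are informative.
POSITIVE_ICONS = {":)", ":-)", ":D"}
NEGATIVE_ICONS = {":(", ":-(", ":'("}
CLASSES = ("pos", "neg")


def seed_label(tokens):
    """Hard label from emoticons, or None when icons are absent or mixed."""
    has_pos = any(t in POSITIVE_ICONS for t in tokens)
    has_neg = any(t in NEGATIVE_ICONS for t in tokens)
    if has_pos == has_neg:
        return None
    return "pos" if has_pos else "neg"


def posterior(tokens, word_counts, class_counts, vocab_size, alpha):
    """P(class | tokens) for naive Bayes with Dirichlet(alpha)-smoothed word counts."""
    total_docs = sum(class_counts.values())
    log_scores = {}
    for c in CLASSES:
        total_words = sum(word_counts[c].values())
        denom = total_words + alpha * max(vocab_size, 1)
        log_p = math.log((class_counts[c] + 1.0) / (total_docs + len(CLASSES)))
        for t in tokens:
            log_p += math.log((word_counts[c][t] + alpha) / denom)
        log_scores[c] = log_p
    top = max(log_scores.values())  # subtract the max for numerical stability
    exp = {c: math.exp(s - top) for c, s in log_scores.items()}
    z = sum(exp.values())
    return {c: v / z for c, v in exp.items()}


def train(tweets, alpha=0.5, iterations=5):
    """Alternate between counting words under soft labels and re-labelling tweets."""
    docs = [t.split() for t in tweets]
    vocab_size = len({w for d in docs for w in d})
    seeds = [seed_label(d) for d in docs]
    # Start with hard labels where an emoticon is present, uniform otherwise.
    resp = [
        {c: 0.5 for c in CLASSES} if s is None
        else {c: float(c == s) for c in CLASSES}
        for s in seeds
    ]
    for _ in range(iterations):
        word_counts = {c: Counter() for c in CLASSES}
        class_counts = {c: 0.0 for c in CLASSES}
        for d, r in zip(docs, resp):
            for c in CLASSES:
                class_counts[c] += r[c]
                for w in d:
                    word_counts[c][w] += r[c]
        # Only tweets without a seed label are re-estimated.
        resp = [
            r if s is not None
            else posterior(d, word_counts, class_counts, vocab_size, alpha)
            for d, r, s in zip(docs, resp, seeds)
        ]
    return word_counts, class_counts, vocab_size, alpha


def classify(tweet, model):
    """Return class probabilities for a whitespace-tokenized tweet."""
    word_counts, class_counts, vocab_size, alpha = model
    return posterior(tweet.split(), word_counts, class_counts, vocab_size, alpha)
```

Because nothing here depends on a lexicon or a language-specific tokenizer beyond whitespace splitting, the same loop runs on tweets in any language that uses the seed emoticons.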
More • “Joint Optimization of Bid and Budget Allocation in Sponsored Search” • KDD 2012 by SJTU • “Analysis and identification of spamming behaviors in Sina Weibo microblog” • SNA-KDD 2013 by SJTU