Crowdsourcing Insights: Opinion Space Ken Goldberg, IEOR, School of Information, EECS, UC Berkeley Alec Ross, Director of Innovation, U.S. State Dept
Berkeley Center for New Media (BCNM):
• David Wong: EECS Undergraduate Student
• Tavi Nathanson: EECS Graduate Student
• Ephrat Bitton: IEOR Graduate Student
• Siamak Faridani: IEOR Graduate Student
• Elizabeth Goodman: School of Information Graduate Student
• Alex Sydell: EECS Undergraduate Student
• Meghan Laslocky: Outside Consultant on Content
• Ari Wallach: Outside Consultant on Content and Strategy
• Steve Weber: Outside Consultant on Content
• Peter Feaver: Outside Consultant on Content
U.S. State Department:
• Alec Ross: Senior Advisor for Innovation
• Katie Dowd: New Media Director
• Daniel Schaub: Director for Digital Communications
“We’re moving from an Information Age to an Opinion Age.” - Warren Sack, UCSC
Motivation
Goals of Organization
• Engage community
• Understand community
• Solicit input
• Understand the distribution of viewpoints
• Discover insightful comments
Goals of Community
• Understand relationships to other community members
• Consider a diversity of viewpoints
• Express ideas, and be heard
Motivation
Classical approaches: surveys, polls
• Drawbacks: limited samples, slow, do not increase engagement
Current approaches: online forums, comment lists
• Drawbacks: data deluge, cyberpolarization, hard to discover insights
Related Work: Visualization Clockwise, starting from top left: Morningside Analytics, MusicBox, Starry Night
Related Work: Info Filtering
• K. Goldberg et al., 2001: Eigentaste
• E. Bitton, 2009: spatial model
• Polikar, 2006: ensemble learning
Six 50-minute Learning Object Modules, preparation materials, slides for in-class lectures, discussion ideas, hands-on activities, and homework assignments.
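Dimensionality Reduction
Principal Component Analysis (PCA)
• Assumes independence and linearity
• Minimizes squared reconstruction error
• Scalable: once the projection is fitted, the position of a new user is computed in constant time, independent of the number of existing users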
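The scalability point can be illustrated with a minimal NumPy sketch. It assumes each user is a vector of k numerical ratings; the function names (`fit_pca`, `project`) and the synthetic ratings are hypothetical and not taken from the Opinion Space code. After the one-time fit, placing a new user is a single small matrix-vector product.

```python
import numpy as np

def fit_pca(ratings, n_components=2):
    """Fit PCA on an (n_users, k) ratings matrix; return (mean, components)."""
    mean = ratings.mean(axis=0)
    centered = ratings - mean
    # Rows of vt are the principal directions, sorted by explained variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return mean, vt[:n_components]

def project(user_ratings, mean, components):
    """Map one user's k ratings to map coordinates with one (d x k) product."""
    return components @ (np.asarray(user_ratings, dtype=float) - mean)

# Synthetic data (assumption): 200 users rating 5 statements on a 0..1 scale.
rng = np.random.default_rng(0)
ratings = rng.uniform(0.0, 1.0, size=(200, 5))

mean, components = fit_pca(ratings)
new_user_xy = project([0.9, 0.1, 0.5, 0.7, 0.3], mean, components)
print(new_user_xy)
```

The cost of `project` depends only on k and the map dimension, not on how many users were used for the fit.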
Canonical Correlation Analysis (CCA)
• 2-view PCA
• Assume:
  • Each data point has a latent low-dim canonical representation z
  • Observe two different representations of each data point, x and y (e.g. numerical ratings and text)
• Learn MLEs for low-rank projections A and B
• Equivalently, pick projections that maximize the correlation between the views
Graphical model for CCA: z → x, z → y
x = Az + ε_x
y = Bz + ε_y
z = A^{-1}x = B^{-1}y (noise-free case; for non-square A and B, read A^{-1} and B^{-1} as pseudo-inverses)
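The "maximize correlation between views" formulation can be sketched in NumPy as follows. This is an illustrative implementation using whitening plus an SVD, not the maximum-likelihood fit named on the slide; the matrices `a` and `b`, the noise level, and the small ridge term `reg` are assumptions chosen for the demo.

```python
import numpy as np

def inv_sqrt(sym):
    """Inverse matrix square root of a symmetric positive-definite matrix."""
    vals, vecs = np.linalg.eigh(sym)
    return vecs @ np.diag(1.0 / np.sqrt(vals)) @ vecs.T

def cca(x, y, n_components=2, reg=1e-6):
    """Return projections (wx, wy) and canonical correlations for two views."""
    xc = x - x.mean(axis=0)
    yc = y - y.mean(axis=0)
    n = x.shape[0]
    # Per-view covariances (with a small ridge for stability) and cross-covariance.
    cxx = xc.T @ xc / (n - 1) + reg * np.eye(x.shape[1])
    cyy = yc.T @ yc / (n - 1) + reg * np.eye(y.shape[1])
    cxy = xc.T @ yc / (n - 1)
    kx, ky = inv_sqrt(cxx), inv_sqrt(cyy)
    # Singular values of the whitened cross-covariance are the correlations.
    u, s, vt = np.linalg.svd(kx @ cxy @ ky)
    wx = kx @ u[:, :n_components]
    wy = ky @ vt.T[:, :n_components]
    return wx, wy, s[:n_components]

# Synthetic two-view data generated from a shared 2-D latent z (assumption).
rng = np.random.default_rng(0)
z = rng.normal(size=(500, 2))
a = np.array([[1, 0], [0, 1], [1, 1], [1, -1], [0.5, 0], [0, 0.5]], dtype=float)
b = np.array([[1, 0], [0, 1], [1, 1], [1, -1]], dtype=float)
x = z @ a.T + 0.1 * rng.normal(size=(500, 6))   # x = Az + noise
y = z @ b.T + 0.1 * rng.normal(size=(500, 4))   # y = Bz + noise

wx, wy, corrs = cca(x, y)
print(corrs)
```

Because both views are driven by the same latent z with little noise, the leading canonical correlations come out close to 1.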
Multidimensional Scaling
• Goal: rearrange objects in a low-dim space so as to reproduce the distances in the higher-dim space
• Strategy: rearrange and compare solutions, maximizing goodness of fit between the original distances δ_ij and the low-dim distances d_ij for each pair of objects i, j
• Can use any kind of similarity function
• Pros
  • Data need not be normal, relationships need not be linear
  • Tends to yield fewer factors than factor analysis (FA)
• Con: slow, not scalable
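As a rough illustration of reproducing distances δ_ij with low-dim distances d_ij, here is a sketch of classical (Torgerson) MDS in NumPy. Note that this is the closed-form eigendecomposition variant, swapped in for brevity, rather than the iterative rearrange-and-compare strategy described above. The `stress` function is one common goodness-of-fit measure and is an assumption, since the slide's own formula did not survive.

```python
import numpy as np

def pairwise_distances(points):
    """Euclidean distance matrix for an (n, dim) array of points."""
    diff = points[:, None, :] - points[None, :, :]
    return np.linalg.norm(diff, axis=-1)

def classical_mds(delta, n_components=2):
    """Embed n objects given their (n, n) distance matrix delta."""
    n = delta.shape[0]
    centering = np.eye(n) - np.ones((n, n)) / n
    # Double-centering squared distances yields a Gram (inner product) matrix.
    gram = -0.5 * centering @ (delta ** 2) @ centering
    vals, vecs = np.linalg.eigh(gram)
    order = np.argsort(vals)[::-1][:n_components]
    return vecs[:, order] * np.sqrt(np.clip(vals[order], 0.0, None))

def stress(delta, coords):
    """Normalized mismatch between original and embedded distances."""
    d = pairwise_distances(coords)
    return np.sqrt(((d - delta) ** 2).sum() / (delta ** 2).sum())

# Synthetic data (assumption): 30 points in 5-D that actually lie on a 2-D plane.
rng = np.random.default_rng(0)
high = rng.normal(size=(30, 2)) @ rng.normal(size=(2, 5))
delta = pairwise_distances(high)

coords = classical_mds(delta, n_components=2)
print(stress(delta, coords))
```

Since the synthetic points are intrinsically two-dimensional, a 2-D embedding can reproduce their distances almost exactly; for general data the stress would be larger.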
Kernel-based Nonlinear PCA
• Intuition: in general, can’t linearly separate n points in d < n dim, but can almost always do so in d ≥ n dim
• Method: compute the covariance matrix after transforming the data with Φ into a higher-dim space
• Kernel trick avoids computing Φ explicitly, which keeps the complexity manageable
• If Φ is the identity, Kernel PCA = PCA
Kernel-based Nonlinear PCA
[Figure panels: input data; KPCA output with Gaussian kernel]
• Pro: good for finding clusters with arbitrary shape
• Cons: need to choose an appropriate kernel (no unique solution); does not preserve distance relationships
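A minimal NumPy sketch of kernel PCA on the training points is given below. The two-ring data set, the `gamma` value, and the function names are assumptions chosen for illustration. The linear kernel is included because it corresponds to Φ being the identity, in which case the scores match ordinary PCA up to sign.

```python
import numpy as np

def rbf_kernel(x, gamma=1.0):
    """Gaussian kernel matrix: K[i, j] = exp(-gamma * ||x_i - x_j||^2)."""
    sq = ((x[:, None, :] - x[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-gamma * sq)

def linear_kernel(x):
    """Kernel for the identity feature map: K[i, j] = <x_i, x_j>."""
    return x @ x.T

def kernel_pca(kernel_matrix, n_components=2):
    """Scores of the training points on the top kernel principal components."""
    n = kernel_matrix.shape[0]
    centering = np.eye(n) - np.ones((n, n)) / n
    # Centering K corresponds to centering the data in feature space.
    centered = centering @ kernel_matrix @ centering
    vals, vecs = np.linalg.eigh(centered)
    order = np.argsort(vals)[::-1][:n_components]
    return vecs[:, order] * np.sqrt(np.clip(vals[order], 0.0, None))

# Synthetic data (assumption): two noisy concentric rings in 2-D.
rng = np.random.default_rng(0)
angles = rng.uniform(0.0, 2.0 * np.pi, size=100)
radii = np.where(np.arange(100) < 50, 1.0, 3.0)
rings = np.column_stack([radii * np.cos(angles), radii * np.sin(angles)])
rings += 0.05 * rng.normal(size=rings.shape)

scores_rbf = kernel_pca(rbf_kernel(rings, gamma=0.5))
print(scores_rbf.shape)
```

The result depends on the choice of kernel and `gamma`, which is the "no unique solution" caveat listed above.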
Stochastic Neighbor Embedding
• Converts Euclidean distances to conditional probabilities
• p_{j|i} = Pr(x_i would pick x_j as its neighbor | neighbors picked according to their density under a Gaussian centered at x_i)
• Compute a similar probability q_{j|i} in the lower-dim space
• Goal: minimize the mismatch between p_{j|i} and q_{j|i} (measured in SNE by the Kullback-Leibler divergence)
• Cons: tends to crowd points in center of map; difficult to optimize
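The quantities on this slide can be computed with a short NumPy sketch. It only evaluates p_{j|i}, q_{j|i}, and the KL mismatch for a given low-dim layout; it does not run the optimization. A single fixed `sigma` for all points is a simplifying assumption (SNE normally tunes a bandwidth per point), and the naive layout that keeps the first two coordinates is purely illustrative.

```python
import numpy as np

def conditional_probs(points, sigma=1.0):
    """Row i holds p_{j|i}: Gaussian neighbor probabilities centered at point i."""
    sq = ((points[:, None, :] - points[None, :, :]) ** 2).sum(axis=-1)
    logits = -sq / (2.0 * sigma ** 2)
    np.fill_diagonal(logits, -np.inf)             # a point never picks itself
    logits -= logits.max(axis=1, keepdims=True)   # numerical stability
    weights = np.exp(logits)
    return weights / weights.sum(axis=1, keepdims=True)

def sne_cost(p, q, eps=1e-12):
    """Sum over i of KL(P_i || Q_i), the mismatch that SNE minimizes."""
    mask = ~np.eye(p.shape[0], dtype=bool)
    return float(np.sum(p[mask] * np.log((p[mask] + eps) / (q[mask] + eps))))

# Synthetic data (assumption): 20 points in 5-D and a naive 2-D layout.
rng = np.random.default_rng(0)
high = rng.normal(size=(20, 5))
low = high[:, :2]

p = conditional_probs(high, sigma=1.0)
# sigma = sqrt(0.5) makes the exponent -||y_i - y_j||^2, as in the original SNE.
q = conditional_probs(low, sigma=np.sqrt(0.5))
print(sne_cost(p, q))
```

An optimizer would move the low-dim points to reduce this cost; the gradient descent step is omitted here.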
Opinion Space: Crowdsourcing Insights
Scalability: n Participants, n Viewpoints, n^2 Peer-to-Peer Reviews
Viewpoints are k-Dimensional
Dim. Reduction: 2D Map of Affinity/Similarity
Insight vs. Agreement: Nonlinear Scoring
Ken Goldberg, UC Berkeley
Alec Ross, U.S. State Dept
Opinion Space
Wisdom of Crowds: Insights are Rare
Scalable, Self-Organizing, Spatial Interface
Visualize Diversity of Viewpoints
Incorporate Position into Scoring Metrics
Ken Goldberg, UC Berkeley
collaborative robot control: … … Batch … MultiTasking … Collaborative