
Beautiful Data



Presentation Transcript


  1. Beautiful Data Lecturer: Dr. Bo Yuan E-mail: yuanb@sz.tsinghua.edu.cn

  2. Exploring Millions of Social Stereotypes

  3. How old do they look? • Do you think they look smart? • How do we perceive age, gender, and attractiveness? Data Analysis!

  4. The FaceStat Judging Interface

  5. Preprocessing the Data • Problematic data • Aggregate results from multiple people into a single description • Map multiple-choice responses to one numerical value
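
A minimal Python sketch of these two steps, not the chapter's actual pipeline; the file ratings.csv, its columns (face_id, attribute, response), and the intelligence scale are assumptions:

    import pandas as pd

    # Hypothetical mapping from a multiple-choice intelligence question to numbers.
    INTELLIGENCE_SCALE = {"very dumb": 1, "dumb": 2, "average": 3, "smart": 4, "very smart": 5}

    ratings = pd.read_csv("ratings.csv")   # assumed columns: face_id, attribute, response

    # Map multiple-choice responses to one numerical value where a scale is defined.
    is_intel = ratings["attribute"] == "intelligence"
    ratings.loc[is_intel, "response"] = ratings.loc[is_intel, "response"].map(INTELLIGENCE_SCALE)
    ratings["response"] = pd.to_numeric(ratings["response"], errors="coerce")

    # Aggregate judgments from multiple people into a single description per face.
    faces = (ratings.dropna(subset=["response"])
                    .groupby(["face_id", "attribute"])["response"]
                    .mean()
                    .unstack())            # one row per face, one column per attribute
    print(faces.head())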

  6. Exploring the Data Initial scatterplot matrix of the face data

  7. Exploring the Data Initial histogram of face age data

  8. Exploring the Data Histogram of cleaned face age data
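
A minimal sketch of producing the raw and cleaned age histograms, continuing from the hypothetical faces table above; the 0-100 filter is an assumed cleaning rule, not the chapter's exact one:

    import matplotlib.pyplot as plt

    ages = faces["age"].dropna()                     # per-face mean perceived age
    cleaned = ages[(ages >= 0) & (ages <= 100)]      # drop obviously impossible age guesses

    fig, (ax_raw, ax_clean) = plt.subplots(1, 2, figsize=(10, 4))
    ax_raw.hist(ages, bins=50)
    ax_raw.set_title("Raw face age data")
    ax_clean.hist(cleaned, bins=50)
    ax_clean.set_title("Cleaned face age data")
    plt.show()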

  9. Age, Attractiveness, and Gender Scatterplot of attractiveness versus age, colored by gender

  10. Age, Attractiveness, and Gender Smoothed scatterplots for attractiveness versus age, colored by gender

  11. Age, Attractiveness, and Gender Three iterations of plotting attractiveness versus age versus gender: (a) ages averaged within buckets per age year, (b) 95% confidence interval for each bucket, plus loess curves, and (c) larger buckets where the data is sparser.
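
A minimal sketch of the bucketing and 95% confidence intervals described in the caption, again on the hypothetical faces table (a gender column is assumed); the interval uses a normal approximation:

    import numpy as np

    def bucket_stats(df, bucket_years=1):
        # Assumes columns age, attractiveness, gender exist in the table.
        df = df.dropna(subset=["age", "attractiveness", "gender"]).copy()
        df["bucket"] = (df["age"] // bucket_years) * bucket_years
        g = df.groupby(["gender", "bucket"])["attractiveness"]
        stats = g.agg(["mean", "std", "count"]).reset_index()
        # Normal-approximation 95% confidence interval for each bucket mean.
        stats["ci95"] = 1.96 * stats["std"] / np.sqrt(stats["count"])
        return stats

    per_year = bucket_stats(faces, bucket_years=1)   # panels (a)/(b): one bucket per age year
    coarse   = bucket_stats(faces, bucket_years=5)   # panel (c): larger buckets where data is sparse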

  12. Age, Attractiveness, and Gender Pearson correlation matrix
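
Continuing with the hypothetical faces table, a Pearson correlation matrix is a one-liner in pandas (the attribute names are assumptions):

    # Pearson correlation matrix over the aggregated per-face attributes.
    corr = faces[["age", "attractiveness", "intelligence"]].corr(method="pearson")
    print(corr.round(2))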

  13. Clustering Attractiveness versus age, colored by cluster, 2000 points.

  14. Clustering Cluster centroids, tags, and exemplars

  15. Clustering Cluster centroids, tags, and exemplars
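
A minimal k-means sketch of the clustering idea, again on the hypothetical faces table; the choice of k = 10, the feature list, and the nearest-to-centroid exemplar rule are assumptions, not the chapter's exact settings:

    import numpy as np
    from sklearn.cluster import KMeans
    from sklearn.preprocessing import StandardScaler

    features = faces[["age", "attractiveness", "intelligence"]].dropna()
    X = StandardScaler().fit_transform(features)

    km = KMeans(n_clusters=10, n_init=10, random_state=0).fit(X)

    for c in range(km.n_clusters):
        members = np.where(km.labels_ == c)[0]
        # The exemplar is the member face closest to the cluster centroid.
        dists = np.linalg.norm(X[members] - km.cluster_centers_[c], axis=1)
        exemplar = features.index[members[dists.argmin()]]
        print(f"cluster {c}: centroid={np.round(km.cluster_centers_[c], 2)}, exemplar face={exemplar}")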

  16. Conclusion • Our data indicates some familiar stereotypes. • Women are considered more attractive than men. • Age has a stronger effect on attractiveness for women than for men. • There are also some potential surprises. • Babies are the most attractive. • Conservatives look more intelligent. • The point of this example is not to reach any particular conclusion. • Instead, we want to show some examples of the rich set of significant patterns contained in large, messy data sets of human judgments.

  17. Visualizing Urban Data

  18. Crimespotting Project

  19. Home Page

  20. How to Get the Crime Data?

  21. A Sample Image A sample image from CrimeWatch shows areas of theft, narcotics, robbery, and other crimes.

  22. A Sample Image The same sample image from CrimeWatch with programmatically recognized icons outlined.

  23. A Sample Image The same sample image with the reddish parts made white to show the red boxing glove icon more clearly.
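
A minimal sketch of the color-filtering step described above, turning strongly red pixels white so icon shapes stand out; the thresholds and file names are assumptions, not the project's actual values:

    import numpy as np
    from PIL import Image

    img = np.array(Image.open("crimewatch_sample.png").convert("RGB")).astype(int)
    r, g, b = img[..., 0], img[..., 1], img[..., 2]

    # A pixel counts as "reddish" if red dominates both green and blue (assumed thresholds).
    reddish = (r > 150) & (r > g + 40) & (r > b + 40)

    out = img.copy()
    out[reddish] = [255, 255, 255]   # paint the reddish parts white

    Image.fromarray(out.astype("uint8")).save("crimewatch_red_white.png")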

  24. Geolocation A map of downtown Oakland showing three reference points for triangulation purposes.
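
The project's exact geocoding is not shown here; one simple way to use three reference points is to fit an affine transform from pixel coordinates to longitude/latitude, sketched below with placeholder coordinates:

    import numpy as np

    # (pixel_x, pixel_y) -> (lon, lat) for three known reference points (placeholder values).
    pixels = np.array([[120.0, 80.0], [640.0, 95.0], [400.0, 560.0]])
    lonlat = np.array([[-122.278, 37.806], [-122.262, 37.806], [-122.269, 37.796]])

    # Solve [x, y, 1] @ A = [lon, lat] for the 3x2 affine matrix A.
    P = np.hstack([pixels, np.ones((3, 1))])
    A = np.linalg.solve(P, lonlat)

    def pixel_to_lonlat(x, y):
        """Map an image pixel to geographic coordinates under the fitted transform."""
        return np.array([x, y, 1.0]) @ A

    print(pixel_to_lonlat(300, 300))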

  25. The Spotlight Feature The type selector shows the total numbers of each report type in the selected time span

  26. Conclusion • Crime is a serious issue for any urban resident; by visualizing crime data, we can more effectively protect citizens. • The project has been a productive success, resulting in what we believe is a data service maximally useful to local residents. • City and government information is being moved onto the Internet to match the expectations of a connected, wired citizenry. • For more information about Crimespotting: • http://oakland.crimespotting.org/

  27. Beautiful Political Data

  28. Data Help Obama Win

  29. Redistricting and Partisan Bias • Redistricting • Redistricting is the process of drawing United States electoral district boundaries, often in response to population changes determined by the results of the decennial census. • Partisan Bias • Partisan bias is a measure of how much the electoral system favors the Democrats or Republicans, after accounting for their vote share.  
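
One common way to estimate partisan bias is a uniform-swing calculation: shift every district to an exactly 50/50 statewide vote and see how far the seat share deviates from 50%. A minimal sketch with made-up district vote shares:

    import numpy as np

    # Democratic two-party vote share in each district (placeholder values).
    dem_share = np.array([0.62, 0.58, 0.55, 0.47, 0.43, 0.41, 0.40, 0.38])

    swing = 0.5 - dem_share.mean()                 # uniform swing to a 50/50 statewide vote
    seat_share = ((dem_share + swing) > 0.5).mean()

    bias = seat_share - 0.5                        # > 0 favors Democrats, < 0 favors Republicans
    print(f"estimated partisan bias: {bias:+.2f}")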

  30. Redistricting and Partisan Bias Effect of redistricting on partisan bias

  31. Time Series of Estimates

  32. Age and Voting “Sure, young people voted heavily for Mr. Obama, but they voted heavily for John Kerry.” -- Mark Penn, political consultant. Was he right?

  33. Age and Voting Some graphs showing recent patterns of voting by ages

  34. Localized Partisanship in Pennsylvania Geographic partisanship in Pennsylvania

  35. Conclusion • Political data is increasingly accessible and is increasingly being plotted and shared in the media and on the web. • At the research level, articles in political science journals are starting to make use of graphical techniques for discovery and presentation of results. • We expect statistical visualization to become more important and more widespread in political analysis.

  36. Data Finds Data

  37. Data Finds Data • An example • Corruption at the Roulette Wheel • Past Posting

  38. Data Finds Data • On the way to “data finds data”: what can such a system do for us? • Guest Convenience • Customer Service

  39. Data Finds Data • What can a “data finds data” system do for us? • Improved Child Safety • Cross-compartment Exploitation

  40. Data Finds Data • What should we solve first? • All of these examples benefit from just-in-time discovery. • However, we should first solve the “enterprise discoverability” problem. • Federated search does not have the indexes necessary to enable efficient location of a record, and it requires recursive processing. • Federated search cannot support the “data finds data” mission, because it cannot deliver enterprise discoverability at scale. • Directories are necessary!
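
A minimal sketch of the directory idea: an index keyed by identifying features, so a newly arriving record immediately discovers related records instead of fanning a query out to every source; the feature strings and record names below are made up:

    from collections import defaultdict

    directory = defaultdict(set)   # identifying feature -> set of (source, record_id)

    def publish(source, record_id, features):
        """Index a record's identifying features and return related, already-known records."""
        related = set()
        for f in features:
            related |= directory[f]              # just-in-time discovery: who else shares f?
            directory[f].add((source, record_id))
        return related - {(source, record_id)}

    publish("hotel", "guest-17", ["phone:555-0100", "name:j.smith"])
    print(publish("casino", "player-42", ["phone:555-0100"]))   # -> {('hotel', 'guest-17')}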

  41. Conclusion • Determine how new observations relate to what is known. • Differentiate one organization from another. • Likely to become another building block from which the next generation of advanced analytics will benefit.

  42. Exploring Your Life in Data

  43. Exploring Your Life in Data • The web: about sharing, broadcasting, and distributing. • Personal data collection: about tracking, monitoring, and analyzing one's own habits and behaviors. • Tools: PEIR & YFD • Difference: PEIR runs in the background and automatically uploads data, while YFD requires that users actively enter data.

  44. Some Examples • DietSense • Family Dynamics • Walkability • Thanks to built-in sensors, all of these engage people in their communities with just their mobile phones.

  45. Visualization • Traces are colored based on impact and exposure values. • A different mapping scheme makes all trips on the map mono-color, using circles to encode impact and exposure. • All traces are colored white, and the model values are represented by circles that vary in size at the end of each trip. • Greater values are displayed as circles with larger area, and lesser values as circles with smaller area.
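
A minimal sketch of this area encoding: matplotlib's scatter size parameter is marker area, so passing a value proportional to the model value makes greater values proportionally larger circles; the data and scale factor are placeholders:

    import numpy as np
    import matplotlib.pyplot as plt

    # End-of-trip positions and model values (placeholder data).
    lon = np.array([0.0, 1.0, 2.0, 3.0])
    lat = np.array([0.0, 1.0, 0.0, 1.0])
    impact = np.array([2.0, 5.0, 9.0, 20.0])

    # `s` is marker area in points^2, so area scales linearly with impact.
    plt.scatter(lon, lat, s=30.0 * impact, c="white", edgecolors="gray")
    plt.gca().set_facecolor("0.15")   # dark background so the white circles stand out
    plt.show()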

  46. Visualization • We grayscaled the map tiles and inverted their colors, so map items that were originally lightly colored turned dark and vice versa. • To be more specific, the terrain was originally lightly colored, so now it is dark gray, and roads that were originally dark are now light gray. • This darkened map lets lightly colored traces stand out, and because the map is grayscale, there is less color clashing.
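
A minimal sketch of this tile treatment with Pillow, grayscaling and then inverting a tile so light terrain turns dark and dark roads turn light; the file names are placeholders:

    from PIL import Image, ImageOps

    tile = Image.open("map_tile.png").convert("RGB")   # placeholder file name
    dark = ImageOps.invert(ImageOps.grayscale(tile))   # light terrain -> dark, dark roads -> light
    dark.save("map_tile_dark.png")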

  47. Visualization • PEIR provides histograms to show distributions of impact and exposure for selected trips.

  48. PEIR Interface

  49. Design of Interface in YFD

  50. Track of Feelings and Emotions
