The Intersection of Language, Algorithms, and Design: How Word Clouds Can Improve Understanding and Engagement

Explore the use of word clouds in language and text analysis, and the challenges they pose in conveying meaningful information. Discover how semantically grouped word cloud designs can improve understanding while retaining engagement.

Presentation Transcript


  1. The Intersection of Language, Algorithms, and Design. Marti Hearst, UC Berkeley. UCSD Design@Large, October 31, 2018

  2. Measure what they can, not what they should • “Good” model: baseball stats • Do not adjust agilely to error • WMD: US News College Rankings • Not answerable, secret formula • Create their own distorted reality

  3. In the realm of language and text analysis …

  4. How often do we have the designs we want versus those our algorithms can (easily) make?

  5. In 2007 I was puzzled:

  6. Why Do They Look Like This?

  7. So I did an investigation

  8. Tag Cloud Order: Surprise! • 7 interviewees DID NOT REALIZE alphabetical ordering • “What order are tags shown in?” • hadn’t thought about it • don’t think about tag clouds that way • random order • ordered by semantic similarity • This result was also found by Wattenberg et al. 2008

  9. Main Reasons For Using: • To signal the presence of tags on the site • An inviting way to get people interacting with the site • A good way to get the gist of the site • Easy to implement

  10. New Perspective: Tag Clouds are Social! • It’s not about the “information”! • Self-reflection • Showing off topics to others, socially. • Probably a fad.

  11. Ten years later …

  12. Word Size Variations in Word Clouds are Problematic: Even with few words in the cloud, the relative values are difficult to perceive. Jonathan Schwabish, http://www.allanalytics.com/author.asp?section_id=3072

  13. What is this a summary of?

  14. Answer: Hamlet’s famous “to be or not to be” soliloquy. But you couldn’t tell. Why not?

  15. Yang et al., EuroVis 2018

  16. Feinberg on Wordles • “The commonly used trick of scaling by the square root of the word’s weight (to compensate for the fact that words have area, not just length) simply makes a Wordle look boring.” • “There’s not much evidence that [tag clouds are] all that useful for navigation or other interactive tasks. … Once I decided to build a system for viewing text rather than tags, it seemed superfluous to have the words do anything other than merely exist on the page. I decided I would design something primarily for pleasure.” • “Color means absolutely nothing in Wordle.” It is used for contrast and aesthetics. • Some of Wordle’s success is due to its “one-paste/one-click instant gratification.” • Feinberg, Ch. 3, Beautiful Visualization, 2010
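
The square-root scaling that Feinberg mentions can be sketched as follows. This is a minimal illustration, not Wordle's actual code: the function name `font_sizes` and the parameters `max_size` and `sqrt_scale` are assumptions introduced here. Making font size proportional to the square root of the weight means a word's approximate area, rather than its height, tracks its weight, which also compresses the visible range of sizes.

```python
import math

def font_sizes(weights, max_size=72.0, sqrt_scale=True):
    """Map word weights to font sizes (hypothetical helper, not Wordle's code).

    With sqrt_scale=True, size is proportional to sqrt(weight), so the
    approximate rendered area (which grows with size squared) is
    proportional to weight. With sqrt_scale=False, size is directly
    proportional to weight.
    """
    if not weights:
        return {}
    top = max(weights.values())
    if top <= 0:
        raise ValueError("at least one weight must be positive")
    sizes = {}
    for word, w in weights.items():
        ratio = max(w, 0) / top  # clamp negatives to zero, normalize to [0, 1]
        sizes[word] = max_size * (math.sqrt(ratio) if sqrt_scale else ratio)
    return sizes
```

For example, with weights of 100 and 25, linear scaling draws the second word at a quarter of the height of the first, while square-root scaling draws it at half the height.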

  17. Feinberg, Wattenberg, and Viegas 2009 surveyed 4,306 Wordle users and found: • 50% did not understand what font size indicated • 57% wrote the text they visualized • Color “often” interpreted as having meaning • Other studies find: • Varying font size detrimental to understanding statistics • Font size can guide visual search for certain tasks, but users prefer search boxes for word lookup tasks • Column layouts or bar charts are better for recognizing frequencies of values
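
As a rough illustration of the column or bar-chart alternative mentioned above, the sketch below lists the most frequent words in a sorted column with a proportional text bar. The function name `frequency_columns`, its parameters, and the simple tokenizer are assumptions made for this example. It does no stop-word removal and is not taken from any of the cited studies.

```python
import re
from collections import Counter

def frequency_columns(text, top_n=10, bar_width=30):
    """Return lines showing the top_n words as a sorted text bar chart."""
    # Very simple tokenizer: lowercase runs of ASCII letters and apostrophes.
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(words).most_common(top_n)
    if not counts:
        return []
    longest = max(len(word) for word, _ in counts)
    top = counts[0][1]  # highest count, used to normalize bar lengths
    lines = []
    for word, n in counts:
        bar = "#" * max(1, round(bar_width * n / top))
        lines.append(f"{word.ljust(longest)}  {bar} {n}")
    return lines

if __name__ == "__main__":
    for line in frequency_columns("to be or not to be", top_n=3, bar_width=10):
        print(line)
```

Because position along a common baseline encodes the count, and the count itself is printed, the reader does not have to compare font sizes to judge relative frequency.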

  18. Why Are They Used Generally? • Word clouds are easy to make. • Word clouds are visually engaging. • Word clouds are commonly used.

  19. Why This Matters: Word clouds continue to be used as evidence in scientific settings.

  20. Presented at Vis 2018. The word cloud “shows a summary of tweets”. Urban Space Explorer: A Visual Analytics System for Urban Planning, Karduni et al., IEEE CG&A 2018

  21. Presented at Vis 2018. “We see this distribution covers a variety of ethnic surnames, perhaps giving insight into how immigrants migrated after coming to Ellis Island.” Name Profiler Toolkit, Wang et al., IEEE CG&A 2018

  22. Presented at ACL 2018 for 28 seconds. “Here we find differences among the words in large letters. We find for example, learning networks and embeddings being heavily represented in ACL 2018 titles.”

  23. We wouldn’t plot numerical axes incorrectly. Why is it ok to show text in this way? Why?

  24. Why Are They Used in Science? • Word clouds are easy • Word clouds are visually engaging. • Word clouds are commonly used. • There is no alternative with the same properties. • Training in usability is generally lacking. • Also …

  25. Almost any text outcome can look ok: People are great at making up associations among words. It’s hard to conjure what isn’t there: People are really bad at noticing what is missing from text collections. www.randomlists.com

  26. Let’s return to this example:

  27. What are alternatives?

  28. These are accurate, but do not make the words as prominent. An approximate alternative …

  29. New work. Goal: retain the engaging aspect of word clouds, while imparting some useful semantic information. Hearst et al., An Evaluation of Semantically Grouped Word Cloud Designs, under review, TVCG

  30. Hypothesis: Organizing the words both semantically and visually will improve understanding while retaining engagement.
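
One way to picture "organizing the words both semantically and visually" is to group the words by category and give each group a single color. The sketch below is only an illustration under stated assumptions: the category labels are supplied by hand, the palette is arbitrary, and the names `group_words` and `PALETTE` are hypothetical. It is not the layout algorithm used in the study.

```python
from collections import defaultdict

PALETTE = ["#1b9e77", "#d95f02", "#7570b3", "#e7298a"]  # arbitrary example colors

def group_words(weights, categories):
    """Group weighted words by category and assign one color per category.

    weights:    {word: weight}
    categories: {word: category label}, supplied by hand here (an assumption);
                words without a label go to "other".
    Returns a list of (category, color, [(word, weight), ...]) with words
    sorted by descending weight inside each group.
    """
    groups = defaultdict(list)
    for word, weight in weights.items():
        groups[categories.get(word, "other")].append((word, weight))
    layout = []
    for i, category in enumerate(sorted(groups)):
        # Heaviest words first; ties broken alphabetically for stable output.
        members = sorted(groups[category], key=lambda pair: (-pair[1], pair[0]))
        layout.append((category, PALETTE[i % len(PALETTE)], members))
    return layout
```

Each returned group could then be drawn as its own column or spatial cluster, so that position and color both signal which words belong together.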

  31. Evaluating Word Clouds • Most papers are vague about this • “gist”, “summary”, “navigate”, “see trends” • Most evaluations do not assess these; instead: • Identify the largest word • Identify a given word • How to evaluate more deeply?

  32. A New Task: Given a set of words, identify the category. Example words: menu, waiter, dishes, tablecloth, bill, restaurant

  33. Compare Effects of Spacing and Color

  34. Hypothesis: standard Wordle worst; color + space best; mixed color, but with coherent color assignments, falls in between. Results were consistent with hypothesis. 88% preferred the column layout for task. Chart axis: average score (out of 5).

  35. White Space Separation • Color Mapped, Spatial Jumble • Color Mapped, Spatial Organized • All views had larger font size variation than prior study

  36. Hypothesis was that column layout would outperform Spatial Organized. Column layout scored best; S.O. significantly better than Spatial Jumbled (Wordle) but not significantly different from Column. Chart axis: average score (out of 5). • 90% preferred color column for “task” • 56% preferred color column for “visually pleasing”, with the rest split between the other two.

  37. Which of the following do you prefer?
