Alethiometer: a framework for assessing trustworthiness and content validity in social media
Eva Jaho, Efstratios Tzoannos, Aris Papadopoulos, Nikos Sarris
Motivation and challenge
• 5 Vs of Big Data
• 3 Cs of Veracity
Alethiometer framework
• Contributor
• Content
• Context
C1 Contributor
What can we find out about the source of information?
Contributor modalities
• Reputation
- Analyse comments over time to discover sentiments and opinions towards a source.
- Measured by the number of upvotes or likes.
• History
- Information about activity on different social media platforms, combined with validity data.
- Measured by the update frequency of valid posts.
• Popularity
- Information about how widely the source's activity is followed (readings, recommendations).
- Measured by the number of friends/followers and the number of responses.
• Influence
- Information about activities triggered by this source (re-posts, discussions or comments).
- Measured by the number of retweets/shares and the Klout influence score.
• Presence
- Information about the type of source (individual, organisation, officially verified account, fake identity, etc.) and its presence on multiple social media platforms.
- Measured by the number of accounts on different social media platforms.
C2 Content
Does the posted content look reliable?
Content modalities
• Reputation of linked web content
- Measured in terms of domain reputation, page rank (Google PageRank or Alexa Rank), or properties of the contributors to the content.
• Provenance
- Finding the original occurrence of the content and its whole path across sources, places and time, and measuring the reputation of these sources.
• Popularity
- Information about how many people are following this content.
- Measured by the number of followers and the number of responses.
• Influence
- Analyse whether this content is triggering discussions or other actions in the social sphere.
- Measured by the number of retweets/shares.
• Originality
- Check whether the content or parts thereof have been used before (e.g., reused text or images).
• Authenticity
- Check whether the content has been changed with respect to its original state (e.g., changed text or attached multimedia content).
• Objectivity and Diversity
- Measured by the variation of opinions found for people, content, or general entities.
C3 Context
Do the 'what', 'when' and 'where' stick together?
Context modalities
• Cross-checking
- Measured by the number of different reports or mentions about the same thing coming from independent sources.
• Coherence
- Measurement of text coherence (e.g., Coh-Metrix) and of coherence between the content and tags, attached web-links, or attached multimedia.
• Proximity
- Measurement of coherence between reference location/time and publication location/time.
Approach for rating of modality parameters
• Rate parameters on a 5-point discrete scale, from 0 to 4
- [0, a0) → 0, [a0, a1) → 1, [a1, a2) → 2, [a2, a3) → 3, [a3, ∞) → 4.
- a0: 20th percentile, a1: 40th percentile, a2: 60th percentile, a3: 80th percentile (adjust the scale so that it follows a uniform distribution).
• Weight the ratings of the parameters, uniformly or based on their significance, to derive a total score
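The rating approach can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names (`percentile_thresholds`, `rate`, `total_score`), the choice of the "inclusive" percentile method, and the normalisation of the total score by the sum of the weights are all assumptions made for illustration.

```python
from bisect import bisect_right
from statistics import quantiles


def percentile_thresholds(sample):
    """Return (a0, a1, a2, a3): the 20th, 40th, 60th and 80th percentiles.

    Assumption: percentiles are estimated from the observed sample with
    the "inclusive" method of statistics.quantiles.
    """
    return tuple(quantiles(sample, n=5, method="inclusive"))


def rate(value, thresholds):
    """Map a raw parameter value to a rating from 0 to 4.

    bisect_right counts the thresholds that are <= value, which gives
    [0, a0) -> 0, [a0, a1) -> 1, [a1, a2) -> 2, [a2, a3) -> 3, [a3, inf) -> 4.
    """
    return bisect_right(thresholds, value)


def total_score(ratings, weights=None):
    """Combine per-parameter ratings into one weighted average.

    With weights=None every parameter counts equally (uniform weighting).
    Assumption: the score is normalised by the sum of the weights, so it
    stays on the same 0-4 scale as the individual ratings.
    """
    if weights is None:
        weights = {name: 1.0 for name in ratings}
    total_weight = sum(weights[name] for name in ratings)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(ratings[name] * weights[name] for name in ratings) / total_weight


if __name__ == "__main__":
    # Hypothetical follower counts, invented for this example only.
    followers = [3, 12, 40, 55, 90, 150, 300, 800, 2500, 40000]
    a = percentile_thresholds(followers)
    print("thresholds:", a)
    print("rating for 500 followers:", rate(500, a))
    print("total:", total_score({"followers": rate(500, a), "tweets": 2}))
```

Because the thresholds are percentiles of the observed sample, each rating covers roughly one fifth of the sources, which is what makes the resulting scale approximately uniform even when the raw parameter is heavy-tailed.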
Preliminary statistical results
• Parameters studied
- Number of followers
- Number of tweets
- User account age
• Sample: ~10M tweets, 5K users
• Collection period: July-September 2013
Empirical distributions
• Heavy-tailed distributions
• Multimodal heavy-tailed distributions with three different peaks (6.7 months, 23.3 months, 4.4 years)
Correlation coefficients
• Friends - followers: 0.1222
• Friends - tweets: 0.08
• Followers - tweets: 0.0197
• Conclusion:
- all parameters are relatively independent of one another
- they need to be studied independently
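Pairwise coefficients of this kind could be computed as in the sketch below. The slides do not state which correlation measure was used, so the Pearson coefficient is an assumption here, and the function names and the toy data are hypothetical.

```python
from itertools import combinations
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


def pairwise_correlations(columns):
    """Return {(name_a, name_b): r} for every pair of named columns."""
    return {
        (a, b): pearson(columns[a], columns[b])
        for a, b in combinations(columns, 2)
    }


if __name__ == "__main__":
    # Toy per-user values, invented for illustration only.
    users = {
        "friends": [10, 200, 35, 80, 500],
        "followers": [5, 40, 900, 60, 120],
        "tweets": [300, 20, 1500, 75, 10],
    }
    for pair, r in pairwise_correlations(users).items():
        print(pair, round(r, 4))
```

A coefficient close to zero indicates only a weak linear relationship between two parameters; it does not by itself rule out a non-linear dependence.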
Summary and future work
• Summary
- Defined Alethiometer: a framework taking into account all aspects: Contributor, Content and Context
- Showed an approach for combining the ratings of all parameters
- Attested the relative independence of parameters and the need to consider a variety of measures (also previously emphasized in the literature)
• Future work
- Investigate statistical properties of other modalities
- Extract the significance of modalities
- Study correlation between content, contributor and context modalities
Thank you
Questions & Answers
• find us at http://ilab.atc.gr
• follow us @iLabATC
• e.jaho@atc.gr