Explore the impact of social media on mass communication, focusing on malicious URLs, security threats, and the behavior of attackers. Learn about potential targets, campaign impact, and the challenges faced in combating online threats. This analysis provides insights into the future of safeguarding against online attacks.
Attacking Strategies Analysis on Social Media Chun-Ming Lai Computer Science, University of California, Davis
Social Media • Exerting significant impact on mass communication
Traditional communication Authoritative
Social Media Distributed
Facebook.com/63811549237/posts/10153038271604238 2014-12-19, 03:06 am
Major Dimensions Likely offender (Attacker Behavior) • Malicious URLs • Facebook Social Media Dataset • Targets / Environments / Impact of campaigns • Attackers' digital footprints The absence of capable guardians (potential audience) Suitable Targets (Targets posts, pages)
Security Threat • Severe Threat • Phishing • Malware, drive-by-download • Medium to light Threat • Advertisement • Spamming (Fund-raising, porn, canned messages, etc.) • New type Threat • Rumors, media manipulation, sign up, vote stuffing, etc. • Fake News • Crowdturfing = CrowdSourcing + Astroturfing
Difficulty & Challenge • Heterogeneous and huge data • Text, media, transaction, etc. • Labeled Data is precious • Different Criteria • Data size and type • New Patterns of Online Service • Application Bursts, Facebook Live, Game, etc.
Suitable Targets (Targets posts, pages) Expected Contribution (3W1H) • Where? • US, Middle East, Asia, etc. • Politics, sports, entertainment, etc. • How efficient? • Audience, User experience, etc. • Search Engine Spam, phishing, social media manipulation, sign up, etc. • Who? • Fake, net army, compromised, etc. • What are these Malicious URLs for? The absence of capable guardians (potential audience) Likely offender (Attacker Behavior)
Distributed, trustworthy Distributed
Outline • Introduction • Related Work & Evaluation Tools • Suitable Targets • Potential Audience • Attackers Behavior • Future Work
Related work • Context Filter (V. Balakrishnan 2016, C. Grier 2010, G. Stringhini 2010) • Blacklists • Text structure & pattern • User profile (K. Lee, 2010) • Geography, personal info. • Created (updated) time • Profile pictures
Related Work (cont'd) • Behavior-driven signal (C. Cao 2015, G. Wang 2013) • Clicks • Likes • Shares • Network-based (B. Viswanath 2010) • Edge: friend, like similarity, etc. • Static or dynamic Margin groups • Find one, then cluster • Combine 4 categories to do so
Evaluation Tools • VirusTotal • API, 60+ security engines supported • Avira, Kaspersky, Google Safe Browsing, etc. • URLBlacklists • File based, 100+ categories, 10,000,000+ domains • Ads, porn, drug, weapon, etc.
Labeled Data • Sorted blacklists: Black.com, Black1.net, Phish.com, … • Sorted url_parsed with prefix: d.com, c.d.com, b.c.d.com, a.b.c.d, …
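One way to read the labeling slide above is as a lookup of each parsed URL host, and each of its parent domains, in a sorted blacklist. The sketch below illustrates that idea only. The function names, the sample `BLACKLIST`, and the choice of binary search are assumptions for illustration, not the authors' pipeline.

```python
from bisect import bisect_left
from urllib.parse import urlparse

# Hypothetical sample entries, lowercased and sorted for binary search.
BLACKLIST = sorted(["black.com", "black1.net", "phish.com"])


def domain_suffixes(url):
    """Return the host and its parent domains, longest first (a.b.c.d, b.c.d, c.d)."""
    # Add "//" when no scheme is present so urlparse treats the text as a host.
    parsed = urlparse(url if "://" in url else "//" + url)
    labels = (parsed.hostname or "").split(".")
    # Stop before the bare top-level label (e.g. "com").
    return [".".join(labels[i:]) for i in range(max(len(labels) - 1, 1))]


def in_sorted(sorted_items, item):
    """Binary search membership test on a sorted list."""
    i = bisect_left(sorted_items, item)
    return i < len(sorted_items) and sorted_items[i] == item


def label_url(url, blacklist=BLACKLIST):
    """Return the blacklisted domain the URL falls under, or None if there is none."""
    for candidate in domain_suffixes(url):
        if in_sorted(blacklist, candidate):
            return candidate
    return None
```

With these sample entries, `label_url("http://login.phish.com/reset")` matches the parent domain `phish.com`, while a host with no blacklisted suffix returns `None`.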
Outline • Introduction • Related Work & Evaluation Tools • Suitable Targets • Potential Audience • Attackers Behavior • Future Work
Suitable Targets Problem • For any post thread p on a social media platform, predict whether p contains at least one malicious comment, using a classifier with class label c ∈ {target, non-target}
Popularity • Attention is everything! • Avg. Time: FB/ 50 mins, sports/ 17 mins [FB / NYT] • Liking, commenting, sharing, reading, etc. • Interdisciplinary Works: economy, advertisement, communication • Output: tweet counts, FB shares / comments, total clicks, etc. • Input: content, topic, number of comments after a short time, etc. • Theory: Information Cascade, bandwagon effect, attention economy, etc. • Reference: (A. Tatar, 2011), (C. Castillo, 2010), (K. Wang, 2015)
Definition • Time Series (TS) • TS_created(post): the time an original article is posted • TS_j: a time period j following the time of the original • TS_final: the end of our observation • Accumulated Number of participants (AccN_comment) • The number of post comments between TS_(i-1) and TS_i • Discussion Atmosphere Vector (DAV)
Example • TS_created(Climate) = 2014-12-19 03:06:42 • Suppose j = 5, final = 120 • DAV(Climate) = [# of comments 03:06:42 ~ 03:11:42 (1st), # of comments 03:11:42 ~ 03:16:42 (2nd), …, # of comments 05:01:42 ~ 05:06:42 (24th)]
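The example above can be sketched in code as a fixed-width histogram of comment timestamps. This is a minimal illustration, assuming j and final are given in minutes and that each window is half-open (a comment exactly on a boundary counts toward the later window). The function and parameter names are hypothetical.

```python
from datetime import datetime, timedelta


def discussion_atmosphere_vector(created, comment_times, window_min=5, final_min=120):
    """Count comments in consecutive windows of window_min minutes after `created`."""
    n_windows = final_min // window_min
    width = timedelta(minutes=window_min)
    dav = [0] * n_windows
    for t in comment_times:
        idx = (t - created) // width  # timedelta // timedelta gives an int
        if 0 <= idx < n_windows:      # ignore comments outside the observation
            dav[idx] += 1
    return dav


# Hypothetical comment times for the post created at 2014-12-19 03:06:42.
created = datetime(2014, 12, 19, 3, 6, 42)
comments = [created + timedelta(minutes=m) for m in (1, 4, 6, 119, 121)]
dav = discussion_atmosphere_vector(created, comments)
```

With j = 5 and final = 120 the vector has 24 entries, matching the 24 windows in the example. The comment at 121 minutes falls after the end of the observation and is not counted.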
Dataset • 42,703,463 in total • 2011~2014, ten main media pages on Facebook
Several static features • Spanning time (Shelf-life) • Time (last comment) – Time (post time) • # of comments • Total # of comments on a post • users, likes, etc.
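As a rough illustration of the static features listed above, the sketch below computes shelf-life, comment count, and distinct commenter count for one post. The input layout (a list of `(user_id, timestamp)` pairs) and the feature names are assumptions made for this example.

```python
from datetime import datetime


def static_features(post_time, comments):
    """comments: list of (user_id, timestamp) pairs for a single post."""
    if not comments:
        return {"shelf_life_s": 0.0, "n_comments": 0, "n_users": 0}
    last_comment = max(ts for _, ts in comments)
    return {
        # Spanning time: Time(last comment) - Time(post time), in seconds.
        "shelf_life_s": (last_comment - post_time).total_seconds(),
        "n_comments": len(comments),
        "n_users": len({user for user, _ in comments}),
    }


# Hypothetical post with three comments from two users.
post_time = datetime(2014, 12, 19, 3, 6, 42)
features = static_features(post_time, [
    ("u1", datetime(2014, 12, 19, 3, 10, 0)),
    ("u2", datetime(2014, 12, 19, 4, 6, 42)),
    ("u1", datetime(2014, 12, 19, 3, 30, 0)),
])
```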
Results Near Real Time
Next question: which stage do attackers prefer? • Early • Lead the discussion in the beginning • User Interface • Late • Notification function • Newly arriving audience • Middle or random • The advantages of both
Discussion (1/2) 9420 malicious comments were detected, posted by 5026 accounts
Discussion (2/2)
Remarks • Suitable Targets are predicted successfully with temporal features • Attackers: Follow or not? • Defenders: Deploy resources • Temporal Analysis with different variables • Stage • Exact time after post created • Time duration between two consecutive malicious comments in the same page
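The last temporal variable above, the time between two consecutive malicious comments on the same page, could be computed along the following lines. This is only a sketch. The `(page_id, timestamp)` input layout and the function name are assumptions.

```python
from collections import defaultdict
from datetime import datetime


def consecutive_gaps(events):
    """events: iterable of (page_id, timestamp). Returns {page_id: [gap in seconds, ...]}."""
    by_page = defaultdict(list)
    for page_id, ts in events:
        by_page[page_id].append(ts)
    gaps = {}
    for page_id, stamps in by_page.items():
        stamps.sort()
        # Pair each timestamp with the next one on the same page.
        gaps[page_id] = [(b - a).total_seconds() for a, b in zip(stamps, stamps[1:])]
    return gaps


# Hypothetical malicious-comment timestamps on two pages.
gaps = consecutive_gaps([
    ("pageA", datetime(2014, 12, 19, 3, 10, 0)),
    ("pageB", datetime(2014, 12, 19, 3, 0, 0)),
    ("pageA", datetime(2014, 12, 19, 3, 7, 0)),
    ("pageA", datetime(2014, 12, 19, 3, 20, 0)),
])
```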
Outline • Introduction • Related Work & Evaluation Tools • Suitable Targets • Potential Audience • Attackers Behavior • Future Work
Why study Effectiveness • Communication is trying to influence others. • Qualitative and quantitative analysis for each mURL. • Risk Assessment and control
Intuitive thinking • How many people have seen/clicked the message? (Directly) • Hard to get the entire data because of the recommendation system • Communication • User intention to rejoin • Shelf-life period
Estimate Audience • Action within Page G • action: comment, like, angry, reaction, etc. T0 (attack) T0 - T0 +
Indirect influence: final comments • Predicting final comments/visits using a post's early-stage reactions • Distribution matrix Dij (j participants within i minutes) • Prediction Matrix Mij
Example • 4 posts with final comment counts: • A (100), B (101), C (102), D (2) • D56 = {A, B, C} • Input: a post E that got 6 comments within the first 5 minutes • Probably > 100 (lower bound) • ~90% accuracy
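The example above can be sketched as a lookup table keyed by (i minutes, j early comments). In this illustration the prediction for a new post is the smallest final count seen in the matching cell, which is one simple reading of "lower bound". The slides do not say how Mij is actually derived, so that rule, the input layout, and the function names are all assumptions.

```python
from collections import defaultdict


def build_distribution(history):
    """history: iterable of (early_counts, final_count), where early_counts
    maps minutes i to the number of comments j seen within the first i minutes."""
    cells = defaultdict(list)
    for early_counts, final_count in history:
        for i, j in early_counts.items():
            cells[(i, j)].append(final_count)
    return cells


def predict_lower_bound(cells, i, j):
    """Smallest final count among past posts with j comments within i minutes."""
    finals = cells.get((i, j))
    return min(finals) if finals else None


# Posts A, B, C each had 6 comments in the first 5 minutes; post D is assumed to have had 1.
history = [({5: 6}, 100), ({5: 6}, 101), ({5: 6}, 102), ({5: 1}, 2)]
cells = build_distribution(history)
bound_for_e = predict_lower_bound(cells, 5, 6)  # post E: 6 comments in 5 minutes
```

Here the cell for (5, 6) holds the final counts of A, B, and C, so the bound for E is taken from those three posts. A cell with no history returns `None` rather than a guess.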
Some future work • More accurate prediction • > 100 vs. 100~200 • Pick “popular” from Non-Target • Some pages have lots of low-popularity posts Target posts Non-Target posts
Remarks • Direct Estimation • Within Twindow, hundreds of audience members will be influenced • Indirect Estimation • Impact on life cycle (even popular)
Outline • Introduction • Related Work & Evaluation Tools • Suitable Targets • Potential Audience • Attackers Behavior • Future Work
Work Review Social Media Manipulation Sign up Search Engine Spamming Vote Stuffing • Network-based • Static: Margin • Dynamic: Deviation • Behavior, profile based • No profile image, or Google images • Anomaly Detection • Not just classification • Fake, compromised
Accounts' other activities • From the previous experiment, 5026 malicious accounts were identified • 40,000+ pages on Facebook (2011-2016) • >70% of accounts don't have a “like” • Like is easier 9420 malicious comments were detected, posted by 5026 accounts
Accounts' footprints • Response time to post thread • Ten comments to ten different articles • Remain online to “lead” discussion Commenting time Vector = Vote Stuffing
Normal vs. Malicious accounts • Malicious accounts tend to comment in the late stage • Legitimate accounts comment after a fixed time from the original article
Same content, multiple accounts • One message, multiple accounts (red) • One account, same message in different post threads (green)
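The two patterns on this slide can be detected by grouping comments on normalized text. The sketch below is illustrative only. The `(account, thread, text)` layout, the whitespace and case normalization, and the function name are assumptions, and exact-match grouping would miss lightly edited copies.

```python
from collections import defaultdict


def find_repeats(comments):
    """comments: iterable of (account, thread, text) tuples."""
    accounts_by_text = defaultdict(set)
    threads_by_account_text = defaultdict(set)
    for account, thread, text in comments:
        key = " ".join(text.lower().split())  # normalize case and whitespace
        accounts_by_text[key].add(account)
        threads_by_account_text[(account, key)].add(thread)
    # One message posted by multiple accounts.
    shared = {k: v for k, v in accounts_by_text.items() if len(v) > 1}
    # One account posting the same message in different threads.
    reposted = {k: v for k, v in threads_by_account_text.items() if len(v) > 1}
    return shared, reposted


# Hypothetical comments.
shared, reposted = find_repeats([
    ("acct1", "t1", "Visit  my page"),
    ("acct2", "t1", "visit my page"),
    ("acct3", "t2", "Great article"),
    ("acct3", "t3", "great article"),
])
```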
Outline • Introduction • Related Work & Evaluation Tools • Suitable Targets • Potential Audience • Attackers Behavior • Future Work