
Spam 2.0 Workshop on Digital Social Networks

George Petre – glpetre@bitdefender.com, Alexandru Cosoi – acosoi@bitdefender.com


Presentation Transcript


  1. Spam 2.0 Workshop on Digital Social Networks George Petre – glpetre@bitdefender.com Alexandru Cosoi – acosoi@bitdefender.com

  2. Social Networks A social network is a social structure made of nodes (which are generally individuals or organizations) that are tied by one or more specific types of interdependency, such as values, visions, ideas, financial exchange, friends, kinship, dislike, conflict, trade, web links, sexual relations, disease transmission (epidemiology), or airline routes. The resulting structures are often very complex. (Wikipedia)

  3. We will talk about… • Social networks – an introduction • Current context and issues debated on this subject • Review of the primary types of social network spam • Explore possibilities…

  4. Current Work • Is Britney Spears Spam? – Aaron Zinman, Judith Donath – Sociable Media Group, MIT Media Lab, CEAS 2007 • A Learning Approach to Spam Detection based on Social Networks – Ho-Yu Lam, Dit-Yan Yeung – Department of Computer Science and Engineering, Hong Kong University of Science and Technology, CEAS 2007 • Social Networks and Aggressive Behavior: Peer Support or Peer Rejection? – Robert B. Cairns, Beverley D. Cairns, Holly J. Neckerman, Scott D. Gest, Jean-Louis Gariépy, Developmental Psychology, 1988 • Several other scientific and non-scientific sources (including newspaper articles and blog posts) in this field

  5. Britney Spears • Two independent dimensions: sociability and promotion • SNS spam definition – it depends on the user’s preferences • Based on the two dimensions, the authors tried to identify some key profiles • Detection is based more on profiles and less on comments

  6. Identified Profiles (two axes: sociability and promotion) • High sociability and low promotion. Such a rating is indicative of normal, socially oriented humans. They connect and communicate with their social network on a personal level by posting pictures of themselves with their friends and results of random pop quizzes, and they publicly host a suite of personal comments posted by their friends. • High sociability and high promotion. Besides the strong marketing orientation of their actions, this type of user also engages in individual interaction with the network’s members. This is a rational approach sustained by a very powerful motivation, most often economic (e.g. small or medium companies attempting to increase awareness of their brand, MLM members, etc.). • Low sociability and high promotion. This is typical of a promotional entity using SNS as a marketing opportunity. They only broadcast uniform information to their network, while simultaneously trying to expand its membership as much as possible. Examples include Britney Spears (who does not communicate individually with members of her network), a Viagra ad and a pornographic webcam. • Low sociability and low promotion. This user might be a new member of the site, or might be a low-effort spammer who does not care about posing as something real. Without information to judge by, such profiles cannot be classified.
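The four prototypes can be read as the quadrants of a two-axis grid. The sketch below is only an illustration of that reading: it assumes that sociability and promotion have already been scored in the range 0 to 1, and the function name, the labels, and the 0.5 cut-off are hypothetical choices, not values from the cited paper.

```python
def classify_profile(sociability: float, promotion: float,
                     threshold: float = 0.5) -> str:
    """Map two scores in [0, 1] to one of the four prototype quadrants.

    The threshold is an assumed cut-off; the talk gives no numeric value.
    """
    high_sociability = sociability >= threshold
    high_promotion = promotion >= threshold

    if high_sociability and not high_promotion:
        return "social user"
    if high_sociability and high_promotion:
        return "sociable promoter"
    if high_promotion:
        return "broadcast promoter"
    # Low on both axes: not enough information to decide.
    return "unknown (new member or low-effort spammer)"
```

Profiles that score close to the threshold on either axis are exactly the grey zone the talk keeps returning to, so a real system would probably report the raw scores alongside the label.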

  7. They concluded that… • Users can (should) be assisted by an AI engine when they interact with other users • Only users can decide if “Britney Spears” is spam (for them) • Robots (automatically generated profiles) can be tracked computationally • Machine learning techniques • It is quite difficult to classify profiles as legit or dubious • Huuuge grey zone

  8. Rolex Replica (cool for teens) • Very legitimate robot • A looooooooot of friends (3000) • SEO purpose • Friendly comments • Same comment over and over again • The advertised web site has a Google PageRank of 4 (!!!!) • Spam websites usually have a PageRank of 0 VOTE

  9. Viagra ad • YouTube Viagra ad (the cheap stuff!!!!) • Hyperlink flashing in the movie • May be legit, but it may also sell fake Viagra VOTE

  10. Porn Spam (I) • Many many many keywords • YouTube policy on porn • Using social networks to increase trust and ranking • Not easy to classify -> grey zone? VOTE

  11. Porn Spam (II) • Again, many many keywords • Porn industry profiles (could be spam for some and a lot of fun for others) • If a friend of a friend is a top friend and also a porn star, is it spam for you? VOTE

  12. Porn Spam (III) • Comments advertising porn • Some consider these comments as spam • Direct spam and sometimes SEO VOTE

  13. Porn Spam (IV) • Is this SPAM? • This is NOT a movie • The destination website could contain vulnerabilities, could be phishing, advertising cheap meds, and so on. VOTE

  14. Inch++ comments • Legit profile, with a spam comment from a legit friend. • Same comments over and over again – different “legit” profiles • Copy paste this URL please! VOTE
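The "same comment, different profiles" pattern lends itself to a simple check. The following is a minimal sketch, assuming comments are available as (author, text) pairs; the function names and the threshold of three distinct authors are hypothetical.

```python
import re
from collections import defaultdict


def normalize(comment: str) -> str:
    """Lowercase the text and collapse every non-alphanumeric run to one space."""
    return re.sub(r"[^a-z0-9]+", " ", comment.lower()).strip()


def repeated_comments(comments, min_profiles: int = 3):
    """Return normalized comment texts posted by at least min_profiles authors.

    comments: iterable of (author_profile, comment_text) pairs.
    """
    authors_by_text = defaultdict(set)
    for author, text in comments:
        authors_by_text[normalize(text)].add(author)
    return {
        text: sorted(authors)
        for text, authors in authors_by_text.items()
        if len(authors) >= min_profiles
    }
```

Exact matching after normalization only catches copy-paste campaigns; comments that are deliberately varied need a fuzzier comparison.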

  15. Obfuscations • hey my frieMnd saw your profitle and thinuks you loMokhodt! she is new to mqyspwace but wants to chcat with you on ms0n mesksenger her name on there is emily21bath@hotmail.com • <br>hey my frie<font point-size="0pt">M</font>nd saw your profi<font point-size="0pt">t</font>le and thin<font point-size="0pt">u</font>ks you lo<font point-size="0pt">M</font>ok ho<font point-size="0pt">d</font>t! she is new to m<font point-size="0pt">q</font>ysp<font point-size="0pt">w</font>ace but wants to ch<font point-size="0pt">c</font>at with you on ms<font point-size="0pt">0</font>n mes<font point-size="0pt">k</font>senger her name on there is emily21bath@hotmail.com • </td> <br> hey my friend saw your profi<font point-size="0pt">T</font>le and thin<font point-size="0pt">S</font>ks you look ho<font point-size="0pt">r</font>t! she is new to mysp<font point-size="0pt">p</font>ace but wants to chat with you on ms<font point-size="0pt">Z</font>n mes<font point-size="0pt">F</font>senger her name on there is emily21bath@hotmail.com </td> VOTE
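One way to see through this kind of obfuscation is to drop the zero-size runs before looking at the text. The sketch below assumes the filler always appears as `<font point-size="0pt">…</font>`, as in the samples above; other hiding tricks would need their own patterns, and the helper names are hypothetical.

```python
import re

# Matches the invisible filler seen in the samples above.
HIDDEN_RUN = re.compile(r'<font\s+point-size="0pt">.*?</font>',
                        re.IGNORECASE | re.DOTALL)
ANY_TAG = re.compile(r"<[^>]+>")


def visible_text(html: str) -> str:
    """Approximate what a reader sees: drop hidden runs, then strip other tags."""
    text = HIDDEN_RUN.sub("", html)
    text = ANY_TAG.sub(" ", text)
    return " ".join(text.split())


def hidden_run_count(html: str) -> int:
    """Count zero-size runs; a high count is itself a suspicious signal."""
    return len(HIDDEN_RUN.findall(html))
```

The count is useful on its own: a legitimate comment has little reason to contain even one zero-size character.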

  16. Image Spam • Might not be spam, BUT when 4 consecutive comments from different legit users advertise this software… VOTE

  17. Google Redirect • Can this NOT be spam? • <A HREF=http://www.google.com.au/url?q=http://trackme.19.fo%72%75%6D%65%72%2E%63%6F%6D%2F%69%6E%64%65%78%2E%70%68%70> <FONT SIZE=5><FONT COLOR=blue>Click here to get to the website that has the myspace profile tracker </a> <br /><p> VOTE
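The real destination of such a link can be recovered by decoding the `q` parameter. This is a minimal sketch using Python's standard `urllib.parse`; the function name is hypothetical, and it assumes the target is carried in a parameter named `q`, as in the example above.

```python
from urllib.parse import parse_qs, urlparse


def redirect_target(url: str, parameter: str = "q"):
    """Return the percent-decoded redirect target, or None if it is absent."""
    query = parse_qs(urlparse(url).query)
    values = query.get(parameter)
    return values[0] if values else None


link = ("http://www.google.com.au/url?q=http://trackme.19.fo"
        "%72%75%6D%65%72%2E%63%6F%6D%2F%69%6E%64%65%78%2E%70%68%70")
target = redirect_target(link)
print(target)                     # http://trackme.19.forumer.com/index.php
print(urlparse(target).hostname)  # trackme.19.forumer.com
```

Comparing the hostname of the visible link with the hostname of the decoded target is a cheap signal: here the link appears to point at Google but lands elsewhere.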

  18. Phishing • If you want to see my picture, you must log in first…. Right on this page  VOTE

  19. Types of spam / SN (I) • 3 types of Social Networks • Social Network type A – targets mainly teenagers • Social Network type B – targets mostly teenagers, but not entirely • Social Network type C – targets any user (no age or sex differentiation) *This classification was made by randomly checking a few hundred profiles on several social networks

  20. Types of spam / SN (II)

  21. Profile Gatherers • Low-Medium promotion • Sociability = just adding new friends • Short description and too many friends • Botnet? Latent Spammer?
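The gatherer traits listed above translate into a rough heuristic. The sketch below is an assumption-heavy illustration: the `Profile` fields and all three thresholds are hypothetical and would need tuning against real data.

```python
from dataclasses import dataclass


@dataclass
class Profile:
    friend_count: int
    description: str
    personal_comments: int  # comments written to individual friends


def looks_like_gatherer(profile: Profile,
                        min_friends: int = 1000,
                        max_description_words: int = 20,
                        max_personal_comments: int = 5) -> bool:
    """Flag profiles with many friends, a short description, little interaction."""
    many_friends = profile.friend_count >= min_friends
    short_description = len(profile.description.split()) <= max_description_words
    little_interaction = profile.personal_comments <= max_personal_comments
    return many_friends and short_description and little_interaction
```

A flag from this check is a reason to look closer, not a verdict: a popular but quiet legitimate user would trip it too.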

  22. Mitigating profiles • Legit Profile • Legit comments • A lot of friends • Posting on spammy profiles • Direct legit testimonials

  23. How to create a “spammer profile”?(I) • Step I: Google search for “@a_big_free_email_provider” on myspace website … and extract the email addresses returned

  24. How to create a “spammer profile”?(II) Step II: Use your favorite free e-mail provider and import an address book format file

  25. How to create a “spammer profile”?(III) Step III: Use the “import contacts from your email account” for your free email account, enter the captcha and start spamming…

  26. Acceptance • 5 out of 10 “add me” requests are approved on IM • 7 out of 10 “add me” requests are approved in SNS • Usually comments are on an “accept all” basis

  27. Automatic Profile Categorization • A number of quantifiers can be obtained • Machine learning techniques (self-organizing) • Assist the user when approving friend requests • We propose ART, SOFM, KNN and other clustering techniques

  28. Input Features • Frequency of the invitations (in some SNS) • All features from “Is Britney Spears Spam” paper • Semantic differences or similarities between comments (concepts, hyper concepts – we propose LSA, Bayesian or CNG) • Semantic differences or similarities between profiles
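To make "similarity between comments" concrete, here is a deliberately simple stand-in: character trigram counts compared with cosine similarity. It is not LSA, the Bayesian filter, or the CNG method named above, and the function names and the choice of n = 3 are assumptions made for illustration.

```python
from collections import Counter
from math import sqrt


def ngram_profile(text: str, n: int = 3) -> Counter:
    """Count overlapping character n-grams of the lowercased, space-collapsed text."""
    cleaned = " ".join(text.lower().split())
    return Counter(cleaned[i:i + n] for i in range(len(cleaned) - n + 1))


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine of the angle between two n-gram count vectors (0.0 if either is empty)."""
    dot = sum(count * b[gram] for gram, count in a.items())
    norm_a = sqrt(sum(count * count for count in a.values()))
    norm_b = sqrt(sum(count * count for count in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def comment_similarity(first: str, second: str) -> float:
    return cosine_similarity(ngram_profile(first), ngram_profile(second))
```

Character n-grams tolerate small spelling changes, which is why they are a common first attempt against lightly varied spam comments; they carry no notion of concepts, which is what the semantic methods on the slide are meant to add.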

  29. Experimental Data • Bayesian Filter from BitDefender Parental Control Module – trained for EMAIL spam (several semantic categories – the ones you wouldn’t like your kid to see) • As output, the system returns the probability for each category – we used all these values in the clustering algorithm • Not exactly fair, since we are emphasizing only the dirty details • Many many clusters… So many that it was really hard to analyze them
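The idea of clustering profiles by their per-category probabilities can be sketched with a very small threshold-based scheme (leader clustering). This is a stand-in chosen for brevity, not ART, SOFM, or KNN, and not the algorithm used in the experiments; the radius value and the example vectors are made up.

```python
from math import dist


def leader_clustering(vectors, radius: float = 0.3):
    """Assign each vector to the first cluster whose leader lies within radius.

    A vector that is close to no existing leader starts a new cluster.
    Returns (leaders, assignments), where assignments[i] is a cluster index.
    """
    leaders = []
    assignments = []
    for vector in vectors:
        for index, leader in enumerate(leaders):
            if dist(vector, leader) <= radius:
                assignments.append(index)
                break
        else:
            leaders.append(tuple(vector))
            assignments.append(len(leaders) - 1)
    return leaders, assignments


# Hypothetical probabilities for three categories, one row per profile.
profiles = [
    (0.90, 0.05, 0.05),
    (0.85, 0.10, 0.05),
    (0.05, 0.90, 0.05),
    (0.10, 0.05, 0.85),
]
leaders, assignments = leader_clustering(profiles)
```

A small radius yields many tight clusters and a large one yields few loose clusters, which matches the trade-off behind the "many many clusters" remark.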

  30. Clusters • Sparse clusters • Condensed clusters • Automatically generated profiles • Groups with similar interests

  31. Results • We found hundreds of similar machine-generated profiles (with different numbers of friends, and posting comments on each other’s profiles) • We found more than 500 profile gatherers (a few days ago, we could easily search for profiles with a range of 300 000 – 500 000 friends. This search option is no longer available) • Mitigating profiles are the hardest to find, but we managed to analyze a few

  32. Social Networks Ranking • Cluster analysis • Number of Profile gatherers • Number of users • Number of spammy comments / randomly chosen profiles • Weighted average with the presented indicators
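The weighted average can be written down directly. The sketch below assumes each indicator has already been normalized to the range 0 to 1 (for example, gatherers per user); the indicator names and the weights are hypothetical, since the talk does not give them.

```python
def spam_score(indicators: dict, weights: dict) -> float:
    """Weighted average of normalized indicators; higher means more spam-prone."""
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    weighted_sum = sum(indicators[name] * weight
                       for name, weight in weights.items())
    return weighted_sum / total_weight


# Hypothetical values for one network.
indicators = {"gatherers": 0.2, "spammy_comments": 0.6, "clusters": 0.4}
weights = {"gatherers": 1.0, "spammy_comments": 2.0, "clusters": 1.0}
score = spam_score(indicators, weights)  # (0.2 + 1.2 + 0.4) / 4 = 0.45
```

Networks can then be ranked by sorting on this score; the ranking is only as meaningful as the normalization and the weights chosen.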

  33. Accept Invitation Assistance • This profile is interested in the following concepts • This profile is spammed • This profile has spammy posts • This user was found in the following clusters – might be a (profile gatherer, mitigating profile, marketing profile…) • Client based

  34. Conclusions • We also agree that this is a highly difficult task • In most cases, it is impossible to say for sure that a profile is spammy – it depends on the user’s preferences • SNSs are a good starting point for email spam – thousands of email addresses

  35. Conclusions (II) • …….
