Understanding the Network-Level Behavior of Spammers • Anirudh Ramachandran and Nick Feamster, cc@gatech • Best Student Paper, ACM SIGCOMM 2006 • Ye Wang (sando)
Content • Motivation • Data Collection • Data Analysis • Network-level Characteristics of Spammers • Spam from Botnets • Spam from Transient BGP Announcements • Lessons for Better Spam Mitigation • Conclusion & Discussion
Motivation • Scalability, Security, Reliability, Operability • keys to next-generation Internet services • the Internet business model depends on them • performance, more services, and more applications come after • the large amount of funding in these areas reflects this • Security is a tough problem • Attackers always win! • spam, botnets, DDoS, worms, probes, hijacking, cracking, phishing
Motivation • Spam (and Mitigation) • spam eats bandwidth, degrades email service, and adds complications • spamming methods: direct, open relays, botnets, spectrum agility • content filters (need large corpora for training) • IP blacklists (the IP-layer behavior of spammers is not well understood) • Goal of this 18-month project • characterize the network-level behavior of spammers • IP addresses, ASes, and countries of spammers • IP-layer techniques of spammers: botnets, routing • give guidelines for better mitigation
Data Collection • Spam Email Traces • a “sinkhole” domain that collects the spam corpus • Aug. 5, 2004 – Jan. 6, 2006 • 10,000,000 spam messages • network-level properties collected for each spam message • IP address of the relay • traceroute to the relay • passive “p0f” TCP fingerprint (indicates the OS) • whether the relay is listed in DNS blacklists
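The blacklist check in the list above can be sketched as follows. By DNSBL convention, the IPv4 octets are reversed and the blacklist zone is appended; if the resulting name resolves, the address is listed. The zone `dnsbl.example.org` and both function names are illustrative assumptions, not the lists or tooling the authors used.

```python
import ipaddress
import socket


def dnsbl_query_name(ip: str, zone: str) -> str:
    """Build the DNS name used to look up an IPv4 address in a DNSBL zone."""
    addr = ipaddress.IPv4Address(ip)  # raises ValueError on malformed input
    reversed_octets = ".".join(reversed(str(addr).split(".")))
    return f"{reversed_octets}.{zone}"


def is_listed(ip: str, zone: str) -> bool:
    """Return True if the query name resolves in the DNSBL zone.

    Any resolution failure (including NXDOMAIN) is treated as "not listed".
    """
    try:
        socket.gethostbyname(dnsbl_query_name(ip, zone))
        return True
    except socket.gaierror:
        return False


# Hypothetical zone name; substitute a real blacklist zone before use.
print(dnsbl_query_name("192.0.2.99", "dnsbl.example.org"))
```

Only `dnsbl_query_name` is pure; `is_listed` performs a live DNS query, so its result depends on the resolver and the zone chosen.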
Data Collection • Legitimate Email Traces • from a large email service provider • *Nick is always welcome • 700,000 legitimate emails • Botnet Command and Control Data • a trace of hosts infected by the W32/Bobax worm • the bots' DNS queries are redirected to a sinkhole running the botnet command and control • BGP Routing Measurements • a BGP monitor • just like our rumor-collector
Data Analysis • Network-level Characteristics of Spammers • Distribution across IP address space • The majority of spam comes from a small fraction of the IP address space • Spammers are nonetheless quite widely distributed
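A distribution across IP address space like the one on this slide can be approximated with a cumulative count per /8 block. This is a minimal sketch, assuming one sender address per received message; the function name and the /8 granularity are choices made here for illustration, not the authors' method.

```python
import ipaddress
from collections import Counter


def spam_cdf_by_slash8(sender_ips):
    """Return [(first_octet, cumulative_fraction)] for every /8 seen in the input."""
    # The top 8 bits of the 32-bit address identify the /8 block.
    counts = Counter(int(ipaddress.IPv4Address(ip)) >> 24 for ip in sender_ips)
    total = sum(counts.values())
    cdf, running = [], 0
    for octet in sorted(counts):
        running += counts[octet]
        cdf.append((octet, running / total))
    return cdf


# Made-up sample addresses, one per message.
sample = ["60.1.1.1", "60.2.2.2", "80.3.3.3", "200.4.4.4"]
for octet, fraction in spam_cdf_by_slash8(sample):
    print(f"{octet}.0.0.0/8  {fraction:.2f}")
```

A steep step in the resulting curve marks a /8 block that contributes a large share of the messages.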
Data Analysis • Network-level Characteristics of Spammers • Distribution across ASes and by country • (spam and legitimate email compared) 10% of spam from 2 ASes; 36% from 20 ASes
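The "share from the top N ASes" figures above are a simple concentration measure. A minimal sketch, assuming each message has already been mapped to an origin AS number (the mapping step, the function name, and the sample AS numbers are hypothetical):

```python
from collections import Counter


def top_as_share(message_asns, n):
    """Fraction of messages originating from the n most frequent ASes."""
    counts = Counter(message_asns)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    top = sum(count for _, count in counts.most_common(n))
    return top / total


# Made-up data using AS numbers reserved for documentation.
asns = [64500] * 5 + [64501] * 3 + [64502] * 2
print(top_as_share(asns, 1), top_as_share(asns, 2))
```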
Data Analysis • Network-level Characteristics of Spammers • The Effectiveness of Blacklists • 80% of relays are listed in the blacklists
Data Analysis • Spam from Botnets • Bobax vs spammer distribution • only 4,693 of 117,268 Bobax bots sent spam, but the CDFs over IP address space are similar for spammers and Bobax drones
Data Analysis • Spam from Botnets • OS of spamming hosts • 4% of spamming hosts are not Windows, but they sent 8% of the spam
Data Analysis • Spam from Botnets • Spamming Bot Activity Profile • 65% of bots are single-shot; 75% persisted for less than two minutes
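An activity profile of this kind can be derived from per-message timestamps. The sketch below assumes the input is a list of `(ip, timestamp_in_seconds)` pairs; the function names and sample data are illustrative, and "single-shot" is taken here to mean an address that sent exactly one message.

```python
from collections import defaultdict


def activity_profile(events):
    """Map each IP to (message_count, seconds between first and last message)."""
    times = defaultdict(list)
    for ip, ts in events:
        times[ip].append(ts)
    return {ip: (len(ts), max(ts) - min(ts)) for ip, ts in times.items()}


def single_shot_fraction(profile):
    """Fraction of IPs that sent exactly one message."""
    if not profile:
        return 0.0
    single = sum(1 for count, _ in profile.values() if count == 1)
    return single / len(profile)


# Made-up events using documentation addresses.
events = [
    ("198.51.100.1", 0),
    ("198.51.100.2", 10),
    ("198.51.100.2", 70),
    ("198.51.100.3", 5),
]
profile = activity_profile(events)
print(profile["198.51.100.2"], single_shot_fraction(profile))
```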
Data Analysis • Spam from Transient BGP Announcements • BGP Spectrum Agility • hijack a /8 → send spam → withdraw the route • 66.0.0.0/8 by AS21562, 82.0.0.0/8 by AS8717, (61.0.0.0/8 by AS4678)
Data Analysis • Spam from Transient BGP Announcements • How much spam comes from spectrum agility? • about 1% of spam comes from short-lived routes, but sometimes as much as 10% • Prevalence of BGP Spectrum Agility • Persistence != Volume • AS4788, AS4678
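One way to estimate the share of spam arriving over short-lived routes is to join spam arrival times against BGP announce/withdraw intervals. This is a sketch under stated assumptions: routes are given as `(prefix, announce_ts, withdraw_ts)` tuples, and `max_lifetime` is a hypothetical threshold for "short-lived", not a value taken from the paper.

```python
import ipaddress


def short_lived_spam_fraction(spam, routes, max_lifetime):
    """Fraction of spam sent while its source was covered by a short-lived route.

    spam:   iterable of (source_ip, timestamp)
    routes: iterable of (prefix, announce_ts, withdraw_ts)
    """
    # Keep only routes that were withdrawn within max_lifetime of announcement.
    short = [
        (ipaddress.ip_network(prefix), start, end)
        for prefix, start, end in routes
        if end - start <= max_lifetime
    ]
    spam = list(spam)
    if not spam:
        return 0.0
    hits = sum(
        1
        for ip, ts in spam
        if any(
            ipaddress.ip_address(ip) in net and start <= ts <= end
            for net, start, end in short
        )
    )
    return hits / len(spam)


# Made-up routes and messages using documentation prefixes.
routes = [("192.0.2.0/24", 100, 400), ("198.51.100.0/24", 0, 100000)]
spam = [("192.0.2.7", 200), ("192.0.2.7", 500), ("198.51.100.9", 50), ("203.0.113.5", 200)]
print(short_lived_spam_fraction(spam, routes, max_lifetime=600))
```

The linear scan over routes is fine for a sketch; a real BGP feed would call for a prefix trie or interval index.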
Lessons for Better Spam Mitigation • Spam filtering requires host identity • Detection based on aggregate behavior works better than detection based on single IP addresses • Securing the Internet routing infrastructure bolsters the identity and traceability of emails • Network-level properties incorporated into spam filters may be effective
Conclusion • Methodology • joint analysis of a unique combination of datasets • strong hacking techniques • *only Nick can handle that easily • measurement-based study • Contribution • important results on spammers’ network-level behavior • network-level properties are less malleable • network-level properties may be observable at an early stage • defense guidelines and lessons
Discussion • We could learn much from this paper • research motivation must be strong • significance of Routing Management, IVI, CGENI? • employ diversified techniques to enrich the methodology • arbitrary conclusions should be avoided • Some questions • the problem itself is far from being solved • some data in the paper (botnets) is still arguable • spamming in turn reveals the defects of the email service itself and of its business model (pay for spam?)
All big things in this world are done by people who are naïve and have an idea that is obviously impossible. ---- Frank Richards Thank You