Spam filtering in MMOs
Presented by: Nandor T. Szots
Senior Programmer, EverQuest II
nszots@soe.sony.com
Sony Online Entertainment
Overview
• Introduction
• Why is spam such a large problem in MMOs
• Technical details of EQII’s filtering solution
• MMO specific filtering techniques
Introduction
• An MMO with 100k-500k subscribers can expect between 10,000 and 1 million spam messages to be sent to its players per day.
• Spam helps to compromise fair game play
• Gives an advantage to players willing to use third-party RMT (real-money trading) sites
• Leads to loss of subscriptions
• Generates no revenue for the developers
Why is spam such a large problem in MMOs
• Third-party RMT is estimated to be a $2 billion / year business
• 30% of MMO players are estimated to be involved as either buyers or sellers
• Unlimited resources generated by the game
• Access to resources using repetitive and scriptable tasks
Technical details of EQII’s filtering solution
Naive Bayes Classifier
• Each word is considered independently of the rest of the phrase
• Allows for a simple DB
• Uses a simple formula to determine spam probability
• The prior term ln( p(S) / p(¬S) ) can, in the case of MMOs, be dynamic based on player/account information
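The formula itself did not survive on this slide. As a reference point, the standard log-odds form of a naive Bayes classifier is given below; it is an assumption that the talk used exactly this form, but it contains the ln( p(S) / p(¬S) ) term mentioned above and matches the sign convention of the later examples (positive means spam). Here w_1 … w_n are the distinct words of the message and S is the event "the message is spam".

```latex
\ln\frac{p(S \mid w_1,\dots,w_n)}{p(\neg S \mid w_1,\dots,w_n)}
  = \ln\frac{p(S)}{p(\neg S)}
  + \sum_{i=1}^{n} \ln\frac{p(w_i \mid S)}{p(w_i \mid \neg S)}
```

The first term on the right depends only on what is known before reading the message, which is why it can be varied per player or account without touching the per-word sum.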
Technical details of EQII’s filtering solution
Creating a DB of words
• EQII DB
  • ~122,000 lines (~1 million words) of good data
  • ~200 lines (~4,500 words) of bad data
  • The gap exists because there are far fewer types of spam messages
• A large ‘good’ data set makes it harder for spammers to pollute your data
• Glue words (‘the’, ‘a’, ‘an’, ‘it’, etc.) can be ignored
• Words should only count once in any statement
• Good idea to split out real-world currency, game currency, etc. before processing
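A minimal sketch of how such a word DB could be built, assuming each training line is already a lowercased, space-separated statement. The function name `build_word_db` and the in-memory `Counter` are illustrative choices, not EQII's actual storage; the glue-word list is the one from the slide.

```python
from collections import Counter

# Glue words named on the slide; a real list would likely be longer.
GLUE_WORDS = {"the", "a", "an", "it"}

def build_word_db(lines):
    """Count, for each word, the number of statements it appears in.

    Returns (counts, number_of_statements). A word repeated inside one
    statement is counted once, and glue words are skipped entirely.
    """
    counts = Counter()
    total = 0
    for line in lines:
        total += 1
        unique_words = set(line.lower().split()) - GLUE_WORDS
        counts.update(unique_words)
    return counts, total

counts, total = build_word_db(["the gold is the gold", "cheap gold"])
print(counts["gold"], total)
```

One DB would be built from the 'good' lines and another from the 'bad' lines; keeping the statement totals alongside the counts makes it possible to turn counts into per-class word probabilities later.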
Technical details of EQII’s filtering solution
Typical spam message examples from EQII
• Typical messages pre-spam filtering:
  “Safe Gold,Fast power leveling,Special service of gold farming. Enjoy the fun we give you and your guild at www.spamworker.com”
• Typical spam message today:
  “Stick to buy 1000 gold per-day for 5214 weeks on our site, you will be 100 years old, and become the strongest person in the game forever!”
  “Visit \/\/w\/\/.s###w0rk3r.c0/\/\, ###=pam, 3=e”
Technical details of EQII’s filtering solution
Examples of applying the filter to real-world messages
• Example DB and test phrases can be found on my website: http://nandor.szots.com
• We will use two phrases with similar wording, one that is spam and one that is not, and show how the filter processes each
Technical details of EQII’s filtering solution
Non-Spam message:
“Hello, what’s up? Did you see how gold the sun was? Lets go power level!”
• Filter is passed the phrase as follows:
“hello what’s up did you see how #GAMECUR# the sun was lets go power level”
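The preprocessing implied by this example (lowercasing, dropping punctuation, and replacing the in-game currency word with a placeholder) could be sketched as follows. The `normalize` function and the contents of `GAME_CURRENCY_WORDS` are assumptions for illustration; only the `#GAMECUR#` token and the before/after phrases come from the slide.

```python
import re

# Hypothetical list of in-game currency words; only "gold" appears in the slides.
GAME_CURRENCY_WORDS = {"gold", "plat", "platinum"}

def normalize(message):
    """Lowercase, strip punctuation, and replace currency words with #GAMECUR#."""
    text = message.lower()
    # Keep letters, digits and apostrophes (straight or curly); everything
    # else, including the dots in URLs, becomes a word separator.
    text = re.sub(r"[^a-z0-9'’]+", " ", text)
    words = ["#GAMECUR#" if w in GAME_CURRENCY_WORDS else w for w in text.split()]
    return " ".join(words)

print(normalize("Hello, what’s up? Did you see how gold the sun was? Lets go power level!"))
```

Replacing game-specific terms with generic tokens is what lets the same word DB logic carry across games: only the replacement list changes.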
Technical details of EQII’s filtering solution
• Filter processes each word:
• Final value: -4.97 => Not Spam!
Technical details of EQII’s filtering solution
Spam message:
“Hello! Welcome to www.buygold.com. Power leveling, and fast safe gold!”
• Filter is passed the phrase as follows:
“hello welcome to www buygold com power leveling and fast safe #GAMECUR#”
Technical details of EQII’s filtering solution
• Filter processes each word:
• Final value: 28.45 => Spam!
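The per-word processing for both examples can be sketched as a sum of log-likelihood ratios, with a positive total meaning spam and a negative total meaning not spam. This is a sketch under stated assumptions, not EQII's implementation: the add-one smoothing, the choice to skip words found in neither DB, and all counts below are hypothetical, so the totals will not reproduce the values shown on the slides.

```python
import math

def spam_score(words, spam_counts, good_counts, n_spam, n_good, prior_log_odds=0.0):
    """Return the naive Bayes log-odds that a message is spam.

    spam_counts / good_counts map a word to the number of spam / good
    statements containing it; n_spam / n_good are the statement totals.
    """
    score = prior_log_odds
    for word in set(words):  # each word counts once per message
        in_spam = spam_counts.get(word, 0)
        in_good = good_counts.get(word, 0)
        if in_spam == 0 and in_good == 0:
            continue  # assumption: words never seen before carry no evidence
        # Add-one smoothing (an assumption) avoids log(0) for one-sided words.
        p_word_spam = (in_spam + 1) / (n_spam + 2)
        p_word_good = (in_good + 1) / (n_good + 2)
        score += math.log(p_word_spam / p_word_good)
    return score

# Hypothetical toy DBs: 10 spam statements, 1000 good statements.
spam_db = {"#GAMECUR#": 9, "www": 8, "power": 5}
good_db = {"hello": 300, "you": 400, "power": 20, "#GAMECUR#": 10}

print(spam_score(["www", "#GAMECUR#", "power"], spam_db, good_db, 10, 1000) > 0)
print(spam_score(["hello", "you"], spam_db, good_db, 10, 1000) < 0)
```

Because the score is a plain sum, the starting value `prior_log_odds` can be changed per sender without altering how the words are weighted.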
MMO specific filtering techniques
Techniques to allow cross-game implementation
• Keyword replacements
  • In-game currencies
  • Genre-specific concepts
  • Real-world monetary values
• Dynamic modification of starting probabilities
• Source and Destination filtering capabilities
MMO specific filtering techniques
Looking at our probability one more time:
• Need to make changes that do not affect the mathematical correctness of the formula.
• Best approach is to modify the starting term ln( p(S) / p(¬S) ) based on:
  • Account Age
  • Account Type / Payment Type
  • IP Address
  • Character Age
  • Character Level
• These should all be computed from real-world data, just like the word DBs
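One way to compute the starting term from real data is to keep spam and non-spam message counts per sender bucket and take the log of their ratio. The sketch below assumes a joint bucket of account age and account type; the bucket names, the counts in `OBSERVED`, and the add-one smoothing are all hypothetical.

```python
import math

# Hypothetical observed (spam, non-spam) message counts per sender bucket.
OBSERVED = {
    ("under_1_day", "trial"): (900, 100),
    ("under_1_day", "paid"): (50, 950),
    ("over_30_days", "paid"): (5, 99995),
}

def prior_log_odds(account_age_bucket, account_type, observed=OBSERVED):
    """Estimate ln( p(S) / p(not S) ) for messages from this kind of sender."""
    spam, good = observed.get((account_age_bucket, account_type), (0, 0))
    # Add-one smoothing (an assumption): unknown buckets get a neutral 0.0.
    return math.log((spam + 1) / (good + 1))

print(prior_log_odds("under_1_day", "trial") > 0)
print(prior_log_odds("over_30_days", "paid") < 0)
```

The result is used as the starting value of the message score, so a brand-new trial account begins closer to the spam side while a long-standing paid account needs much stronger word evidence to be flagged.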
MMO specific filtering techniques
MMO Advantages
• Use your community to your advantage
• Allow for real-time marking of spam and non-spam messages
• Update word DBs frequently (once an hour or more)
• Lots of additional information about message senders and reporters
• Social relationships (guilds, groups, friends, etc.)
MMO specific filtering techniques
MMO Disadvantages
• Spammers can pre-test their messages
  • Minimize this by using IP, account, billing, etc. information
• Spammers can try to weight their messages down by reporting them as not-spam
  • Create strict rules for accepting player reports to minimize spammers’ ability to weight your filter
• It is important to create as much misdirection as possible
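The talk does not spell out its rules for accepting player reports. The sketch below shows one possible shape for such rules, assuming three hypothetical checks: the reporter's account must be old enough, the reporter must not share an IP with the sender, and several distinct trusted accounts must agree. The thresholds, field names, and the `Report` type are illustrative only.

```python
from dataclasses import dataclass

MIN_ACCOUNT_AGE_DAYS = 30    # hypothetical threshold
MIN_DISTINCT_REPORTERS = 3   # hypothetical threshold

@dataclass(frozen=True)
class Report:
    reporter_account: str
    reporter_ip: str
    reporter_account_age_days: int
    sender_ip: str
    marked_spam: bool

def is_trusted(report):
    """A report counts only if the reporter is established and unrelated by IP."""
    return (report.reporter_account_age_days >= MIN_ACCOUNT_AGE_DAYS
            and report.reporter_ip != report.sender_ip)

def accepted_label(reports):
    """Return True (spam), False (not spam), or None if the reports are not usable."""
    trusted = [r for r in reports if is_trusted(r)]
    spam_voters = {r.reporter_account for r in trusted if r.marked_spam}
    good_voters = {r.reporter_account for r in trusted if not r.marked_spam}
    if len(spam_voters) >= MIN_DISTINCT_REPORTERS and not good_voters:
        return True
    if len(good_voters) >= MIN_DISTINCT_REPORTERS and not spam_voters:
        return False
    return None  # too few trusted reporters, or they disagree
```

Only messages with an accepted label would be fed back into the word DBs, which limits how much a handful of spammer-controlled accounts can shift the filter.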
Q&A Questions & Answers