Real time analysis and visualization AnubisNetworks Labs PTCoresec
Agenda • Who are we? • AnubisNetworks Stream • Stream Information Processing • Adding Valuable Information to Stream Events
Who are we? • João Gouveia • AnubisNetworks • @jgouv • Tiago Henriques • Centralway • @Balgan • Tiago Martins • AnubisNetworks • @Gank_101
AnubisStreamForce • Events (lots and lots of events) • Events are “volatile” by nature • They exist only if someone is listening • Remember: “If a tree falls in a forest and no one is around to hear it, does it make a sound?”
AnubisStreamForce • Enter security Big Data, “a brave new world” • [Diagram: the three Vs of Big Data – Volume, Velocity, Variety – with a “we are here” marker]
AnubisStreamForce • Problems (and ambitions) to tackle • The huge amount and variety of data to process • Mechanisms to share data across multiple systems, organizations, teams, companies… • A common API for dealing with all this (both from a producer and a consumer perspective)
AnubisStreamForce • Enter the security events CEP: StreamForce • High performance, scalable Complex Event Processor (CEP) – 1 node (commodity hardware) = 50k events/second • Uses streaming technology • Follows a publish/subscribe model
AnubisStreamForce • Data format • Events are published in JSON format • Events are consumed in JSON format
AnubisStreamForce • Yes, we love JSON
AnubisStreamForce Sharing Models
MFE OpenSource / MailSpike community • [Diagram: real time feeds – IP Reputation, Twitter, Sinkholes, Data-theft Trojans, Traps / Honeypots, Passive DNS – flow into the Complex Event Processing core, which pushes real time feeds out to dashboards]
AnubisCyberFeed • Feed galore! • Sinkhole data, traps, IP reputation, etc. • Bespoke feeds (create your own view) • Measure, group, correlate, de-duplicate… • High volume (usually ~6,000 events per second), with more data being added frequently
MFE OpenSource / MailSpike community • [Same diagram, with event navigation added on the consumer side next to the dashboard]
AnubisCyberFeed • Apps (demo time)
Stream Information Processing • Collecting events from the Stream. • Generating reports. • Real time visualization.
Challenge • ~6k events/s, and at peak over 10k events/s. • Let's focus on the trojans feed (banktrojan). • Peaks @ ~4k events/s

  {
    "_origin": "banktrojan",
    "env": {
      "server_name": "anam0rph.su",
      "remote_addr": "46.247.141.66",
      "path_info": "\/in.php",
      "request_method": "POST",
      "http_user_agent": "Mozilla\/4.0"
    },
    "data": "upqchCg4slzHEexq0JyNLlaDqX40GsCoA3Out1Ah3HaVsQj45YCqGKylXf2Pv81M9JX0",
    "seen": 1379956636,
    "trojanfamily": "Zeus",
    "_provider": "lab",
    "hostn": "lab14",
    "_ts": 1379956641
  }
Challenge • Let's use the Stream to help • Group by machine and trojan • From a ~4k/s peak down to a ~1k/s peak • Filter fields • Geo location • We end up with:

  {
    "env": { "remote_addr": "207.215.48.83" },
    "trojanfamily": "W32Expiro",
    "_geo_env_remote_addr": {
      "country_code": "US",
      "country_name": "United States",
      "city": "Los Angeles",
      "latitude": 34.0067,
      "longitude": -118.3455,
      "asn": 7132,
      "asn_name": "AS for SBIS-AS"
    }
  }
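For illustration only, a minimal NodeJS sketch of the same group-and-filter step, assuming a plain stream of parsed JSON events. The emitDownstream consumer and the 5-second window are stand-ins for what the Stream's group module does server-side:

  // Placeholder for whatever consumes the reduced events (Workers, the Globe...).
  function emitDownstream(evt) { console.log(JSON.stringify(evt)); }

  const windowMap = new Map();   // "ip|family" -> one reduced event per window
  const GROUP_TIME = 5000;       // ms; mirrors the grouptime=5000 used by the Globe

  function onStreamEvent(evt) {
    const key = evt.env.remote_addr + '|' + evt.trojanfamily;
    if (!windowMap.has(key)) {
      windowMap.set(key, {
        env: { remote_addr: evt.env.remote_addr },        // filtered fields only
        trojanfamily: evt.trojanfamily,
        _geo_env_remote_addr: evt._geo_env_remote_addr    // added by the geo module
      });
    }
  }

  // Flush once per window: ~4k raw events/s collapse into ~1k grouped events/s.
  setInterval(function () {
    windowMap.forEach(function (evt) { emitDownstream(evt); });
    windowMap.clear();
  }, GROUP_TIME);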
Challenge • How to process and store these events?
Technologies • Applications • NodeJS • Server-side JavaScript platform. • V8 JavaScript engine. • http://nodejs.org/ • Why? • Great for prototyping. • Fast and scalable. • Modules for (almost) everything.
Technologies • Databases • MongoDB • NoSQL database. • Stores JSON-style documents. • GridFS • http://www.mongodb.org/ • Why? • JSON from the Stream, JSON in the database. • Fast and scalable. • Redis • Key-value store. • In-memory dataset. • http://redis.io/ • Why? • Faster than MongoDB for certain operations, like keeping track of the number of infected machines. • Very fast and scalable.
Data Collection • Applications • Collector • Worker • Processor • Databases • MongoDB • Redis • [Diagram: the Stream feeds the Collector]
Data Collection • Events come from the Stream. • The Collector distributes events to Workers. • Workers persist event information. • The Processor aggregates information and stores it for statistical and historical analysis.
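The slides don't say how the Collector routes events, but one reasonable sketch (an assumption, not the project's code) is to hash the machine's IP so all events for a given machine always reach the same Worker:

  const crypto = require('crypto');

  // Four stand-in Workers; in production these would be sockets or IPC channels.
  const workers = [0, 1, 2, 3].map(function (i) {
    return function (evt) { console.log('worker', i, 'got', evt.trojanfamily); };
  });

  // Route by hashing the machine's IP: one machine, one Worker.
  function dispatch(evt) {
    const h = crypto.createHash('md5')
      .update(evt.env.remote_addr).digest().readUInt32BE(0);
    workers[h % workers.length](evt);
  }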
Data Collection • MongoDB • Real time information of infected machines. • Historical aggregated information. • Redis • Real time counters of infected machines.
Data Collection - Collector • Old data is periodically removed, i.e. machines that haven't produced events for more than 24 hours. • Decrements counters of removed information. • Sends warnings • Country / ASN is no longer infected. • Botnet X decreased its size by Y %. • Sends events to Workers.
Data Collection - Worker • Creates new entries for unseen machines. • Adds information about new trojans / domains. • Updates the last time the machine was seen. • Processes events and updates the Redis counters accordingly (see the sketch below). • Needs to check MongoDB to determine if it's a: • New entry – all counters incremented • Existing entry – only the counters related to that trojan incremented • Sends warnings • Botnet X increased its size by Y %. • New infections seen in Country / ASN.
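A hedged sketch of that counter logic with the callback-style node redis and mongodb clients. The key layout (trojan:<family>, country:<cc>, trojan:<family>:<cc>) is an assumption, not the real schema:

  const redis = require('redis');
  const client = redis.createClient();

  // `machines` is the MongoDB collection of active machines (see the next slides).
  function updateCounters(machines, evt, done) {
    const ip = evt.env.remote_addr;
    const family = evt.trojanfamily;
    const cc = evt._geo_env_remote_addr.country_code;

    machines.findOne({ ip: ip }, function (err, doc) {
      if (err) return done(err);
      const newMachine = !doc;
      const newTrojan = newMachine || doc.trojan.indexOf(family) === -1;

      const multi = client.multi();
      if (newMachine) multi.incr('country:' + cc);       // new entry: country total
      if (newTrojan) {
        multi.incr('trojan:' + family);                  // per-trojan total
        multi.incr('trojan:' + family + ':' + cc);       // trojan per country
      }
      multi.exec(done);                                  // atomic batch of INCRs
    });
  }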
Data Collection - Processor • The Processor retrieves real time counters from Redis. • Information is processed by: • Botnet; • ASN; • Country; • Botnet/Country; • Botnet/ASN/Country; • Total. • Persisting this information to MongoDB creates a historic database of counters that can be queried and analyzed.
Data Collection - MongoDB • Collection for active machines in the last 24h

  {
    "city": "Philippine",
    "country": "PH",
    "region": "N/A",
    "geo": { "lat": 16.4499, "lng": 120.5499 },
    "created": ISODate("2013-09-21T00:19:12.227Z"),
    "domains": [
      {
        "domain": "hzmksreiuojy.nl",
        "trojan": "zeus",
        "last": ISODate("2013-09-21T09:42:56.799Z"),
        "created": ISODate("2013-09-21T00:19:12.227Z")
      }
    ],
    "host": "112.202.37.72.pldt.net",
    "ip": "112.202.37.72",
    "ip_numeric": 1892296008,
    "asn": "Philippine Long Distance Telephone Company",
    "asn_code": 9299,
    "last": ISODate("2013-09-21T09:42:56.799Z"),
    "trojan": [ "zeus" ]
  }
Data Collection - MongoDB • Collection for aggregated information (the historic counters database)

  {
    "_id": ObjectId("519c0abac1172e813c004ac3"),
    "0": 744, "1": 745, "3": 748, "4": 748, "5": 746, "6": 745,
    ...
    "10": 745, "11": 742, "12": 746, "13": 750, "14": 753,
    ...
    "metadata": {
      "country": "CH",
      "date": "2013-05-22T00:00:00+0000",
      "trojan": "conficker_b",
      "type": "daily"
    }
  }

Entries for each hour are preallocated when the document is created. Without preallocation, MongoDB keeps extending the document as thousands of entries are added every hour, and it becomes very slow.
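A minimal sketch of that preallocation, assuming the callback-style NodeJS MongoDB driver; whether the Processor then $sets or $incs each hourly slot is not shown in the slides:

  // Create the full document up front with one zeroed field per hour, so later
  // in-place updates never grow the document and MongoDB never has to move it.
  function preallocateDaily(collection, country, trojan, dayISO, done) {
    const doc = {
      metadata: { country: country, trojan: trojan, date: dayISO, type: 'daily' }
    };
    for (var h = 0; h < 24; h++) doc[String(h)] = 0;   // one slot per hour
    collection.insert(doc, done);                      // insertOne() in modern drivers
  }

  // Each hour the Processor then touches exactly one slot, e.g.:
  // collection.update(
  //   { 'metadata.country': 'CH', 'metadata.trojan': 'conficker_b',
  //     'metadata.date': dayISO, 'metadata.type': 'daily' },
  //   { $set: { '14': 753 } });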
Data Collection - MongoDB • Collection for 24 hours • 4 MongoDB shard instances • > 3 million infected machines • ~2 GB of data • ~558 bytes per document • Indexes by • ip – helps inserts and updates. • ip_numeric – enables queries by CIDR (see the sketch below). • last – faster removes for expired machines. • host – hmm, is there any .gov? • country, family, asn – speeds up MongoDB queries and also allows faster custom queries. • Collection for aggregated information • Data for 119 days (25 May to 11 July) • > 18 million entries • ~6.5 GB of data • ~366 bytes per object • ~56 MB per day • Indexes by • metadata.country • metadata.trojan • metadata.date • metadata.asn • metadata.type, metadata.country, metadata.date, … (all)
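The ip_numeric index is what makes CIDR queries like f.ip_numeric=95.68.149.0/22 cheap: convert the CIDR to a numeric range once, then run an indexed range scan. A sketch of the conversion (our own illustration, not the project's code):

  // Turn "a.b.c.d/bits" into the numeric [start, end] covered by that block.
  function cidrToRange(cidr) {
    const parts = cidr.split('/');
    const o = parts[0].split('.').map(Number);
    const base = ((o[0] << 24) | (o[1] << 16) | (o[2] << 8) | o[3]) >>> 0;
    const size = Math.pow(2, 32 - parseInt(parts[1], 10));
    const start = Math.floor(base / size) * size;   // align to the block boundary
    return { start: start, end: start + size - 1 };
  }

  // e.g. for /ips?f.ip_numeric=95.68.149.0/22:
  const r = cidrToRange('95.68.149.0/22');
  // db.machines.find({ ip_numeric: { $gte: r.start, $lte: r.end } })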
Data Collection - Redis • Counters by Trojan / Country

  "cutwailbt:RO": "1256", "rbot:LA": "3", "tdss:NP": "114",
  "unknown4adapt:IR": "100", "unknownaff:EE": "0", "cutwail:CM": "20",
  "unknownhrat3:NZ": "56", "cutwailbt:PR": "191", "shylock:NO": "1",
  "unknownpws:BO": "3", "unknowndgaxx:CY": "77", "fbhijack:GH": "22",
  "pushbot:IE": "2", "carufax:US": "424"

• Counters by Trojan

  "unknownwindcrat": "18", "tdss": "79530", "unknownsu2": "2735",
  "unknowndga9": "15", "unknowndga3": "17", "ircbot": "19874",
  "jshijack": "35570", "adware": "294341", "zeus": "1032890",
  "jadtre": "40557", "w32almanahe": "13435", "festi": "1412",
  "qakbot": "19907", "cutwailbt": "38308"

• Counters by Country

  "BY": "11158", "NA": "314", "BW": "326", "AS": "35", "AG": "94",
  "GG": "43", "ID": "142648", "MQ": "194", "IQ": "16142", "TH": "105429",
  "MY": "35410", "MA": "15278", "BG": "15086", "PL": "27384"
Data Collection - Redis • Redis performance on our machine • SET: 473,036.88 requests per second • GET: 456,412.59 requests per second • INCR: 461,787.12 requests per second • Time to get real time data • Getting all the data from Families/ASN/Counters into the NodeJS application, ready to be processed, takes around half a second • > 120,000 entries in… (very fast) • Our current usage is • ~3% CPU (of a 2.0 GHz core) • ~480 MB of RAM
Data Collection - API • But! There is one more application… • How to easily retrieve stored data? • MongoDB's REST API is a bit limited. • NodeJS HTTP + MongoDB + Redis • Redis • http://<host>/counters_countries • ... • MongoDB • http://<host>/family_country • ... • Custom MongoDB queries • http://<host>/ips?f.ip_numeric=95.68.149.0/22 • http://<host>/ips?f.country=PT • http://<host>/ips?f.host=\bgov\b
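A minimal sketch of what the Redis half of that API could look like with NodeJS's built-in http module and the callback-style redis client; the country:* key pattern is the same assumption used earlier:

  const http = require('http');
  const redis = require('redis');
  const client = redis.createClient();

  http.createServer(function (req, res) {
    if (req.url === '/counters_countries') {
      client.keys('country:*', function (err, keys) {
        if (err) { res.writeHead(500); return res.end(); }
        if (keys.length === 0) {
          res.writeHead(200, { 'Content-Type': 'application/json' });
          return res.end('{}');
        }
        client.mget(keys, function (err2, vals) {
          if (err2) { res.writeHead(500); return res.end(); }
          const out = {};
          keys.forEach(function (k, i) { out[k.split(':')[1]] = Number(vals[i]); });
          res.writeHead(200, { 'Content-Type': 'application/json' });
          res.end(JSON.stringify(out));    // e.g. {"PT": 15278, "BY": 11158, ...}
        });
      });
    } else { res.writeHead(404); res.end(); }
  }).listen(8080);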
Data Collection - Limitations • Grouping information by machine and trojan doesn't allow us to study the real number of events per machine. • It can still be useful to get an idea of the botnet operations, or of how many machines are behind a single IP (everyone is behind a router). • A slow MongoDB impacts everything • The Worker application needs to tolerate a slow MongoDB and discard some information as a last resort. • Beware of slow disks! Data persistence occurs every 60 seconds (default) and can take too much time, with a real impact on performance. • >10s to persist is usually very bad: something is wrong with the hard drives.
Data Collection - Evolution • Warnings • Which warnings to send? When? Thresholds? • Aggregate data by week, month, year. • Aggregate information in shorter intervals. • Data mining algorithms applied to all the collected information. • Apply the same principles to other feeds of the Stream • Spam • Twitter • Etc.
Reports • What's happening in country X? • What about network 192.168.0.1/24? • Can you send me the report of Y every day at 7 am? • Ohh!! Remember the report I asked for last week? • Can I get a report for ASN AnubisNetwork?
Reports • HTTP API • Schedule • Get • Edit • Delete • List schedules • List reports • Checks MongoDB for work (sketched below). • Generates a CSV report or stores the JSON document for later querying. • Sends an email with a link to the files when the report is ready.
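Roughly how such a polling report generator could look. This is a sketch: saveAndEmail stands in for the GridFS-plus-email step, and the schedule fields (active, run_at, country) match the documents on the next slide:

  // Stand-in for storing the CSV in GridFS and mailing a download link.
  function saveAndEmail(job, csv) {
    console.log('report "' + job.desc + '" ready, ' + csv.length + ' bytes');
  }

  // `schedules` and `machines` are MongoDB collections.
  function pollForWork(schedules, machines) {
    schedules.find({ active: true, run_at: { $lte: new Date() } })
      .toArray(function (err, due) {
        if (err) return;
        due.forEach(function (job) {
          machines.find({ country: job.country }).toArray(function (err2, rows) {
            if (err2) return;
            const csv = rows.map(function (m) {
              return [m.ip, m.host, m.trojan.join(';')].join(',');
            }).join('\n');
            saveAndEmail(job, csv);
          });
        });
      });
  }

  // e.g. setInterval(function () { pollForWork(schedules, machines); }, 60000);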
Reports – MongoDB CSVs • Scheduled Report

  {
    "__v": 0,
    "_id": ObjectId("51d64e6d5e8fd0d145000008"),
    "active": true,
    "asn_code": "",
    "country": "PT",
    "desc": "Portugal Trojans",
    "emails": "",
    "range": "",
    "repeat": true,
    "reports": [
      ObjectId("51d64e7037571bd24500000d"),
      ObjectId("51d741e8bcb161366600000c"),
      ObjectId("51d89367bcb161366600005f"),
      ObjectId("51d9e4f9bcb16136660000ca"),
      ObjectId("51db3678c3a15fc577000038"),
      ObjectId("51dc87e216eea97c20000007"),
      ObjectId("51ddd964a89164643b000001")
    ],
    "run_at": ISODate("2013-07-11T22:00:00Z"),
    "scheduled_date": ISODate("2013-07-05T04:41:17.067Z")
  }

• Report

  {
    "__v": 0,
    "_id": ObjectId("51d89367bcb161366600005f"),
    "date": ISODate("2013-07-06T22:00:07.015Z"),
    "files": [ ObjectId("51d89368bcb1613666000060") ],
    "work": ObjectId("51d64e6d5e8fd0d145000008")
  }

• Files • Each report has an array of files that represents the report. • Each file is stored in GridFS.
Reports – MongoDB JSONs • Scheduled Report

  {
    "__v": 0,
    "_id": ObjectId("51d64e6d5e8fd0d145000008"),
    "active": true,
    "asn_code": "",
    "country": "PT",
    "desc": "Portugal Trojans",
    "emails": "",
    "range": "",
    "repeat": true,
    "snapshots": [
      ObjectId("521f761c0a45c3b00b000001"),
      ObjectId("521fb0848275044d420d392f"),
      ObjectId("52207c2f7c53a8494f010afa"),
      ObjectId("5221c9df4910ba3874000001"),
      ObjectId("522275724910ba3874001f66"),
      ObjectId("5223c6f24910ba3874003b7a"),
      ObjectId("522518734910ba3874005763")
    ],
    "run_at": ISODate("2013-07-11T22:00:00Z"),
    "scheduled_date": ISODate("2013-07-05T04:41:17.067Z")
  }

• Snapshot

  {
    "_id": ObjectId("51d89367bcb161366600005f"),
    "date": ISODate("2013-07-06T22:00:07.015Z"),
    "work": ObjectId("521f761c0a45c3b00b000001"),
    "count": 123
  }

• Results

  {
    "machine": {
      "trojan": [ "conficker_b" ],
      "ip": "2.80.2.53",
      "host": "Bl19-1-13.dsl.telepac.pt"
    },
    ...,
    "metadata": {
      "work": ObjectId("521f837647b8d3ba7d000001"),
      "snaptshot": ObjectId("521f837aa669d0b87d000001"),
      "date": ISODate("2013-08-29T00:00:00Z")
    }
  }
Reports – Evolution • Other report formats. • Charts? • Other types of reports (not only botnets). • Need to evolve the Collector first.
Globe • How to visualize real time events from the stream? • Where are the botnets located? • Who’s the most infected? • How many infections?
Globe – Stream • origin = banktrojan • Modules • Group • trojanfamily • _geo_env_remote_addr.country_name • grouptime=5000 • Geo • Filter fields • trojanfamily • Geolocation • _geo_env_remote_addr.l* • KPI • trojanfamily • _geo_env_remote_addr.country_name • kpilimit = 10 • Request botnets from the stream (a hypothetical request shape is sketched below)
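The slides list the modules but not the request syntax, so the following subscription shape is purely hypothetical, reconstructed from the bullets above (every key name is an assumption):

  const subscription = {
    origin: 'banktrojan',
    modules: [
      { name: 'group',                                   // collapse duplicates
        fields: ['trojanfamily', '_geo_env_remote_addr.country_name'],
        grouptime: 5000 },
      { name: 'geo',                                     // keep only what the globe needs
        keep: ['trojanfamily', '_geo_env_remote_addr.l*'] },
      { name: 'kpi',                                     // top-N summary
        fields: ['trojanfamily', '_geo_env_remote_addr.country_name'],
        kpilimit: 10 }
    ]
  };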
Globe – NodeJS • NodeJS • HTTP • Gets JSON from the Stream. • Socket.IO • Multiple protocol support (to bypass some proxies and handle old browsers). • Redis • Gets the real time number of infected machines.
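A condensed sketch of that relay: read newline-delimited JSON from the Stream over HTTP (the feed URL is a placeholder) and broadcast each event to connected browsers with Socket.IO:

  const http = require('http');
  const server = http.createServer().listen(8081);
  const io = require('socket.io')(server);

  // Long-lived HTTP connection to the Stream.
  http.get('http://stream.example/feed/banktrojan', function (res) {
    res.setEncoding('utf8');
    var buf = '';
    res.on('data', function (chunk) {
      buf += chunk;
      var lines = buf.split('\n');
      buf = lines.pop();                 // keep any trailing partial line
      lines.forEach(function (line) {
        if (line.trim()) io.emit('infection', JSON.parse(line));
      });
    });
  });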
Globe – Browser • Browser • Socket.IO Client • Real time apps. • Websockets and other types of transport. • WebGL • ThreeJS • Tween • jQuery • WebWorkers • Run in the background. • Where to place the red dots? • The calculations from geolocation to 3D point go here (see below).
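The WebWorker math hinted at above is the standard spherical-to-Cartesian conversion, mapping an event's latitude/longitude onto a globe of radius r (the axis convention below is the usual WebGL-globe one, which is an assumption about this implementation):

  function latLngToVector3(lat, lng, r) {
    const phi = (90 - lat) * Math.PI / 180;    // polar angle from the north pole
    const theta = (lng + 180) * Math.PI / 180; // azimuth
    return {
      x: -r * Math.sin(phi) * Math.cos(theta),
      y:  r * Math.cos(phi),
      z:  r * Math.sin(phi) * Math.sin(theta)
    };
  }

  // e.g. a red dot for the Los Angeles event shown earlier:
  // latLngToVector3(34.0067, -118.3455, 200) -> {x, y, z} for a THREE.Vector3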
Globe – Evolution • Some kind of HUD to get better interaction and notifications. • Request actions by clicking on the globe • Generate a report of infected machines in that area. • Request operations in that specific area. • Real time warnings • New infections • Other types of warnings...
Adding Valuable Information to Stream Events • How to distribute workload to other machines? • Adding value to the information we already have.
Minions • Typically, the operations that would add value are expensive in terms of resources • CPU • Bandwidth • A master-slave approach that distributes work among distributed slaves we call Minions.
Minions • The Master receives work from Requesters and stores it in MongoDB. • Minions request work. • Requesters receive real time information on the work from the Master, or they can ask for work information at a later time.
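One way the Master's work hand-out could look, as a sketch with assumed field names like status and payload (not the real schema); findOneAndUpdate claims a job atomically so two Minions never get the same one:

  const http = require('http');

  // `jobs` is a MongoDB collection of work items submitted by Requesters.
  function startMaster(jobs) {
    http.createServer(function (req, res) {
      if (req.method === 'GET' && req.url === '/work') {
        jobs.findOneAndUpdate(
          { status: 'pending' },
          { $set: { status: 'running', started: new Date() } },
          { sort: { _id: 1 } },               // oldest job first
          function (err, result) {
            res.writeHead(err ? 500 : 200, { 'Content-Type': 'application/json' });
            res.end(JSON.stringify((result && result.value) || null));
          });
      } else { res.writeHead(404); res.end(); }
    }).listen(9090);
  }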