A Case for High-Bandwidth Monitoring Michael DePhillips, Dimitrios Katramatos
Agenda • Issue (security based) • Definitions • Current solution • Current/Novel technologies • Using BNL’s AoW • Work so far • What’s next. This talk is based on AoW, presented at this conference last year by Dimitri Katramatos. Special thanks: Shilpi Bhattacharyya
Define High-bandwidth • 100 Gb/s: in use – commodity hardware available (Mellanox, Cisco…) • 400 Gb/s: IEEE 802.3 is active with ad hoc groups, study groups and task forces – they meet every 6 months or so – with a standard expected next year. Successful implementations exist. • Terabit: being discussed and researched – wiki page
Motivations and Bottom Lines: Lessons Learned from AoW • Agile service delivery and scalability (processing) • Assign as many resources as you need – from a server to a data center – FPGAs and GPUs • Run your own algorithms • Bandwidth scalability – divide and conquer larger volumes of data, to terabit and beyond • Federation on the WAN – pass information about traffic (attacks) across the network • Cost – vendor independence – COTS • Overlay technology (SDN portion) – incremental implementation, current infrastructure remains
Security (with regard to HBW) • Conventional security techniques (e.g., IDS) cannot keep up. • Inspecting a subset of the flow is the current state of IDS at 100 Gb/s • Common security solutions do not scale. • Industry develops and security catches up • LAN-centric solutions
Current Solutions: acceptable/optimal – cost vs. risk (consequence) • Science has been leveraging 100 Gb/s for a while now (~ years) • Elephant flows (packet size / bandwidth / duration) • Experiments • HPCs • Reasonable solutions (workarounds): PROBABILISTIC EXEMPTIONS • When you can’t solve a problem, solve part of it (and be smart about which part)
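The elephant-flow criteria named on this slide (packet size, bandwidth, duration) could be checked roughly as follows. This is a minimal sketch, not the presenters' method: the `FlowStats` record and every threshold value are hypothetical and would need tuning per site.

```python
from dataclasses import dataclass


@dataclass
class FlowStats:
    """Per-flow counters; a hypothetical record, not taken from the talk."""
    total_bytes: int
    packet_count: int
    duration_s: float


# Assumed thresholds, chosen only for illustration.
MIN_TOTAL_BYTES = 10 * 1024**3   # at least 10 GiB transferred
MIN_RATE_BPS = 1e9               # sustained rate of at least 1 Gb/s
MIN_DURATION_S = 60.0            # flow lasts at least one minute
MIN_AVG_PACKET_BYTES = 1000      # mostly large packets


def is_elephant(flow: FlowStats) -> bool:
    """Return True when a flow meets all of the assumed elephant criteria."""
    if flow.duration_s <= 0 or flow.packet_count <= 0:
        return False
    rate_bps = flow.total_bytes * 8 / flow.duration_s
    avg_packet_bytes = flow.total_bytes / flow.packet_count
    return (
        flow.total_bytes >= MIN_TOTAL_BYTES
        and rate_bps >= MIN_RATE_BPS
        and flow.duration_s >= MIN_DURATION_S
        and avg_packet_bytes >= MIN_AVG_PACKET_BYTES
    )
```

A bulk science transfer would pass all four tests, while a short interactive session fails on volume alone and stays in the inspected remainder.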
100 Gb/s Bro: Sane and Elegant • Berkeley • Probabilistic • Trust-based solution • Shunts trusted traffic (science – elephant flows) • Captures metadata • Analyzes the remainder
DMZ: Science network • Separate ESnet network • Provides separation • Data could do no harm • Sandbox capabilities • No immediate analysis of flows • Not air-gapped
Problems • Controls – limited due to international collaboration • APTs • Government systems (no air gap, thus potential access to them) • Impostor packets – science data has consistent patterns • Cost – commodity / vendor solutions are expensive and not scalable • Destination based (LAN) • Not that dire though – it’s not that easy to hack. Still, with no eyes on the stream, it’s like an adversary’s sandbox
Rationale to improve (Case) • Faster networks are coming • Soon to business and the desktop • Government systems • Low probability – high consequence • Sophisticated and dedicated adversaries (APTs) • Incentive to start looking at this now • Intelligent networks show promising results
BNL – INTELLIGENT NETWORKS: AoW • More data is in transit than at rest at any given time • SDNs and peripheral technologies (NFV, OpenFlow) have matured enough to develop a FRAMEWORK that replaces (or overlays) conventional networks and can, in principle, guarantee QoS and scalability and provide data analysis.
SDN Architecture https://www.opennetworking.org/sdn-resources/openflow
Apply - Security Algorithms • IDS-like behavior – however, with compute capabilities, the sky’s the limit • Currently looking for a preponderance of bad traffic (some AI is being used) • FPGAs and GPUs to accelerate • Latency not an issue
What did we do? • We built an SDN • OpenDaylight (ODL) as the controller • Vagrant as the VM creator • Open vSwitch (OVS) as the switches • With two service functions (including forwarders) • Created proper packets with Python • Moved packets through the network
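The slide does not say which Python tooling was used to create packets. As an illustration only, the sketch below builds a plain IPv4/UDP packet with the standard library; the function names are hypothetical, and it omits the Ethernet frame and any service-chaining encapsulation a real testbed would add.

```python
import socket
import struct


def checksum16(data: bytes) -> int:
    """Internet checksum: one's-complement sum of 16-bit big-endian words."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def build_ipv4_udp(src_ip: str, dst_ip: str, src_port: int,
                   dst_port: int, payload: bytes) -> bytes:
    """Return the raw bytes of an IPv4 packet carrying one UDP datagram."""
    udp_len = 8 + len(payload)
    # A UDP checksum of 0 means "not computed", which IPv4 permits.
    udp = struct.pack("!HHHH", src_port, dst_port, udp_len, 0) + payload
    header = struct.pack(
        "!BBHHHBBH4s4s",
        (4 << 4) | 5,            # version 4, header length 5 words (20 bytes)
        0,                       # DSCP/ECN
        20 + udp_len,            # total length
        0,                       # identification
        0,                       # flags and fragment offset
        64,                      # TTL
        socket.IPPROTO_UDP,      # protocol number 17
        0,                       # checksum placeholder
        socket.inet_aton(src_ip),
        socket.inet_aton(dst_ip),
    )
    csum = checksum16(header)
    header = header[:10] + struct.pack("!H", csum) + header[12:]
    return header + udp
```

Sending such bytes on the wire needs a raw socket and elevated privileges, so the sketch stops at producing the packet.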
What did we learn? • Packet flows from host to classifier to service function (operations and forwarder) • tcpdump watches the SF IP and collects a copy of the packet • In this case, code acts on the tcpdump data • Minimal latency • Container-based (Docker) infrastructure (lightweight and fast) • Any algorithm will work (the sky is the limit)
Next ALGO
…from classifier or agent information…
elephant() could be part of the infrastructure that supports a service function (lots of examples – WE CONTROL THE ENVIRONMENT)
if (elephant() != 0) {            // confirmed elephant flow
    /* …handle the flow in sf1… */
    if (isExecutable() != 0) {
        drop();
        save();                   // maybe handled by another sf
    } else {
        letPass();
    }
} else {
    /* …handle the flow in sf2: more scrutiny… */
}
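The decision logic above can be sketched in runnable Python. This is one possible reading under stated assumptions, not the presenters' implementation: the `Flow` record, the `Verdict` names, and the magic-byte check standing in for `isExecutable()` are all hypothetical.

```python
from dataclasses import dataclass
from enum import Enum


class Verdict(Enum):
    PASS = "pass"                    # letPass()
    DROP_AND_SAVE = "drop_and_save"  # drop(); save();
    SCRUTINIZE = "scrutinize"        # hand over to sf2 for closer inspection


# Assumed stand-in for isExecutable(): ELF and DOS/PE magic bytes only.
EXECUTABLE_MAGIC = (b"\x7fELF", b"MZ")


@dataclass
class Flow:
    """Hypothetical flow summary supplied by the classifier or an agent."""
    is_elephant: bool
    first_payload: bytes


def is_executable(payload: bytes) -> bool:
    """Return True when the payload starts with a known executable signature."""
    return payload.startswith(EXECUTABLE_MAGIC)


def triage(flow: Flow) -> Verdict:
    """Mirror the slide's branches: sf1 for elephant flows, sf2 otherwise."""
    if flow.is_elephant:
        if is_executable(flow.first_payload):
            return Verdict.DROP_AND_SAVE
        return Verdict.PASS
    return Verdict.SCRUTINIZE
```

Returning a verdict instead of acting directly keeps the decision separate from enforcement, which matches the slide's note that the drop-and-save step may be handled by another service function.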
Next Challenges (technical) • Symmetric hashing – keeps a flow whole, including return trips • Increase the flow rate • Move to hardware • Scale to 40, then 100 Gb/s • Principle to practice
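Symmetric hashing means both directions of a connection hash to the same value, so the forward and return packets land on the same service function. A minimal software sketch, assuming a 5-tuple key and a SHA-256 digest (both are illustrative choices with hypothetical function names; switch or NIC hardware would use its own hash):

```python
import hashlib


def symmetric_flow_key(src_ip: str, src_port: int,
                       dst_ip: str, dst_port: int, proto: int) -> bytes:
    """Build a direction-independent key by ordering the two endpoints."""
    low, high = sorted(((src_ip, src_port), (dst_ip, dst_port)))
    return f"{proto}|{low[0]}:{low[1]}|{high[0]}:{high[1]}".encode()


def pick_service_function(key: bytes, instance_count: int) -> int:
    """Map a flow key onto one of instance_count service-function instances."""
    digest = hashlib.sha256(key).digest()
    return int.from_bytes(digest[:8], "big") % instance_count
```

Because the endpoints are sorted before hashing, swapping source and destination produces the same key, which is what keeps a flow whole when the load is divided across instances.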
Conclusions – In the case of high-bandwidth monitoring… • The cost of advancing technology may be less than continuing to implement peripheral solutions • Vendor equipment is expensive and not scalable • Customizable algorithms will alter the security framework and current paradigms – not necessarily binary (good or bad = keep or drop). • Study – added intelligence • The suggested infrastructure will lead to proactive defense and response via federation (across all bandwidths)