Towards Secure Dataflow Processing in Open Distributed Systems

Juan Du, Wei Wei, Xiaohui (Helen) Gu, Ting Yu

Presentation Transcript


  1. Towards Secure Dataflow Processing in Open Distributed Systems • Juan Du, Wei Wei, Xiaohui (Helen) Gu, Ting Yu

  2. Outline • Introduction • Design and Algorithms • Experimental Evaluation • Related Work • Conclusion

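  3. Dataflow Processing in Distributed System • [Figure: component providers host data processing components f1 to f5. A stream of ADUs …, di, … enters the dataflow and is transformed hop by hop into …, f1(di), …, then …, f2(f1(di)), …, then …, f3(f2(f1(di))), …. Legend: component provider, data processing component, ADU di, dataflow.]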

  4. Run in Open Distributed Systems • Dataflow Processing Applications • Network traffic monitoring • Sensor data analysis • Audio/video surveillance • Scientific data processing • Advantages in Open Distributed Systems • Highly scalable and available infrastructures • No need to maintain hardware and software • Challenges in Open Distributed Systems • Component providers come from different security domains • Not all data processing components are trustworthy

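  5. ADU Attack • [Figure: the same dataflow with one malicious component. Stream labels shown: "…, d2, d1" entering f1; "…, f1(d2), f1(d1)" leaving f1; "…, f2(f1(d1))" and "…, f2(f1(d1)), d0" leaving the f2 components. Legend: component provider, malicious component, data processing component, ADU di, dataflow.]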

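  6. Dataflow Topology Attack • [Figure: the same dataflow with one malicious component. Stream labels shown: "…, f1(d2), …" leaving f1; "…, f3(f2(f1(d2))), …" on one path and "…, f3(f5(f2(f1(d2)))), …" on another, where an extra f5 appears in the processing chain. Legend: component provider, malicious component, data processing component, ADU di, dataflow.]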

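  7. Function Integrity Attack • [Figure: the same dataflow with one malicious component. Stream labels shown: "…, f1(d2), …" leaving f1 and "…, f0(f1(d2)), …" leaving the malicious component, which applies f0 to f1(d2). Legend: component provider, malicious component, data processing component, ADU di, dataflow.]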

  8. System Design • Attack Models • ADU attack • Dataflow topology attack • Function integrity attack • Assumptions • Third-party component providers could be malicious • Composers and users are trusted • PKI is deployed in advance • Goals • Provide integrity and confidentiality for dataflow processing applications • Focus on discussing integrity issues

  9. Provenance-based ADU Protection • [Figure: s1 sends ADU d to s2, and s2 returns a receipt [sqn, session_Id, hash(d)]sign_s2.] • "Receipt" packet • ADU dropping attack • s2 may claim it did not receive d • s1 may claim it sent d when it did not
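
The receipt exchange on this slide can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: `make_receipt` and `check_receipt` are hypothetical names, and an HMAC over a key standing in for s2's credential replaces the PKI signature sign_s2 so that the example needs only the Python standard library.

```python
import hashlib
import hmac
import json


def make_receipt(sqn: int, session_id: str, d: bytes, s2_key: bytes) -> dict:
    """s2 acknowledges ADU d with [sqn, session_Id, hash(d)] plus a tag.

    ASSUMPTION: the HMAC tag is a stand-in for the signature sign_s2;
    a real deployment would use s2's private key under the PKI.
    """
    body = {
        "sqn": sqn,
        "session_id": session_id,
        "hash": hashlib.sha256(d).hexdigest(),
    }
    encoded = json.dumps(body, sort_keys=True).encode()
    tag = hmac.new(s2_key, encoded, hashlib.sha256).hexdigest()
    return {"body": body, "sign_s2": tag}


def check_receipt(receipt: dict, sqn: int, session_id: str,
                  d: bytes, s2_key: bytes) -> bool:
    """Check that the receipt covers exactly this ADU in this session."""
    expected = make_receipt(sqn, session_id, d, s2_key)
    return (receipt["body"] == expected["body"]
            and hmac.compare_digest(receipt["sign_s2"], expected["sign_s2"]))
```

With a valid receipt in hand, s1 can show that s2 received d; without one, s1 cannot credibly claim that it sent d.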

  10. Provenance-based ADU Protection • [Figure: d enters f1, f1(d) enters f2, and f2(f1(d)) leaves f2. s1 attaches the evidence [[h(d), h(f1(d))]sign_s1]key_c for its input and output; s2 forwards that evidence and adds [[h(f1(d)), h(f2(f1(d)))]sign_s2]key_c for its own input and output.] • Provenance evidence • Cached or carry-on evidence • Consistency verification between different components
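
The consistency check between components can be sketched as below. This is an illustrative simplification under stated assumptions: the names `make_evidence` and `first_inconsistency` are hypothetical, SHA-256 stands in for h, and the signing (sign_s1, sign_s2) and encryption under key_c shown on the slide are omitted so that only the hash-chain comparison remains.

```python
import hashlib
from typing import List, Tuple

# One piece of provenance evidence: (h(input), h(output)) for one component.
Evidence = Tuple[str, str]


def h(data: bytes) -> str:
    """Hash function h; SHA-256 is an assumption, the slide does not name one."""
    return hashlib.sha256(data).hexdigest()


def make_evidence(inp: bytes, out: bytes) -> Evidence:
    """Evidence a component produces for one processed ADU."""
    return (h(inp), h(out))


def first_inconsistency(chain: List[Evidence]) -> int:
    """Return the index of the first hop whose claimed input hash differs
    from the previous hop's claimed output hash, or -1 if all hops agree."""
    for i in range(1, len(chain)):
        if chain[i][0] != chain[i - 1][1]:
            return i
    return -1
```

A mismatch at hop i means the data was altered between hop i-1 and hop i, or that one of the two components reported a false hash; the check alone does not say which.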

  11. Dataflow Topology Protection • [Figure: composer C and components s1 (f1), s2 (f2), s3 (f3) form the chain C → s1 → s2 → s3 → C. The topology entries [s1], [s2], [s3], [C] are each signed by the composer (sig_c) and wrapped in encryption layers under key_s1, key_s2, and key_s3.] • Cascading topology encryption • No component can change the dataflow topology • Each component knows only its previous hop and next hop

  12. Dataflow Topology Protection • [Figure: the same chain C → s1 (f1) → s2 (f2) → s3 (f3) → C, showing the topology information each hop holds. At s1: [s1]sig_c [s2]sig_c [s3]sig_c [C]sig_c, with inner entries still under key_s2 and key_s3. At s2: [s2]sig_c [s3]sig_c [C]sig_c, with inner entries still under key_s3. At s3: [s3]sig_c [C]sig_c.] • Cascading topology encryption • No component can change the dataflow topology • Each component knows only its previous hop and next hop • Onion routing [Goldschlag, et al., 1999]
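
The layered construction can be sketched as follows. This is a toy illustration under several assumptions, not the scheme's actual format: the layer layout (a `record` with `hop` and `next` fields) is hypothetical, `toy_crypt` is an insecure SHA-256 keystream XOR standing in for encryption under key_s1, key_s2, key_s3, and an HMAC stands in for the composer's signature sig_c.

```python
import hashlib
import hmac
import json
from typing import Dict, List, Tuple


def _keystream(key: bytes, n: int) -> bytes:
    """Derive n pseudo-random bytes from key (toy construction, NOT secure)."""
    out = b""
    counter = 0
    while len(out) < n:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:n]


def toy_crypt(key: bytes, data: bytes) -> bytes:
    """XOR with the keystream; the same call encrypts and decrypts."""
    return bytes(a ^ b for a, b in zip(data, _keystream(key, len(data))))


def sign_c(composer_key: bytes, record: dict) -> str:
    """ASSUMPTION: HMAC stand-in for the composer's signature sig_c."""
    body = json.dumps(record, sort_keys=True).encode()
    return hmac.new(composer_key, body, hashlib.sha256).hexdigest()


def build_onion(route: List[str], final: str,
                keys: Dict[str, bytes], composer_key: bytes) -> bytes:
    """Wrap the route from the last hop outward, one layer per component."""
    blob = b""
    next_hop = final
    for hop in reversed(route):
        record = {"hop": hop, "next": next_hop}
        layer = {"record": record,
                 "sig_c": sign_c(composer_key, record),
                 "inner": blob.hex()}
        blob = toy_crypt(keys[hop], json.dumps(layer).encode())
        next_hop = hop
    return blob


def peel(blob: bytes, key: bytes) -> Tuple[dict, str, bytes]:
    """Remove one layer: returns this hop's record, its tag, and the
    still-encrypted remainder to forward to the next hop."""
    layer = json.loads(toy_crypt(key, blob))
    return layer["record"], layer["sig_c"], bytes.fromhex(layer["inner"])
```

Each component peels only its own layer, learns its next hop from the record, and forwards the remainder, which it cannot read. In this sketch the previous hop is simply whoever delivered the message.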

  13. Function Integrity Attestation • [Figure: composer C sends ADUs d1, d2, d3 through f1 (s1 produces f1(d1), f1(d3); s2 produces f1(d2)) and then f2 (s5 produces f2(f1(d1)), f2(f1(d3)); s6 produces f2(f1(d2))). Duplicates d2' and d3' are sent through functionally equivalent components (s3 and s7 for d3', s4 and s8 for d2'). C then checks f2(f1(d2)) == f2(f1(d2')) and f2(f1(d3)) == f2(f1(d3')).] • Randomized data attestation • Achieve scalable function integrity attack detection • Duplicate a random subset of ADUs • Send duplicates to selected functionally equivalent components • Check result consistency • Continuously perform randomized data attestation
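
The attestation loop can be sketched as below. This is a simplified, single-stage illustration under stated assumptions: the function name `attest`, the `dup_prob` parameter, and the integer-valued components are hypothetical, and each component is modeled as a plain function rather than a remote service.

```python
import random
from typing import Callable, List, Sequence

# ASSUMPTION: a component is modeled as a pure function on integer ADUs.
Component = Callable[[int], int]


def attest(adus: Sequence[int], primary: Component,
           replicas: Sequence[Component], dup_prob: float,
           rng: random.Random) -> List[int]:
    """Return indices of ADUs whose duplicated results disagree.

    Each ADU is processed by the primary component. With probability
    dup_prob it is also duplicated to every functionally equivalent
    replica, and the results are compared for consistency.
    """
    flagged = []
    for i, d in enumerate(adus):
        result = primary(d)
        if rng.random() < dup_prob:
            if any(replica(d) != result for replica in replicas):
                flagged.append(i)
    return flagged
```

A disagreement shows that at least one of the compared components misbehaved on that ADU; it does not by itself identify which one. Raising `dup_prob` trades extra processing for a better chance of catching a misbehaving component.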

  14. Implementation and Experimental Setup • Implementation • Implemented a prototype of the secure dataflow processing system • Follows the design of IBM System S • Experiment setup • Conducted experiments on PlanetLab • Used about 200 hosts • One host represents one component provider • Composer deployed on a pre-defined PlanetLab host

  15. Evaluation • Overhead caused by basic protection schemes • Randomized data attestation • Overhead • in terms of dataflow processing delay • (time of d_n getting out - time of d_1 getting in) / n • Detection probability • non-collusion • collusion
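
The delay metric on this slide can be written as a short function. This is only a sketch: the name `avg_processing_delay` and its parameters are hypothetical, and timestamps are assumed to be in seconds.

```python
def avg_processing_delay(t_first_in: float, t_last_out: float, n: int) -> float:
    """Average dataflow processing delay per ADU:
    (time of d_n getting out - time of d_1 getting in) / n.
    """
    if n <= 0:
        raise ValueError("n must be a positive number of ADUs")
    return (t_last_out - t_first_in) / n
```

For example, if d_1 enters at 2.0 s and d_n leaves at 12.0 s for n = 5 ADUs, the call `avg_processing_delay(2.0, 12.0, 5)` gives 2.0 seconds per ADU.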

  16. Overhead of Basic Protection Schemes • The overhead is about 10 to 15% for both secure dataflow schemes

  17. Overhead of Randomized Data Attestation • # of redundant components k = 5 • data size = 1 KB • data rate = 10 ADUs/sec • duration = 30 s • Average dataflow processing delay increases with the number of redundant components used • Due to sub-optimal dataflow topology

  18. Detection Probability • Detection probability increases with the duplication probability p_u and the number of redundant components used • Detection is harder in collusion scenarios than in non-collusion scenarios

  19. Related Work • Distributed dataflow processing • Focuses on resource and performance management issues • Assumes that data processing components are trustworthy • Trust management in distributed systems • Distributed messaging systems [Haeberlen, et al., SOSP 2007] • Pub-sub overlay [Srivatsa, et al., CCS 2005] • None of them addresses secure and scalable dataflow processing in open distributed systems • Byzantine fault tolerance • In wide-area networks [Amir, et al., DSN 2006] • No trusted party

  20. Conclusion • Finished Work • The first attempt to address the integrity of dataflow processing application delivery on open distributed systems • Identified and classified major security attacks • Proposed a set of effective protection schemes • Future Work • Non-linear dataflow topologies • Integrity attestation for stateful functions • Further identification of malicious components

  21. Thank you • Questions?
