
Responsive Security for Stored Data

This research focuses on securing stored data in distributed repositories, ensuring availability, integrity, and confidentiality even in compromised scenarios. The study introduces a hybrid approach combining pure replication and secret-sharing for improved security and performance. Various protocols and techniques are discussed in detail to achieve this goal.

Presentation Transcript


  1. Responsive Security for Stored Data • Subramanian Lakshmanan, Mustaque Ahamad, H. Venkateswaran • College of Computing, Georgia Institute of Technology

  2. Introduction • Problem definition • A distributed data repository. • Guarantees availability, integrity and confidentiality of stored data in the face of a limited number of compromised nodes. • Better performance. • Organization • Motivation • Existing techniques • Our approach • Related work • System architecture and protocols • A simple analysis

  3. Motivation • "At your service!" • "Get his medical records, fast! Could prove vital!" • "I don't want the data to be lost, ever!" • "Hope the records have not been tampered with." • "Hope no one is looking at my tax documents." • "Would anyone know about this?"

  4. Approach • "Having a copy always helps." • "I will pass on what I see to you, real fast!" • "So will I!" • "I will be faithful anyway." • "I'm not going to trust any of you guys. No one is going to get all the information." • "I don't want to talk to ALL of you, it takes a hell of a lot of time." • "Wish I could capture more nodes! I can't even wipe or corrupt the data, let alone leak info!"

  5. Pure replication [Diagram: client C stores the encrypted data E(d,k) on servers S1 to S4; after re-encryption each server holds E(d,k').] • Periodic re-encryption • Client has to be present for re-encryption • Compromised server could retain old data

  6. Secret-sharing algorithms • A (b,k) secret-sharing scheme fragments data into k shares so that any b shares give no information and any b+1 shares give all the information. • A (b,2b+1) scheme guarantees confidentiality, integrity and availability of data in the face of a maximum of b compromised nodes. • Data shares can be renewed and recovered periodically by the servers, even in the absence of the client, in a purely distributed fashion tolerating a maximum of b malicious nodes.
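
A minimal sketch of a (b,k) scheme in the style of Shamir's polynomial interpolation, which the related-work slide cites. The field modulus `PRIME` and the function names `split_secret` and `reconstruct` are assumptions made for illustration; the slides do not specify an implementation.

```python
import secrets

# Assumption: a fixed prime field large enough for small integer secrets.
PRIME = 2**127 - 1  # a Mersenne prime


def split_secret(secret: int, b: int, k: int) -> list[tuple[int, int]]:
    """Split `secret` into k shares; any b+1 reconstruct it, any b reveal nothing."""
    if not (0 <= secret < PRIME and 0 <= b < k):
        raise ValueError("need 0 <= secret < PRIME and 0 <= b < k")
    # Random polynomial of degree b whose constant term is the secret.
    coeffs = [secret] + [secrets.randbelow(PRIME) for _ in range(b)]
    shares = []
    for x in range(1, k + 1):
        y = 0
        for coeff in reversed(coeffs):  # Horner evaluation mod PRIME
            y = (y * x + coeff) % PRIME
        shares.append((x, y))
    return shares


def reconstruct(shares: list[tuple[int, int]]) -> int:
    """Lagrange interpolation at x = 0 over the given shares."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = (num * -xj) % PRIME
                den = (den * (xi - xj)) % PRIME
        secret = (secret + yi * num * pow(den, -1, PRIME)) % PRIME
    return secret
```

With b = 2 and k = 5, any three of the five shares passed to `reconstruct` return the original value.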

  7. A pure secret-sharing scheme [Diagram: client C splits data D into shares (f1, f2, .., f5) with a (1,5) scheme and stores one share on each of the servers S1 to S5.] 1. A write involves talking to all servers 2. Suffers from the problem of related attacks

  8. Our approach • Pure replication • Poor security tolerating malicious faults • Better performance (access cost, availability) • Pure secret-sharing • Better security tolerating malicious faults • Poor performance • Hybrid scheme • Do limited secret-sharing • Replicate the shares • Offer the benefits of both schemes

  9. Related work • Replication for Byzantine fault tolerance • Schneider's state machine approach for fault tolerance • Secure FS, Practical Byzantine fault tolerance at MIT – Castro and Liskov • Quorum systems • Phalanx and Fleet – Reiter et al. • Dynamic quorums – Alvisi et al.

  10. Related work (contd.) • Secret-sharing • Shamir's scheme based on polynomial interpolation • Detecting and recovering corrupted shares – Feldman, Pedersen • Proactive secret-sharing, periodic share renewal and share recovery – Herzberg et al. • PASIS at CMU. • Fragmentation-scattering for intrusion tolerance at LAAS, France. • Data dissemination • Epidemic algorithms for non-malicious environments – Demers et al. • Dissemination in Byzantine environments – Malkhi et al.

  11. Our system [Diagram: servers arranged in a matrix; data D is split into shares f1, f2, f3, one per column.] • Write along a row • Read along a row • Disseminate along a column • Periodic share renewal • Pure secret sharing: number of rows = 1 • Pure replication: number of columns = 1

  12. Assumptions • n servers S1..Sn. • Requests are authenticated and authorized independently at each server; communication channels are secure. • Compromises of two different servers are not related. • A chosen threshold value b, the number of server failures to be tolerated. • Number of columns c, number of rows r, with rc = n and c > b. • Protocols are designed for the chosen matrix dimensions and the chosen threshold value.
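
The matrix constraints above (rc = n, c > b) can be sketched as a small layout helper. The row-major numbering of servers and the name `grid_layout` are hypothetical; the slides only fix the dimensions, not how servers are assigned to cells.

```python
def grid_layout(n: int, c: int, b: int) -> list[list[int]]:
    """Return rows of server ids: r rows by c columns, with r*c = n and c > b."""
    if c <= b:
        raise ValueError("need c > b so that b+1 distinct shares exist")
    if n % c != 0:
        raise ValueError("need r*c = n for an integer number of rows r")
    r = n // c
    # Row k holds one server per column; column m holds r replicas of share m.
    # Assumption: servers are numbered 1..n in row-major order.
    return [[k * c + m + 1 for m in range(c)] for k in range(r)]
```

For n = 12 and c = 4 this gives three rows of four servers; setting c = n yields the single-row pure secret-sharing case and c = 1 the single-column pure replication case.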

  13. Read and write protocols
      Write(x,v) by Client C
      1. Split v into shares with a (b,c) scheme: $v \rightarrow v_1, v_2, \ldots, v_c$
      2. Compute one-way function $h(v_i) = g^{v_i}$
      3. Form verification string $VS = h(v_1)|h(v_2)|\ldots|h(v_c)$ and signature $sig = \{uid(x), ts, v\}_{K_C^{-1}}$
      4. Choose a row k. for (m = 1 to c) send {"write", uid(x), ts, vm, VS, sig} to server Sk,m
      5. Repeat 4 for different k until the number of rows contacted, l, is such that $c - \lfloor b/l \rfloor \geq b+1$.
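
A rough sketch of steps 2, 3 and 5 of the write protocol, reading the stopping condition as c - floor(b/l) >= b+1. The modulus `P`, the base `G`, and all function names are toy assumptions for illustration; a real deployment would pick a group in which discrete logarithms are hard, and the signature step is omitted here.

```python
# Assumed toy parameters: a prime modulus and a base g.
P = 2**127 - 1
G = 3


def verification_string(shares: list[int]) -> list[int]:
    """VS = h(v1)|h(v2)|..|h(vc) with h(v) = g^v mod P."""
    return [pow(G, v, P) for v in shares]


def share_is_valid(index: int, share: int, vs: list[int]) -> bool:
    """Check a received share against its slot in the verification string."""
    return pow(G, share, P) == vs[index]


def rows_to_contact(b: int, c: int) -> int:
    """Smallest number of rows l with c - floor(b / l) >= b + 1."""
    if c <= b:
        raise ValueError("need c > b")
    l = 1
    # Terminates because floor(b / l) is 0 once l > b, and c >= b + 1.
    while c - b // l < b + 1:
        l += 1
    return l
```

Under this reading, b faulty servers can block at most floor(b/l) columns when l rows are written, so the loop stops once at least b+1 columns are guaranteed to hold the new shares.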

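  14. Write protocol [Diagram: client C splits D into shares f1, f2, .., fc with a (b,c) scheme and sends each (fi, VS) along a row, repeating over l rows until $c - \lfloor b/l \rfloor \geq b+1$, where $VS = h(f_1)|h(f_2)|\ldots|h(f_c)$.]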

  15. Read and write protocols (contd.)
      Read(x) by Client C
      1. Choose a row k. for (m = 1 to c) send {"read", uid(x)} to server Sk,m
      2. Get a list of {ts, VS, vm, sig} from Sk,m
      3. Choose the highest timestamp that occurs in b+1 or more replies with the same VS.
      4. If no such timestamp exists, repeat from 1 for a different k.
      5. Pick the shares corresponding to this timestamp. Pick b+1 shares that are verified successfully by VS. Reconstruct data value v from the b+1 shares.
      6. Return v if sig is valid, else repeat from 1 for a different k.
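
Step 3 of the read protocol, selecting the highest timestamp backed by b+1 or more matching replies, might look like the following. The `Reply` record and the name `pick_candidate` are hypothetical, and the signature field is left out to keep the sketch short.

```python
from collections import Counter
from typing import NamedTuple, Optional


class Reply(NamedTuple):
    ts: int        # timestamp reported by the server
    vs: tuple      # verification string reported by the server
    share: int     # the server's share v_m
    column: int    # column index m of the replying server


def pick_candidate(replies: list[Reply], b: int) -> Optional[tuple]:
    """Return the (ts, vs) pair with the highest ts backed by b+1 or more replies."""
    counts = Counter((r.ts, r.vs) for r in replies)
    backed = [key for key, count in counts.items() if count >= b + 1]
    if not backed:
        return None  # caller retries with a different row
    return max(backed, key=lambda key: key[0])
```

Requiring b+1 agreeing replies means a single malicious server cannot force the client onto a fabricated high timestamp, since at least one of the agreeing servers must be non-faulty.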

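  16. Read protocol [Diagram: client C receives (f1', VS1), (f2', VS2), .., (fc', VSc) from the servers of a row.] 1. VSi1 = VSi2 = .. = VSib+1? 2. h(fi') = h(fi) in VS? 3. Reconstruct D from the b+1 shares fi1, fi2, .., fib+1 using the (b,*) scheme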

  17. Data dissemination • Disseminate shares along columns • Increases availability and system performance • Better data sharing for shared data • Better support for mobile or roaming client • Replicated copies serve as back-ups

  18. Dissemination protocol [Diagram: servers holding (f1, VS), (f2, VS), .., (fc, VS); a corrupted share f'2 is recovered as f2.] 1. Detect/suspect corruption 2. Pull verification string from b+1 servers 3. Check if share is valid using VS 4. Do share recovery if share is corrupted • Remarks: 1. VS is accepted as valid only if it is either heard directly from the client or b+1 other servers report the same VS. 2. Disseminate to other servers only those VS that are accepted as valid.
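
The acceptance rule in the first remark can be expressed as a short check. The function name `accepted_vs` and the shape of `reports` (server id mapped to the verification string it reported) are assumptions for illustration.

```python
from collections import Counter
from typing import Optional


def accepted_vs(
    from_client: Optional[bytes], reports: dict[str, bytes], b: int
) -> Optional[bytes]:
    """Accept a VS heard directly from the client, or one reported by b+1 servers."""
    if from_client is not None:
        return from_client
    # Keying reports by server id ensures each server is counted once.
    counts = Counter(reports.values())
    for vs, count in counts.most_common():
        if count >= b + 1:
            return vs
    return None  # not enough agreement; do not disseminate anything
```

A server would forward only the value this function returns, which matches the second remark that unaccepted verification strings are never disseminated.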

  19. Share renewal • Assumption: In any time frame of length Tv, an adversary can compromise a maximum of b nodes. • Question: What happens over a time interval of length 2Tv? • The adversary could compromise more than b nodes over a longer period of time. • Renew the shares at least once every Tv seconds. • Shares from before a renewal cannot be combined with shares from after it. • Done by the servers in the absence of the client, in a distributed fashion, secure against b compromised nodes.

  20. Share renewal (contd.) [Diagram: periodic share renewal; servers holding (f1, VS), (f2, VS), (f3, VS) end up holding (f1', VS'), (f2', VS'), (f3', VS').]
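
The slides do not say which renewal algorithm is used. One common construction from the proactive secret-sharing literature cited in the related work has each server add a random degree-b polynomial with zero constant term to everyone's shares. The sketch below simulates that in a single process, assuming polynomial shares over the field `PRIME`; all names are hypothetical, and verification of the update messages is left out.

```python
import secrets

PRIME = 2**127 - 1  # assumed field modulus for this sketch


def eval_poly(coeffs: list[int], x: int) -> int:
    """Evaluate a polynomial (lowest-degree coefficient first) at x mod PRIME."""
    y = 0
    for coeff in reversed(coeffs):
        y = (y * x + coeff) % PRIME
    return y


def reconstruct(shares: dict[int, int]) -> int:
    """Lagrange interpolation at x = 0 over shares given as {x: y}."""
    total = 0
    for xi, yi in shares.items():
        num, den = 1, 1
        for xj in shares:
            if xj != xi:
                num = (num * -xj) % PRIME
                den = (den * (xi - xj)) % PRIME
        total = (total + yi * num * pow(den, -1, PRIME)) % PRIME
    return total


def renew(shares: dict[int, int], b: int) -> dict[int, int]:
    """Each holder contributes a random degree-b polynomial with zero constant term."""
    new_shares = dict(shares)
    for _ in shares:  # one update polynomial per participating server
        update = [0] + [secrets.randbelow(PRIME) for _ in range(b)]
        for x in new_shares:
            new_shares[x] = (new_shares[x] + eval_poly(update, x)) % PRIME
    return new_shares
```

Because every update polynomial evaluates to zero at x = 0, the secret is unchanged, while a mix of old and new shares no longer interpolates to it. That is the property the slide describes when it says old shares make no sense with new ones.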

  21. Analysis • In any time frame of length Tv, a server can be compromised with probability p • Expected number of failures = np • Threshold value b and degree of replication r (or c) determine the level of security and performance offered by the system • Time taken to complete a read/write is much less than Tv

  22. Security Metrics • Availability • Probability that a legitimate client can read a data item that has been written successfully. • Confidentiality • Complement of the probability that an adversary can read a data item that has been written successfully. • Integrity • Complement of the probability that any client could be given corrupted or modified data content when a read on a data item is done.

  23. Security metrics (contd.)
      1. Availability: probability of finding at least b+1 non-faulty servers, each from a different column
         $\mathrm{Availability}(b,c) = \sum_{i=b+1}^{c} \binom{c}{i} (1-p^r)^i (p^r)^{c-i}$
      2. Confidentiality: 1 - probability of finding at least b+1 malicious servers, each from a different column
         $\mathrm{Confidentiality}(b,c) = 1 - \sum_{i=b+1}^{c} \binom{c}{i} (1-q^r)^i (q^r)^{c-i}$, $q = 1-p$
      3. Integrity: same as confidentiality, or depends on the strength of the underlying digital signature scheme
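
The two binomial sums above can be evaluated directly. This is a small sketch with hypothetical function names; p is the per-server compromise probability from the analysis slide and r is the number of rows (replicas per column).

```python
from math import comb


def at_least(b: int, c: int, col_prob: float) -> float:
    """P(at least b+1 of c independent columns succeed), each with probability col_prob."""
    return sum(
        comb(c, i) * col_prob**i * (1 - col_prob) ** (c - i)
        for i in range(b + 1, c + 1)
    )


def availability(b: int, c: int, r: int, p: float) -> float:
    # A column is usable if at least one of its r replicas is non-faulty.
    return at_least(b, c, 1 - p**r)


def confidentiality(b: int, c: int, r: int, p: float) -> float:
    # A column leaks its share if at least one of its r replicas is compromised.
    q = 1 - p
    return 1 - at_least(b, c, 1 - q**r)
```

Raising r with b and c fixed increases availability and lowers confidentiality, which is the trade-off the hybrid scheme is meant to tune.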

  24. Performance metrics • Read cost • Expected number of servers a client needs to contact to read a data item successfully. • Involves collecting b+1 distinct shares that are not corrupted. • $(2b+1)/p_r$, where $p_r$ is the probability of a read completing successfully after contacting 2b+1 servers. • Write cost • Number of servers a client needs to contact to write a data item at a confidence level h. • h = probability of success = probability that at least one server from each of b+1 or more columns receives the write.
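
A hedged sketch of both costs. The read cost follows the slide's expression directly. The write-cost model is an assumption not stated in the slides: it treats server failures as independent with probability p, so a column receives the write with probability 1 - p^l after l rows are contacted, and it counts c servers per row. The function names are hypothetical.

```python
from math import comb
from typing import Optional


def read_cost(b: int, p_read: float) -> float:
    """Expected servers contacted: (2b+1) / p_read, as given on the slide."""
    return (2 * b + 1) / p_read


def write_confidence(b: int, c: int, l: int, p: float) -> float:
    """Assumed model: P(at least b+1 of c columns have a non-faulty receiver)."""
    col = 1 - p**l  # some contacted server in the column is non-faulty
    return sum(
        comb(c, i) * col**i * (1 - col) ** (c - i) for i in range(b + 1, c + 1)
    )


def write_cost(b: int, c: int, r: int, p: float, target: float) -> Optional[int]:
    """Servers contacted (l rows of c servers) to reach confidence `target`."""
    for l in range(1, r + 1):
        if write_confidence(b, c, l, p) >= target:
            return l * c
    return None  # target not reachable with the available r rows
```

For example, with b = 1, c = 3, p = 0.1 and a target of 0.99, one row gives a confidence of about 0.972, so a second row is needed and the write touches six servers.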

  25. Availability and Confidentiality as functions of b for constant c [Plots: Availability; Confidentiality]

  26. Access costs as functions of b for constant c [Plots: Read cost; Write cost]

  27. Availability and Confidentiality as functions of c for constant b [Plots: Availability; Confidentiality]

  28. Access costs as functions of c for constant b [Plots: Read cost; Write cost]

  29. Availability and Confidentiality against threshold value, c = 2b+1

  30. Access costs against threshold value, c = 2b+1

  31. Remarks • When access cost or availability is the most important metric to be optimized and confidentiality is not an issue, set r = n, c = 1, b = 0 (pure replication) • When confidentiality is the most important metric to be optimized and low performance is accepted, set r = 1, c = n, b = (c-1)/2 (pure secret-sharing) • Requirements on both security and performance need a combination of replication and secret-sharing • Example: a target of $1 - 10^{-3.5}$ on the security metric and an access cost of 22 servers => b = 10, c = 21 • Higher confidentiality => higher access costs and lower availability • Related attacks • Place servers vulnerable to similar attacks in the same column

  32. Future work • Per object customizable security • Intrusion detection and correction • Dynamic inclusion and exclusion of servers • Implementation and experimental evaluation
