StrangerDB -- Safe Data Management with Untrusted Servers. Dennis Shasha (shasha@cs.nyu.edu)
Goals • Store private data in a public database: backup, concurrency control, and some query processing • Protect data from being observed (privacy) • Make unauthorized modifications evident (safety) • Force server to deliver a consistent picture to all honest users or be discovered (consistency). • Dishonest users have the same effect as users who enter bad data.
Methods • Encryption per user/group for privacy. • Signatures for tamper-evidence. • SUNDR-style [1] maintenance to detect inconsistent transaction orders. [1] "Building secure file systems out of Byzantine storage", David Mazieres and Dennis Shasha, Principles of Distributed Computing, 2002, pp. 108-117.
Database Setup for Privacy • A record is a sequence of cleartext field values plus an encrypted part that may encompass one or more fields. • The encryption is known to one or more users. • Decryption is done at the most public processor possible: user’s workstation or smartcard if private to a user, workstation of group member if decryption pertains to group-owned information. • Encryption can be private key encryption.
Privacy Related Optimization Problems • There are classical optimization problems to be solved in this framework. • Example: if some data is private to a user and other data belongs to a group, do I do the group processing first and then bring the result to the private workstation or do I bring all the data to the private workstation right away?
Work Related to Privacy Considerations • Hakan Hacigumus, Bala Iyer, Chen Li and Sharad Mehrotra. “Executing SQL over Encrypted Data in the Database Server Provider Model.” ACM SIGMOD 2002. Advocates a field-by-field encryption idea; they map queries to encrypted values. Sometimes the encryption preserves order and sometimes not. • Matthias Fischmann and Oliver Günther. “Privacy Tradeoffs in Database Service Architectures.” BIZSEC'03. Points to security leaks in this model if the adversary can ask queries. Even if not, encrypted fields yield information about the number of distinct values and their distribution.
Work Related to Privacy Considerations • “Hippocratic Databases,” Rakesh Agrawal, Jerry Kiernan, Ramakrishnan Srikant, Yirong Xu, VLDB 2002. Argues that databases should provide mechanisms to preserve privacy, including properties like consent of the information donor, limited use, and limited retention. • Encryption gives consent of the information donor only. Once I give you my key, I have little further control. Changing the key does prevent the recipient from learning new data.
Our Take on Privacy Considerations • We are agnostic: you encrypt what you want and issue queries and updates to achieve your privacy goals. • For purposes of this talk: a database designer knows exactly which information is revealed to non-owners: everything that is in the clear. • Non-owners may not issue queries to or modify your data.
Tamper Evidence (safety) • Every data item can be modified by exactly one user or group. A modifier signs a collision-resistant hash of the encrypted result of the data after modification. • Note: Need trusted public repository of public signature keys (e.g. provided by company security officer)
Collision-Resistant Hash Setup: Inspired by Merkle Trees [Figure: tree with signed root sgn_user(HASH(root), ptr); interior nodes (HASH1, ptrs), (HASH2, ptrs); leaves DATA RECORD 1, DATA RECORD 2, ..., DATA RECORD n.]
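A minimal sketch of the Merkle-style idea, under assumptions the slides do not state: SHA-256 as the collision-resistant hash, a binary tree, and the hypothetical helper names `h` and `merkle_root`. The user would sign only the root; signing itself is omitted here.

```python
import hashlib

def h(data: bytes) -> bytes:
    """Collision-resistant hash (SHA-256 is an assumption, not named in the slides)."""
    return hashlib.sha256(data).digest()

def merkle_root(records: list[bytes]) -> bytes:
    """Hash every record, then hash pairs of child hashes upward to one root."""
    if not records:
        return h(b"")
    level = [h(r) for r in records]
    while len(level) > 1:
        nxt = [h(level[i] + level[i + 1]) for i in range(0, len(level) - 1, 2)]
        if len(level) % 2 == 1:
            nxt.append(level[-1])  # an unpaired hash is carried up unchanged
        level = nxt
    return level[0]

records = [b"record 1", b"record 2", b"record 3"]
root = merkle_root(records)
# Any change to any record changes the root the user signed.
tampered = merkle_root([b"record 1", b"record X", b"record 3"])
```

Because only the root is signed, a reader can verify one record by rehashing the path from that record to the root instead of rehashing the whole database.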
Malicious servers and forks • If a user u accesses a database at time t, the user wants to be sure that the database is current as of time t. • A malicious server might give u a database state reflecting only some previous updates. • Users cannot prevent such “forking attacks” but would like to discover them quickly.
Underlying Strategy to ensure inter-user consistency • Periodically, every pair of users exchange their ideas of global history. If one member of the pair has missed an update done by the other, the histories won’t be consistent. • For this to work in practice, we need some encoding of that global history.
Underlying Strategy: intuition Bob: “Alice do you agree that Mary said X?” Alice: “No way. I never heard that!” Bob: “But here is Mary’s signed statement to that effect.” Alice: “Well I guess the server is messing with history then.”
Strawman Implementation: Log of Global Operations • Imagine that we have a sequential log consisting of every transaction that ever hit the database and that this sequential log is signed by the last transaction. • Ex: order of transactions is T1 T2 T3 (done by users u1 u2 u3). Log after T3: • sgn_u3(T3 sgn_u2(T2 sgn_u1(T1)))
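The nested signature sgn_u3(T3 sgn_u2(T2 sgn_u1(T1))) can be sketched as a chain in which each entry's signature covers the transaction plus the previous signature. This sketch is an assumption-laden stand-in: it uses HMAC from the Python standard library in place of real public-key signatures (HMAC is symmetric, so it does not give the non-forgeability a real deployment needs), and the keys and function names are hypothetical.

```python
import hashlib
import hmac

# Hypothetical per-user secrets; a real system would use public-key signatures.
KEYS = {"u1": b"key-u1", "u2": b"key-u2", "u3": b"key-u3"}

def sgn(user: str, payload: bytes) -> bytes:
    """Stand-in for sgn_user(...), implemented with HMAC-SHA256."""
    return hmac.new(KEYS[user], payload, hashlib.sha256).digest()

def append_txn(log, user, txn):
    """Sign the new transaction together with the previous entry's signature."""
    prev_sig = log[-1][2] if log else b""
    return log + [(user, txn, sgn(user, txn.encode() + prev_sig))]

def verify(log) -> bool:
    """Recompute every link of the chain from the first entry onward."""
    prev_sig = b""
    for user, txn, sig in log:
        if not hmac.compare_digest(sig, sgn(user, txn.encode() + prev_sig)):
            return False
        prev_sig = sig
    return True

log = []
for user, txn in [("u1", "T1"), ("u2", "T2"), ("u3", "T3")]:
    log = append_txn(log, user, txn)

# Altering a middle transaction invalidates the signature that covers it.
forged = [log[0], ("u2", "T2-altered", log[1][2]), log[2]]
```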
Ensuring Individual Consistency • Log is held by the untrusted server. • Every time a user appends to and signs the log, the user first checks that the log he/she previously signed is a prefix of log to be signed. • Ex: if u is about to commit transaction T’ and has previously committed T, then u makes sure that the log now contains as a prefix the log from the time T was committed.
Individual Log Consistency Bob: “In my previous update, the log had (in left-to-right order): Talice1, Tbob1. Now it has Talice1, Tbob1, Tmary1. So my previous view of the log was a prefix of the current one. OK, so I’ll append my new transaction: Talice1, Tbob1, Tmary1, Tbob2.” Alice hears nothing of all this.
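Bob's check can be sketched as a prefix test before appending. The function names are hypothetical, and signature checking is left out to keep the sketch short.

```python
def is_prefix(earlier, later):
    """True if `earlier` equals the first len(earlier) entries of `later`."""
    return later[:len(earlier)] == earlier

def append_if_consistent(last_signed, current, txn):
    """Append txn only if the log the user last signed is a prefix of the current log."""
    if not is_prefix(last_signed, current):
        raise ValueError("server log does not extend the log I last signed")
    return current + [txn]

bob_last_signed = ["Talice1", "Tbob1"]
server_log = ["Talice1", "Tbob1", "Tmary1"]
new_log = append_if_consistent(bob_last_signed, server_log, "Tbob2")
```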
Ensuring Global Consistency by detecting forking attacks • Periodically, users exchange their ideas of the global log. Each user verifies: • The signatures of all the users • Whether one global history is a prefix of the other or not. • If there has been a forking attack and u1 has not seen a transaction of u2, then neither user’s log will be a prefix of the other.
Global Log Consistency Bob: “Alice, here is the log of all transactions as I see it: Talice1, Tbob1, Tmary1, Tbob2.” Alice: “That’s funny. Here is my log: Talice1, Tbob1, Tjill1. It is not a prefix of yours because I have Jill’s transaction, but yours is not a prefix of mine because you have Mary’s transaction.” Bob: “Are all the signatures good?” Alice: “Absolutely. See for yourself. Server is being naughty again.”
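The exchange between Bob and Alice can be sketched as a pairwise comparison, assuming the signatures have already been verified. The helper names `compare_logs` and `first_divergence` are hypothetical.

```python
def compare_logs(a, b):
    """Return 'consistent' if one log is a prefix of the other, else 'fork'."""
    shorter, longer = (a, b) if len(a) <= len(b) else (b, a)
    return "consistent" if longer[:len(shorter)] == shorter else "fork"

def first_divergence(a, b):
    """Index of the first position where the two logs disagree, or None."""
    for i, (x, y) in enumerate(zip(a, b)):
        if x != y:
            return i
    return None

bob = ["Talice1", "Tbob1", "Tmary1", "Tbob2"]
alice = ["Talice1", "Tbob1", "Tjill1"]
verdict = compare_logs(bob, alice)
where = first_divergence(bob, alice)
```

With valid signatures on both logs, a "fork" verdict cannot be blamed on either user, which is why it points at the server.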
Semantic Objection • Server might fail to update the data but assert that the transactions are executed in the same global order. • Fix: associate with each transaction a collision-resistant hash of the state of the whole database. Call these h1, h2, h3 • sgn_u3(T3 h3 sgn_u2(T2 h2 sgn_u1(T1 h1))) • Transactions verify global hash upon data access.
Hashes of all the data Bob: “Alice, the log says you were the last to execute, yet when I perform a collision-resistant hash of the database, the result is not consistent with that hash.” Alice: “It’s lucky I signed the hash that I placed in the log. That shows the state of the database I think is present.” Bob: “Darn server has been changing the data again!”
Practical objection: space grows without bound • In this log-based (strawman) implementation, each user keeps a log of all transactions ever done! • The alternative is to have each user update his/her version for every access (even read-only access). • A “version structure” is basically a set of user-version pairs plus a hash of the data of the signer. • Space per version structure is proportional to the number of users N. Because each user needs to keep the latest version structure for each user, the total space per user is proportional to N^2.
Version Structure Detail • Suppose user u creates the last version structure. Then u increments his/her version number (and no other) and signs the structure, which contains: sgn_u(hash of data owned by u, (u1, n1), (u2, n2), …) where (ui, ni) means: ui is at version ni. • From now on, call hash of the data owned by u “hash(udata)”.
Basic Properties of version structures • Because of the signature, the server cannot forge a version structure. • Because of the collision-resistant hash, each user’s data can be verified to be what that user intended. • Each user maintains a “version structure list” of the most recent version structure from each user.
Use of Version Structure List Bob: “Alice, according to the version structure list you were the last to execute, yet when I perform a collision-resistant hash of the database, the result is not consistent with that hash.” Alice: “It’s lucky I signed the hash that I placed in the version structure list. That shows the state of my data I think is present.” Bob: “Darn server has been changing the data again!”
Three incrementally related version structures Bob: sgn_Bob(hash(Bobdata), (Bob, 6), (Alice, 12), (Bill, 4)) Alice: sgn_Alice(hash(Alicedata), (Bob,6), (Alice,13), (Bill,4)) Bob: sgn_Bob(hash(Bobdata), (Bob,7), (Alice, 13), (Bill,4))
Ordering Properties of version structures • Define a partial order on version structures: vs1 < vs2 if the users in vs1 are a subset of the users in vs2 and for every user u in vs1, the version of u in vs1 (denoted vs1[u]) is less than or equal to vs2[u] and for at least one user v, vs1[v] < vs2[v]. • We say vs1 is “incrementally less than” vs2 if there exists a u such that u signs vs2, vs2[u] = vs1[u] + 1 and for all v, if v != u then vs2[v] = vs1[v].
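The two relations can be sketched directly from their definitions. Assumptions in this sketch: a version structure is represented as a plain dict from user name to version number (hash and signature omitted), the signer is passed in explicitly, and "incrementally less than" is taken to require the same set of users in both structures.

```python
def less_than(vs1, vs2):
    """Partial order vs1 < vs2 on version structures (dicts user -> version)."""
    if not set(vs1) <= set(vs2):
        return False
    if any(vs1[u] > vs2[u] for u in vs1):
        return False
    return any(vs1[u] < vs2[u] for u in vs1)

def incrementally_less_than(vs1, vs2, signer):
    """vs2, signed by `signer`, raises only the signer's version, by exactly 1."""
    if set(vs1) != set(vs2) or signer not in vs2:
        return False
    return all(vs2[v] == vs1[v] + (1 if v == signer else 0) for v in vs2)

# The three incrementally related structures from the earlier slide.
bob6 = {"Bob": 6, "Alice": 12, "Bill": 4}
alice13 = {"Bob": 6, "Alice": 13, "Bill": 4}
bob7 = {"Bob": 7, "Alice": 13, "Bill": 4}
```

Here `bob6` is incrementally less than `alice13` (signed by Alice), which is incrementally less than `bob7` (signed by Bob); `bob6 < bob7` holds, but not incrementally, because two version numbers differ.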
Version Structure Construction • User u forms its new version structure vs_u as follows: u first examines the previous version structure that u signed, vs_u_old, and sets vs_u[u] = vs_u_old[u]+1. • Next u examines the last version structure vs_v signed by each other user v and sets vs_u[v] = vs_v[v]. • In this way u creates a version structure that reflects the last signed version of every user.
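The construction steps can be sketched as follows, again assuming a version structure is a dict from user to version number and leaving out the data hash and signature. The function name is hypothetical, and Bill's structure in the example is made up for illustration.

```python
def build_version_structure(u, vs_u_old, last_signed_by):
    """Build u's next version structure.

    vs_u_old: the previous structure signed by u.
    last_signed_by: maps each other user v to the last structure v signed.
    """
    vs_u = {u: vs_u_old[u] + 1}          # u bumps its own version only
    for v, vs_v in last_signed_by.items():
        if v != u:
            vs_u[v] = vs_v[v]            # take v's version from v's own structure
    return vs_u

bob_old = {"Bob": 6, "Alice": 12, "Bill": 4}
last_signed_by = {
    "Alice": {"Bob": 6, "Alice": 13, "Bill": 4},
    "Bill": {"Bob": 5, "Alice": 11, "Bill": 4},   # illustrative values
}
bob_new = build_version_structure("Bob", bob_old, last_signed_by)
```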
Signing Verification Protocol: Part I • When a user u is ready to sign the version structure vs_u constructed as above, u checks that 1. the highest version number for every user v is in the last version structure vs_v signed by v. (Other version structures may have the same highest version number for v as well, but they may not exceed vs_v[v].) 2. There is some ordering such that each version structure in the list is incrementally less than the next one on the list and vs_u is the greatest.
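One way to read the two checks literally is sketched below. Assumptions: the list is given as (signer, versions) pairs, versions are dicts from user to number, and the ordering in check 2 is found by sorting on the sum of version numbers (each incremental step raises that sum by exactly one, so any valid ordering must be sorted this way). All names are hypothetical.

```python
def incrementally_less_than(vs1, vs2, signer):
    """vs2, signed by `signer`, raises only the signer's version, by exactly 1."""
    if set(vs1) != set(vs2) or signer not in vs2:
        return False
    return all(vs2[v] == vs1[v] + (1 if v == signer else 0) for v in vs2)

def check_part_one(vs_list, u, vs_u):
    """vs_list: (signer, versions) pairs shown by the server; vs_u: u's candidate."""
    # Find, for each signer, the structure carrying its own highest version.
    last_by_signer = {}
    for signer, vs in vs_list:
        best = last_by_signer.get(signer)
        if best is None or vs[signer] > best[signer]:
            last_by_signer[signer] = vs
    # Check 1: nobody else reports a higher version for v than v's own last structure.
    for signer, vs_v in last_by_signer.items():
        if any(vs.get(signer, 0) > vs_v[signer] for _, vs in vs_list):
            return False
    # Check 2: the structures form an incremental chain ending in vs_u.
    chain = sorted(vs_list, key=lambda p: sum(p[1].values())) + [(u, vs_u)]
    return all(incrementally_less_than(a[1], b[1], b[0])
               for a, b in zip(chain, chain[1:]))

bob6 = {"Bob": 6, "Alice": 12, "Bill": 4}
alice13 = {"Bob": 6, "Alice": 13, "Bill": 4}
ok = check_part_one([("Bob", bob6), ("Alice", alice13)], "Bob",
                    {"Bob": 7, "Alice": 13, "Bill": 4})
stale = check_part_one([("Bob", bob6), ("Alice", alice13)], "Bob",
                       {"Bob": 7, "Alice": 12, "Bill": 4})
```

Note that if the server simply hides Alice's structure from Bob, the check still passes on what Bob is shown; that gap is what the global exchange is for.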
Signing Verification Protocol: Part II • The set of all data belonging to a user v is hashed to a value hash(vdata) as of the last signed version structure of v: vs_v. • When user u reads v’s data, it checks v’s data against hash(vdata) to verify that v’s data hasn’t been changed since the signing of vs_v. • If both the parts of the protocol succeed, then u signs the version structure vs_u and commits the transaction.
Signing Verification Protocol Bob: “Alice, here’s the drill. You issue a transaction. It accesses data from many people. You check that the data you have read from each person is consistent with the signed hash on his/her last version structure.” Alice: “How do I know it’s that person’s last version structure?” Bob: “Good question. You check that all the version structures are incrementally related to one another. You are checking that the server is consistent in what it tells you.” Alice: “That’s not enough is it?” Bob: “No, but forking will leave traces of guilt.”
Forking attacks on honest clients create incomparable version structures • If the server fails to show user v the version structure vs_u produced by user u, the version structure that v will sign, call it vs_v, will have the property vs_v[u] < vs_u[u]. Once v signs, vs_v[v] > vs_u[v]. • So vs_u and vs_v will be unordered by <. • The signing verification protocol will still succeed. So we need a global protocol.
Forking creates incomparable version structures Bob: sgn_Bob(hash(Bobdata), (Bob, 6), (Alice, 12), (Bill, 4)) Alice: sgn_Alice(hash(Alicedata), (Bob,6), (Alice,13), (Bill,4)) Server forks and doesn’t show Bob this. Bob: sgn_Bob(hash(Bobdata), (Bob,7), (Alice, 12), (Bill,4)) Now, Bob and Alice’s last version structures are incomparable, i.e. unordered by <.
Version Structure Exchange I • Users periodically perform a global version structure exchange protocol. Let us say that such a protocol begins at global time t. Every user u sends the most recent version structure that u signed before time t to every other user. Call that vs_u. • When a user v receives vs_u from user u, then user v performs a “well-formedness” test: v compares its most recent version structure signed before t, call it vs_v, with vs_u. They should be ordered by < and vs_v[v] >= vs_u[v] and vs_v[u] <= vs_u[u].
Version Structure Exchange II • If v performs a well-formedness test for every user in U and the version structures from those users are all ordered by <, then v declares those version structures to be all well-formed. • If every user v in some set of users U declares the version structures it receives from users U to be well-formed, then the global structure exchange is said to succeed for U.
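The well-formedness test and the success condition for a set of users can be sketched as below. Assumptions: version structures are dicts from user to version number, signatures are already verified, "ordered by <" means one structure is strictly less than the other, and all function names are hypothetical.

```python
def less_than(vs1, vs2):
    """Partial order vs1 < vs2 on version structures (dicts user -> version)."""
    if not set(vs1) <= set(vs2):
        return False
    if any(vs1[u] > vs2[u] for u in vs1):
        return False
    return any(vs1[u] < vs2[u] for u in vs1)

def well_formed(v, vs_v, u, vs_u):
    """Test run by v on the structure vs_u received from u."""
    ordered = less_than(vs_v, vs_u) or less_than(vs_u, vs_v)
    return ordered and vs_v[v] >= vs_u[v] and vs_v[u] <= vs_u[u]

def exchange_succeeds(latest):
    """latest maps each participating user to its last structure signed before t."""
    return all(well_formed(v, latest[v], u, latest[u])
               for v in latest for u in latest if u != v)

honest = {"Bob": {"Bob": 7, "Alice": 13, "Bill": 4},
          "Alice": {"Bob": 6, "Alice": 13, "Bill": 4}}
forked = {"Bob": {"Bob": 7, "Alice": 12, "Bill": 4},
          "Alice": {"Bob": 6, "Alice": 13, "Bill": 4}}
```

In the `forked` case Bob never saw Alice's version 13, so neither structure is less than the other and the exchange fails for the set {Bob, Alice}. Bill did not send anything in this example and is simply left out of the set.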
Global Version Exchange Protocol Bob: “Alice, from time to time, a global version exchange protocol begins. Let’s say an instance of the protocol starts at time t. Every user sends its latest version structure preceding t to all other users. Sending is done without mediation by server.” Alice: “Then what?” Bob: “Each user checks that the version structures are well-formed.” Alice: “What if some user does not send?” Bob: “No problem. Validate the ones that do send.”
Version Structures and Serializability • Serializability will be based on version structure order. • That is, transactions will serialize in the < order of version structures.
Role of concurrency control in correctness • Locking is merely a heuristic that the server uses to delay transactions and therefore to give a serial order to version structures. • If the server cheats and allows accesses that violate locks, then the version structures won’t be ordered by <. • Such cheating is later caught by the signing verification protocol or the global exchange.
The interesting case of multiversion read consistency • Effectively, a multiversion read consistent transaction should make its version structure reflect its start time. So, the user associated with such a transaction signs its version structure when it starts, then starts reading. • If that transaction never commits (because some data has changed and the transaction detects this by looking at a hash), there is no damage because the database won’t change and the application issuing the transaction will receive a failure as it should.
Proof Strategy • If all users are honest, but the server may not be, then the theorems are not that hard to prove. • If some users could be dishonest, then we could have major problems, e.g. they could corrupt the data. But this is like any data corruptor. • So, we quarantine them in our proofs: we concern ourselves only with honest users having no data dependency on dishonest ones. • We call those “virtuous users.”
Serializability Lemma • Lemma: If T1 → T2 (conflict edge from T1 to T2), vs1 is the version structure signed by user u1 for T1, vs2 is the version structure signed by user u2 for T2, and all version structures among some set of virtuous users U including u1 and u2 are ordered, then vs1 < vs2. • Proof: Suppose user u1 issues T1 and user u2 issues T2. For any conflict, there is some data item x such that op1(x) precedes op2(x).
Lemma continued • write-read: there is an x such that W1(x) precedes R2(x). Therefore R2(x) must occur after vs1 has been signed by u1, because u2 will verify hash (u1data), so vs1[u1] <= vs2[u1]. Moreover, the temporal ordering implies that vs1[u2] < vs2[u2]. Finally, because vs1 and vs2 are ordered, vs1 < vs2. • write-write: very similar to the write-read case. • read-write: there exists an x such that R1(x) precedes W2(x). Therefore vs1[u2] < vs2[u2]. Otherwise R1(x) would either read from W2(x) or from some later value. Because version structures are ordered, vs1 < vs2. Done.
Total Ordering Lemma • Lemma: Suppose the global version structure exchange begins at t and ends successfully at t’ for some set of virtuous users U. Assuming every user in U has been following the signing verification protocol up to time t’, then all version structures among U are ordered up to time t.
Total Ordering Lemma I • Prove the contrapositive: Consider version structures vs1 signed by user u1 and vs2 signed by user u2 before time t, where both u1 and u2 belong to U such that vs1[u1] > vs2[u1] and vs1[u2] < vs2[u2]. • That is, the version structures are incomparable. • Then either the signing verification protocol of some user or the global version structure exchange that begins at time t will be unsuccessful.
Total Ordering Lemma II • Consider the next version structure signed by u1, call it vs1’. At that moment u1 will know that there has been a fork if u1 sees vs2 during the signing verification protocol (because vs1’[u2] < vs2[u2]). So, assume the server will not show vs2 to u1 and hence vs1’ > vs2 will be false.
Total Ordering Lemma III • If the server’s forking has not yet been discovered, then vs1’ and vs2 are unordered, so the argument of the last slide holds for any subsequent version structure vs1’’ signed by u1. Symmetrically, no subsequent version structure signed by u2 such as vs2’ will have the property that vs2’ > vs1. • Therefore when the global version structure exchange occurs, u1 and u2 will discover a lack of well-formedness. Done.
Notes on Implementation • Server avoids being framed • Concurrency control • Version structure commits – how to make them efficient? • Supporting cryptographic assumptions and global verification protocol. • View maintenance • Read-write asymmetry. • Indexing
Server is framed Bob (good): sgn_Bob(hash(Bobdata), (Bob, 6), (Alice, 12), (Bill, 4)) Alice (good): sgn_Alice(hash(Alicedata), (Bob,6), (Alice,13), (Bill,4)) Server shows this to Bob. No fork. Bob (bad): sgn_Bob(hash(Bobdata), (Bob,7), (Alice, 12), (Bill,4)) Bob pretends server has forked. Upon global exchange, server is framed.
Server can avoid being framed • Server signs the version structures from users if it agrees they are legitimate. In the case of previous figure, server will refuse to sign Bob’s second version structure. • Bob and the server can present their evidence before security officer.
Server Proves Innocence Bob (good): sgn_server(sgn_Bob(hash(Bobdata), (Bob, 6), (Alice, 12), (Bill, 4))) Alice (good): sgn_server(sgn_Alice(hash(Alicedata), (Bob,6), (Alice,13), (Bill,4))) Server shows this to Bob. No fork. Bob (bad): sgn_Bob(hash(Bobdata), (Bob,7), (Alice, 12), (Bill,4)) Server refuses to sign. Shows that Bob is bad.
Concurrency Control • Accessing v’s data can be done by locking hash(vdata). • To increase concurrency, partition v’s rows into k parts, each with its own hash. The user u would then write k hashdata values in the version structure. • Transactions will lock the appropriate hash values.
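A minimal sketch of the partitioning idea, under assumptions the slides leave open: rows are a dict from key to value, a row is assigned to a part by hashing its key, and SHA-256 is the hash. The function names are hypothetical, and locking itself is not shown.

```python
import hashlib

def partition_of(row_key: str, k: int) -> int:
    """Assign a row to one of k parts by hashing its key (an assumed scheme)."""
    digest = hashlib.sha256(row_key.encode()).digest()
    return int.from_bytes(digest[:4], "big") % k

def partition_hashes(rows: dict, k: int) -> list:
    """Return k hashes, one per part, each covering that part's rows in key order."""
    parts = [hashlib.sha256() for _ in range(k)]
    for key in sorted(rows):
        parts[partition_of(key, k)].update(f"{key}={rows[key]};".encode())
    return [p.hexdigest() for p in parts]

k = 4
rows = {"r1": "a", "r2": "b", "r3": "c", "r4": "d", "r5": "e"}
before = partition_hashes(rows, k)
after = partition_hashes(dict(rows, r2="z"), k)
# Only the part that holds r2 gets a new hash, so only that hash needs locking.
changed = [i for i in range(k) if before[i] != after[i]]
```

A transaction that touches only `r2` would then lock one of the k hash values rather than the single hash(vdata), leaving the other parts available to concurrent transactions.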