350 likes | 510 Views
Providence, USA, 2009. Yin Yang , Dimitris Papadias, Stavros Papadopoulos H KUST, Hong Kong Panos Kalnis KAUST, Saudi Arabia. Authenticated Join Processing in Outsourced Databases. Database Outsourcing. Advantages
E N D
Providence, USA, 2009 Yin Yang, Dimitris Papadias, Stavros Papadopoulos HKUST, Hong Kong Panos Kalnis KAUST, Saudi Arabia Authenticated Join Processing in Outsourced Databases
Database Outsourcing • Advantages • The data owner does not need the hardware / software / personnel to run a DBMS • The service provider achieves economy of scale • The client enjoys better quality of service • A main challenge • The service provider is not trusted, and may return incorrect query results
Query Authentication • The owner signs its data with a digital signature scheme • Given a query, the service provider attaches a VO (Verification Object) to the results • The client verifies query results with the VO and the owner’s signature • soundness • completeness
Example Queries Range:σquantity>100Purchase Join: PurchasecidCustomer Range & Join :(σquantity>100Purchase) cid(σcity=“New York”Customer)
State of the Art • Range authentication: many solutions • Join authentication: few proposals • Materializing join results into views • AINL (presented in detail later) • Joins are inherently more complex than ranges • A join combines information from multiple tables • Only individual tables are signed
Previous Work • Multi-dimensional range authentication • Y. Yang, S. Papadopoulos, D. Papadias, G. Kollios (BU) • ICDE’08, VLDB J. • Continuous range authentication • S. Papadopoulos, Y. Yang, D. Papadias • VLDB’07, VLDB J. • Novel authentication framework • S. Papadopoulos, D. Saccharidis, D. Papadias • ICDE’09
Background • Concepts in Cryptography • Authenticated Data Structure (ADS) • Merkle Hash Tree • MB-Tree • AINL
Concepts in Cryptography • One-way, collision-resistant hash functions • h = H(m) • Computationally infeasible to infer m from h, or to find two m1, m2 with the same hash value h • Example: SHA1, SHA2, … • Public-key encryption • Two keys: private key sk, public key pk • Public key to encrypt, private key to decrypt • Example: RSA • Digital Signature • Hard to forge without the secret key • Signing: s = encrypt(H(m), sk) • Verifying: check if H(m) = decrypt(s, pk)
Merkle Hash Tree (Merkle, Crypto’89) • A binary tree with hash values satisfying hn = H(hn.lc | hn.rc) • Authenticates 1D range queries • Example: a query Q retrieves d4, d5 • VO(Q)= {sroot, h1-2, d3, d4, d5, d6, h7-8} • The client re-constructs hRoot bottom-up, and verifies the signature
Merkle B-Tree (Li et al. SIGMOD’06) • Merkle Hash Tree + B-Tree • Conceptually, a Merkle Hash Tree with a large fanout (>2)
AINL • For binary joins • Requires ADS on the join attribute of the inner relation • Reduces a join query into multiple ranges • Algorithm • For every tuple in the outer relation Perform an authenticated range on the inner relation
Example of AINL r2 r1 r1, hF, h10, s11, s12, hE r2, h1, s2, s3, s4, h5, h6, hC, hG …
Drawbacks of AINL • Large VO size • |R| records from R(outer relation) • 2|R|+|RS| records from |S|(inner relation) • Numerous hash values • Often larger than the combined size of R and S • High computation overhead at the server and the client
NAI: A Naïve solution • The server transmits all the data to the client • The client performs the join locally • NAI often outperforms AINL
Proposed Methods • Binary join authentication • AISM: requires ADS on one relation • AIM: requires ADSs on both relations • ASM: requires no ADS • Complex join query authentication • Multi-way join • Select-project-join
AISM: Query Processing • Sort the outer relation R on the join attribute • Transmit all tuples in R to the client in their verifiable order • Transmit the sort order R of R tuples on the join attribute • Incrementally traverse the ADS on S once with the R records
Example of AISM R[1]=2 R[2]=4 R[5]=3 R[6]=5 R[4]=1 R[3]=6 r4 r1 r3 r5 r2 r6 VO: signature of R, root signature of TS, r1-r6 in their verifiable order R[1], h1, s2, s3, s4; R[2], h5, h6, hC, s10, s11, s12; R[3]; R[4]; R[5], h13, h14, s15; R[6];
AISM: Verification • The client checks • R records • correctness of the sort order R of R • boundary records • whether the re-constructed root hash of TS matches its signature
AIM • Query processing • Require ADSs on both relations • Start with one relation R, traverse its ADS TR down to the first tuple r1 • Traverse TS until reaching the right boundary record sof r1 • Traverse TR until reaching the right boundary record rof s • Alternatively traverse TS and TR similarly to the above • Verification: similar to AISM
Example of AIM VO: root signature of TS, root signature of TR, r1 hs1, s2, s3, s4; r2; hs5, hs6, hC, s10, s11, s12; r3, r4; r5; hs13, hs14, s15; hr6;
ASM • Idea • Sort-Merge-Join, sort at the server, merge at the client • Query processing • Require no ADS • Transmit both R and S in their verifiable order • Sort R and S respectively on the join attribute • Transmit the sort orders of R and S to the client • Transmit bitmaps BR and BS to the client, indicating the tuples with join partners • Verification • correctness of the base relations / sort-orders / the bitmaps
Complex Query Authentication • Multi-way joins • Selection-Projection-Join queries
Multi-Way Join • Build a tree of binary join operators • m-ASM / m-AISM / m-AIM optimized for multi-way joins • Example: • A specialized algorithm AST applies when all relations are joined on the same attribute • One single VO
Example of m-AIM and m-AISM VO(RS):{root signature of TR and TS, s1, s2; hA, r4, r5, r6; s3; s4; s5; hC} VO(RST):{root signature of TT, [1], t1, t2; [2]; [3]; [4]; ht3}
Selection-Projection-Join Query • Selection • Use the m- algorithms for joins • Projection • Build a Merkle Hash Tree for each record • Query optimization
Experiments • Three synthetic relations • R(a1, a2) • S(a1, a2, b1, b2) • T(b1, b2) • Queries • R a1S • R a2S • (R a1S) b1T • (R a2S) b2T • Foreign keys • S.a1 references R.a1 • S.b1 references T.b1 • Parameters • Tuple size • Cardinality of |S|
Repeatability and Workability • We participated in the ACM SIGMOD 2009 Repeatability & Workability Evaluation (cf., http://homepages.cwi.nl/~manegold/SIGMOD-2009-RWE/). • The reviewers were able to • repeat all the experiments presented in our paper, • yielding results that match the ones published in our paper, • except from insignificant and to be expected variation due to randomness and/or hardware/software differences. • The detailed reports will shortly be made publicly available by ACM SIGMOD.
Conclusion • Binary join authentication • AISM: authenticated structure on one relation • AIM: authenticated structures on both relations • ASM: no authenticated structure • Complex query authentication • Multi-way join: eliminate unnecessary intermediate VO elements • Selection-projection-join query • Future Work • Authenticated Structures specialized to joins • Hash join instead of SMJ
Thank you! Questions?