GeoLoc : Robust Resource Allocation Method for Query Optimization in Data Grid Systems

GeoLoc: Robust Resource Allocation Method for Query Optimization in Data Grid Systems. Igor EPIMAKHOV, Abdelkader HAMEURLAIN, Franck MORVAN. Baltic DB&IS'2012

Table of contents • Introduction • Existing methods classification • Contributions • Allocation Space • Allocation Algorithm • Performance Evaluation • Conclusion

Introduction Data Grid • Heterogeneity • Dynamicity • Large Scale

Introduction Query processing • Parsing • Query rewrite • Resource discovery • Resource allocation • Query execution

Introduction Problem Input: • Set of query operations (dependent) • Set of nodes • Distribution of Relations • Dynamic and Static characteristics of Data Grid Objectives: • Select optimal subset of nodes to allocate resources for query operations

Existing Methods Classification Control structure: Centralized Hierarchical Decentralized

Existing Methods Classification Algorithms: Heuristic Exact

Existing Methods Classification Strategies: • Static: Resource Allocation before Execution • Dynamic: Resource Allocation during Execution • Hybrid: Resource Allocation, then Execution with Dynamic Reallocation

Existing Methods Classification • Cooperation type: • Classic • Incentive-based • Economic / Reputation

Contributions • Allocation Space Restriction • Resource Allocation Algorithm supporting: • Parallelism: pipeline, intra-operation, inter-operation • Distributed and duplicated relations

Allocation Space Source nodes Nearest nodes

Allocation Algorithm Assumptions • Each relation is distributed in N equal parts • Hybrid Hash Join algorithm • Results are retransferred from the nodes • Memory is used to reduce I/O operations

Allocation Algorithm
Stage 1. Definition of Allocation Space
• Input:
  • All nodes with fragments of queried relations (1)
  • All nodes nearest to (1)
• Algorithm:
  • Selection of source nodes based on their performance
  • Placement of Scan operations
  • Generation of Allocation Space (source nodes + nearest nodes)
[Figure: CPU, NET, I/O, Overall Node Bandwidth]
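A minimal Python sketch of Stage 1 may help make the steps concrete. It is an illustration under stated assumptions, not the authors' implementation: the `Node` class, the `define_allocation_space` function, the rule that a node's overall bandwidth is the minimum of its CPU, NET and I/O rates, and all numeric values are hypothetical. The fragment placement mirrors the example slides (R1 on n1 and n2, and so on).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    name: str
    cpu: float  # hypothetical processing rate, lines/sec
    net: float  # hypothetical network rate, lines/sec
    io: float   # hypothetical disk rate, lines/sec

    @property
    def bandwidth(self) -> float:
        # Assumption: the slowest resource bounds the overall node bandwidth.
        return min(self.cpu, self.net, self.io)


def define_allocation_space(replicas, nodes, nearest):
    """Pick one source node per fragment, then add the nearest nodes.

    replicas: fragment name -> names of nodes holding a copy
    nodes:    node name -> Node
    nearest:  node name -> names of nearby nodes (assumed to be known)
    """
    # Select the best-performing holder of each fragment and place its Scan there.
    scans = {
        fragment: max(holders, key=lambda n: nodes[n].bandwidth)
        for fragment, holders in replicas.items()
    }
    # Allocation space = source nodes + their nearest nodes.
    space = set(scans.values())
    for source in scans.values():
        space.update(nearest.get(source, []))
    return scans, space


# Hypothetical I/O rates; CPU and NET are kept at 2000 for simplicity.
io_rates = {"n1": 2000, "n2": 1500, "n3": 1200, "n4": 2000, "n5": 900,
            "n6": 2000, "n7": 2000, "n8": 1000, "n10": 1790, "n11": 1300}
nodes = {name: Node(name, 2000, 2000, io) for name, io in io_rates.items()}
replicas = {"R1": ["n1", "n2"], "R2": ["n3", "n4"],
            "S1": ["n5", "n6"], "S2": ["n7", "n8"]}
nearest = {"n1": ["n10", "n11"]}  # hypothetical neighbourhood

scans, space = define_allocation_space(replicas, nodes, nearest)
print(scans)
print(sorted(space))
```

Restricting the search to source nodes and their neighbours keeps the candidate set small, which is the point of the allocation space restriction named in the contributions.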

Allocation Algorithm
Stage 2. Generation of execution plan
• Input:
  • Logical query plan
  • Generated Allocation Space (AS)
• Idea:
  • Parity in bandwidth between Scan and Join operations
• Algorithm:
  BEGIN
    FOR each join DO
      Compute the time to read and transfer the source relations, Tscan_exec
      DO
        Choose the most efficient node Neff from AS for placing the join operation
        Add Neff to the join allocation plan, Pjoin
        Estimate the execution time of the join, Tjoin_exec
      WHILE (Tjoin_exec > Tscan_exec)
      Add Pjoin to the query allocation plan, Pquery
    ENDFOR
  END
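The Stage 2 loop can be sketched in Python as follows. This is a simplified illustration, not the authors' cost model: it assumes that execution time is the number of input lines divided by the summed bandwidth of the participating nodes, that a node hosts at most one join, and that the loop also stops when the allocation space runs out. The function names, the dictionary keys and every number are hypothetical.

```python
def scan_time(lines, scan_nodes, bandwidth):
    # Assumption: scans run in parallel, so their bandwidths add up.
    return lines / sum(bandwidth[n] for n in scan_nodes)


def generate_execution_plan(joins, allocation_space, bandwidth):
    """Return Pquery: join name -> list of nodes allocated to that join."""
    # Candidate nodes, most efficient first (efficiency = bandwidth here).
    free = sorted(allocation_space, key=lambda n: bandwidth[n], reverse=True)
    p_query = {}
    for join in joins:
        t_scan_exec = scan_time(join["lines"], join["scan_nodes"], bandwidth)
        p_join = []
        # Emulates DO ... WHILE (Tjoin_exec > Tscan_exec).
        while free:  # extra guard: stop if the allocation space is exhausted
            n_eff = free.pop(0)
            p_join.append(n_eff)
            t_join_exec = join["lines"] / sum(bandwidth[n] for n in p_join)
            if t_join_exec <= t_scan_exec:
                break  # join bandwidth has reached parity with the scans
        p_query[join["name"]] = p_join
    return p_query


# Hypothetical bandwidths in lines/sec.
bandwidth = {"s1": 1000, "s2": 1000, "a": 900, "b": 800, "c": 700, "d": 100}
joins = [{"name": "J1", "lines": 10_000, "scan_nodes": ["s1", "s2"]}]

plan = generate_execution_plan(joins, ["a", "b", "c", "d"], bandwidth)
print(plan)
```

With these numbers the scans deliver 2000 lines/sec in total, so nodes are added to the join until their combined bandwidth reaches that level: a and b give 1700, and adding c gives 2400, which ends the loop.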

Allocation Algorithm Example
Query: R ⋈ S, where R = R1 ∪ R2 and S = S1 ∪ S2
Fragment placement: • R1: n1, n2 • R2: n3, n4 • S1: n5, n6 • S2: n7, n8
[Figure: nodes n1 to n8]


Allocation Algorithm Example
Query: R ⋈ S, where R = R1 ∪ R2 and S = S1 ∪ S2
Allocation space: n1, n4, n6, n7, n10, n11, n12, n13, n14, n15, n16, n17, n18, n19, n20, n21, n22, n23, n24, n25, n26
[Figure: nodes n1 to n8 holding the fragments, surrounded by nodes n10 to n26]

Allocation Algorithm Example
Allocation space: n1, n4, n6, n7, n10, n11, n12, n13, n14, n15, n16, n17, n18, n19, n20, n21, n22, n23, n24, n25, n26
Resulting Execution Plan:
• Scans: n1, n4, n7, n6
• Joins: n18, n25, n10, n26, n13, n12, n19
[Figure: Source Nodes n1, n4, n7, n6; Nodes' Bandwidth: 2000 lines/sec. Nodes allocated for Join n25, n26, n18, n19, n12, n13, n10; Nodes' Bandwidth: 1790 lines/sec, 2000 lines/sec, 1300 lines/sec, 1500 lines/sec, 1650 lines/sec, 1920 lines/sec, 900 lines/sec]

Performance Evaluation Experimental conditions • Data Grid simulator • 6000 heterogeneous nodes • Simple, Average and Complex queries • Distributed and duplicated relations Comparison • Method GeoLoc • Method Gounaris2004

Performance Evaluation Optimization Time

Performance Evaluation Response Time

Conclusion Proposed method is: • Efficient • Scalable • Adapted to heterogeneous decentralized Data Grid Perspective: • Adaptation to the Dynamicity of Data Grid

Thank you for your attention!
