Improving Server Performance on Transaction Processing Workloads by Enhanced Data Placement
Juan Rubio, Lizy K. John (Laboratory for Computer Architecture, The University of Texas at Austin, USA)
Charles Lefurgy (IBM Austin Research Lab)
Presented by Sean Leather
Commercial Systems
• Computer systems running commercial workloads operate on large amounts of data
• Researchers have noticed that performance is hindered by data accesses
• System architecture trends point to a distributed storage model with a non-uniform access latency for disk and memory
Data Placement
• Goal:
  • Place data to reduce access penalties
• Challenges:
  • Placement is difficult with large amounts of data, a handful of processors, and multiple operations
  • The way the data is used changes over time
Data Placement: Example
[Figure: data placement example]
Approach
• Static data placement
  • Applied before the workload runs
  • Organizes blocks of data across the disks of the system to keep the number of remote accesses low
• Run-time data reorganization
  • Applied while the system runs the workload
  • Adapts the layout of blocks of data to the characteristics of the workload
Outline
• Problem
• Static data placement
• Run-time data reorganization
• Simulated Annealing (SA)
• Evaluation
• Summary
Static Data Placement
• Arranges blocks of data across disks of the system
• Approach:
  • Use probabilistic knowledge about the workload to formulate a cost function
  • Obtain layout that minimizes the cost function
  • Update disks to reflect the resulting layout
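The static approach can be illustrated with a toy example. The sketch below is not the authors' implementation: the function names, the `access_prob` table, and the two-node, two-block workload are all hypothetical, and the exhaustive search only stands in for the optimizer on an instance small enough to enumerate (the deck itself uses simulated annealing for this step).

```python
from itertools import product


def expected_remote_accesses(layout, access_prob):
    """Cost function: expected number of accesses served by a remote disk.

    layout[block] is the node whose disks hold the block.
    access_prob[(node, block)] is the assumed probability that
    the node touches the block (hypothetical workload knowledge).
    """
    return sum(p for (node, block), p in access_prob.items()
               if layout[block] != node)


def brute_force_layout(blocks, nodes, access_prob):
    """Try every block-to-node assignment and keep the cheapest one.

    Only feasible for tiny inputs; used here purely for illustration.
    """
    best = None
    for assignment in product(nodes, repeat=len(blocks)):
        candidate = dict(zip(blocks, assignment))
        candidate_cost = expected_remote_accesses(candidate, access_prob)
        if best is None or candidate_cost < best[0]:
            best = (candidate_cost, candidate)
    return best


# Hypothetical workload: node 0 mostly reads block A, node 1 mostly reads block B.
access_prob = {(0, "A"): 0.9, (1, "A"): 0.1, (0, "B"): 0.2, (1, "B"): 0.8}
cost, layout = brute_force_layout(["A", "B"], [0, 1], access_prob)
```

In this example each block ends up on the node most likely to use it, and the remaining cost is the probability mass of the accesses that stay remote.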
Run-time Data Reorganization
• Approach:
  • Periodically run a reorganization routine for the whole system
  • Compute a cost function based on the description of some completed and pending operations
  • Determine changes to layout that would lower the cost function
  • Pre-fetch the data from the remote disks
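One pass of such a reorganization routine might look like the sketch below. It is a simplified, greedy stand-in under stated assumptions, not the method from the talk: the `(node, block)` operation records, the `pending_weight` parameter, and the rule "move a block to the node that demands it most" are all hypothetical.

```python
from collections import Counter, defaultdict


def plan_moves(layout, completed_ops, pending_ops, pending_weight=2):
    """Return (block, source_node, target_node) moves that lower remote demand.

    completed_ops and pending_ops are lists of (node, block) pairs.
    pending_weight is a hypothetical knob that makes upcoming operations
    count more than operations that have already finished.
    """
    demand = defaultdict(Counter)
    for node, block in completed_ops:
        demand[block][node] += 1
    for node, block in pending_ops:
        demand[block][node] += pending_weight

    moves = []
    for block, counts in demand.items():
        target, hits = counts.most_common(1)[0]
        source = layout[block]
        # Move only when another node clearly needs the block more.
        if target != source and hits > counts[source]:
            moves.append((block, source, target))
    return moves


# Hypothetical snapshot: block A lives on node 0 but node 1 keeps asking for it.
layout = {"A": 0, "B": 1}
completed = [(1, "A"), (1, "A"), (0, "A")]
pending = [(1, "A"), (1, "B")]
moves = plan_moves(layout, completed, pending)
```

The returned moves are the blocks a pre-fetcher would pull from remote disks before the pending operations need them.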
Simulated Annealing [1/2]
• Choose a number of “steps” (iterations)
• For each step
  • Randomly introduce a perturbation (a small change to the current combination)
  • Always accept the new alternative if it reduces the cost
  • Randomly accept some alternatives that increase the cost (uphill change)
• Slowly decrease the uphill acceptance probability
  • Later steps are less likely to accept bad perturbations
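The loop on this slide can be sketched as follows. This is a minimal illustration, not the authors' code: the slide does not give the acceptance rule, so the sketch assumes a directly decaying uphill probability (`p_uphill` multiplied by `decay` each step) instead of a temperature-based rule, and the toy access list, `remote_count`, and `move_one_block` are hypothetical.

```python
import random


def simulated_annealing(initial, cost, perturb, steps=50,
                        p_uphill=0.5, decay=0.9, seed=0):
    """Minimize cost(state) with the accept/reject loop from the slide.

    p_uphill and decay are assumed parameters: the chance of accepting
    a worse state starts at p_uphill and shrinks by decay every step.
    """
    rng = random.Random(seed)
    current, current_cost = initial, cost(initial)
    best, best_cost = current, current_cost
    for _ in range(steps):
        candidate = perturb(current, rng)
        candidate_cost = cost(candidate)
        # Accept improvements (and ties); sometimes accept an uphill change.
        if candidate_cost <= current_cost or rng.random() < p_uphill:
            current, current_cost = candidate, candidate_cost
            if current_cost < best_cost:
                best, best_cost = current, current_cost
        p_uphill *= decay  # later steps accept fewer bad perturbations
    return best, best_cost


# Hypothetical workload: (node, block) accesses on a two-node system.
accesses = [(0, 0), (0, 1), (1, 2), (1, 3), (0, 0), (1, 2)]


def remote_count(layout):
    """Number of accesses whose block lives on another node."""
    return sum(1 for node, block in accesses if layout[block] != node)


def move_one_block(layout, rng):
    """Perturbation: reassign one randomly chosen block to a random node."""
    changed = list(layout)
    changed[rng.randrange(len(changed))] = rng.randrange(2)
    return tuple(changed)


start = (1, 1, 0, 0)  # deliberately bad layout: every access is remote
best, best_cost = simulated_annealing(start, remote_count, move_one_block, steps=200)
```

Tracking the best state seen so far is a common safeguard, since an accepted uphill change can leave the final state worse than an earlier one.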
Simulated Annealing [2/2]
• Randomization allows SA to quickly explore a vast design space
• Accepting uphill changes allows SA to escape local minima
• Uphill changes can also be bad
  • We reduce the probability of accepting them as the exploration progresses
System
• Full-system simulation (SimOS-PPC)
• 4 x 4 cc-NUMA
• 1 GHz CPUs, 512 MB per node, 7 disk units per node, 128-bit 100 MHz bus
• 128-bit 200 MHz inter-node bus
• Directory-based cc-NUMA
• System runs AIX 4.3.1
Benchmark
• DSS queries based on TPC-H
  • Database was populated based on a TPC-H database with a scale factor of 1
  • Around 2.5 GB between tables and indices
  • Data set of the queries ranged from 585 MB to 2.8 GB
• Web interactions are based on TPC-W
• DB2 was optimized for the simulated hardware running each type of workload
Performance for DSS workload
• Static (global):
  • Generates a single cost function for a group of queries
  • Obtains the layout most suitable for all the queries
• Static (local):
  • Uses a single query to produce a layout
  • Produces a very optimistic layout
• Dynamic:
  • Starts with an optimized layout
  • Adapts layout as queries run on the system
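The difference between the global and local static variants can be seen in a small, hypothetical example. The two queries, the helper names, and the exhaustive search are assumptions made for illustration; they are not taken from the evaluation.

```python
from itertools import product


def remote(layout, query):
    """Remote accesses for one query, given as a list of (node, block) pairs."""
    return sum(1 for node, block in query if layout[block] != node)


def global_cost(layout, queries):
    """Single cost function covering a whole group of queries."""
    return sum(remote(layout, query) for query in queries)


def best_layout(blocks, nodes, cost):
    """Exhaustive search over layouts (toy sizes only)."""
    candidates = (dict(zip(blocks, assignment))
                  for assignment in product(nodes, repeat=len(blocks)))
    return min(candidates, key=cost)


# Hypothetical queries: q1 runs on node 0, q2 runs on node 1.
q1 = [(0, "A"), (0, "A"), (0, "B")]
q2 = [(1, "B"), (1, "B"), (1, "B")]
queries = [q1, q2]

local_layout = best_layout(["A", "B"], [0, 1], lambda l: remote(l, q1))
global_layout = best_layout(["A", "B"], [0, 1], lambda l: global_cost(l, queries))
```

Here the local layout is perfect for `q1` alone but costs three remote accesses once `q2` also runs, while the global layout costs one. That is the sense in which a single-query layout is optimistic.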
Cost functions
• Inter-node data transferred
  • Sum of all data blocks that are accessed from a remote disk
• Time estimate
  • Time to access local/remote data
  • Time to operate on the data
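Both cost functions can be written down in a few lines. The sketch below is illustrative only: the access list and the latency constants (`t_local`, `t_remote`, `t_compute`) are hypothetical placeholders, not measurements from the simulated system.

```python
def inter_node_blocks(layout, accesses):
    """First cost function: number of blocks fetched from a remote disk."""
    return sum(1 for node, block in accesses if layout[block] != node)


def time_estimate(layout, accesses, t_local, t_remote, t_compute):
    """Second cost function: access time plus time to operate on the data.

    t_local, t_remote and t_compute are assumed per-block costs in
    arbitrary time units.
    """
    total = 0.0
    for node, block in accesses:
        access_time = t_local if layout[block] == node else t_remote
        total += access_time + t_compute
    return total


# Hypothetical accesses as (node, block) pairs; block A is remote for node 1.
layout = {"A": 0, "B": 1}
accesses = [(0, "A"), (1, "A"), (1, "B")]

blocks_moved = inter_node_blocks(layout, accesses)
estimated_time = time_estimate(layout, accesses,
                               t_local=1.0, t_remote=4.0, t_compute=0.5)
```

The first function ignores how expensive a remote access is; the second weights it, so the two can rank layouts differently when compute time dominates.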
Quality of the solution: steps
• If we reduce the temperature slowly, we can achieve a better solution
• This comes at the cost of extra time
• Around 0.87 seconds of think-time for 50 steps
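The trade-off between step count and think-time can be made concrete with a rough sketch. Two assumptions are not from the slides: the geometric cooling schedule and its start and end temperatures, and the idea that think-time grows linearly with the number of steps. Only the 0.87 seconds for 50 steps comes from the source.

```python
def geometric_schedule(t_start, t_end, steps):
    """Temperatures that fall geometrically from t_start to t_end.

    Assumed schedule shape; more steps means a slower decrease.
    """
    alpha = (t_end / t_start) ** (1.0 / (steps - 1))
    return [t_start * alpha ** k for k in range(steps)]


# From the slide: about 0.87 s of think-time for 50 steps.
SECONDS_PER_STEP = 0.87 / 50


def estimated_think_time(steps):
    """Extrapolated think-time, assuming linear scaling with step count."""
    return steps * SECONDS_PER_STEP


fast = geometric_schedule(10.0, 0.1, 50)    # hypothetical temperatures
slow = geometric_schedule(10.0, 0.1, 200)   # same range, four times the steps
```

Under the linear assumption, the 200-step schedule would need roughly four times the think-time of the 50-step one, which is the cost the slide refers to.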
Summary
• We phrase the data placement problem as a combinatorial optimization problem
• We propose a technique that uses simulated annealing to generate an initial data layout based on the expected usage of the data
• We extend the simulated annealing technique to reorganize the data at run-time
• We take advantage of the locality of data references to improve the effectiveness of the reorganization
Thank you
• For additional information: http://www.ece.utexas.edu/projects/ece/lca/