Task Scheduling for Highly Concurrent Analytical and Transactional Main-Memory Workloads
Iraklis Psaroudakis (EPFL), Tobias Scheuer (SAP AG), Norman May (SAP AG), Anastasia Ailamaki (EPFL)
Scheduling for high concurrency
• Queries >> H/W contexts: many more concurrent queries than the limited number of H/W contexts
• How should the DBMS use the available CPU resources?
Scheduling for mixed workloads
• OLTP: short-lived, reads & updates, scan-light, single thread
• OLAP: long-running, read-only, scan-heavy, parallelism
• Contention in highly concurrent situations
How to schedule highly concurrent mixed workloads?
Scheduling tactics
• OS scheduler: # threads > # H/W contexts → overutilization (context switches, cache thrashing)
• Admission control: coarse granularity of control; # threads < # H/W contexts → underutilization
[Figure: thread execution timelines contrasting overutilization and underutilization]
We need to avoid both underutilization and overutilization
Task scheduling
• A task can contain any code, e.g. run() { ... }
• One worker thread per core processing tasks
• Distributed task queues (per socket) to minimize sync contention
• Task stealing to fix imbalance
• OLAP queries can parallelize w/o overutilization
Provides a solution to efficiently utilize CPU resources
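As a concrete illustration of these components, here is a minimal sketch in C++ (our own names and structure, not SAP HANA's actual implementation) of a task abstraction, per-socket task queues, and a worker loop that drains its own queue and steals from other sockets when it runs dry:

```cpp
#include <cstddef>
#include <cstdio>
#include <deque>
#include <functional>
#include <mutex>
#include <vector>

// A task can contain any code: it is simply a callable unit of work.
struct Task {
    std::function<void()> run;
};

// One queue per socket keeps synchronization contention local.
struct TaskQueue {
    std::mutex m;
    std::deque<Task> tasks;

    void push(Task t) {
        std::lock_guard<std::mutex> g(m);
        tasks.push_back(std::move(t));
    }
    bool pop(Task& out) {
        std::lock_guard<std::mutex> g(m);
        if (tasks.empty()) return false;
        out = std::move(tasks.front());
        tasks.pop_front();
        return true;
    }
};

// A worker first drains its own socket's queue, then steals from the others.
void workerLoop(std::size_t mySocket, std::vector<TaskQueue>& queues) {
    Task t;
    while (true) {
        if (queues[mySocket].pop(t)) { t.run(); continue; }
        bool stole = false;
        for (std::size_t s = 0; s < queues.size(); ++s)
            if (s != mySocket && queues[s].pop(t)) { t.run(); stole = true; break; }
        if (!stole) break;  // no work anywhere; a real worker would park instead
    }
}

int main() {
    std::vector<TaskQueue> queues(2);  // e.g. two sockets
    queues[0].push({[] { std::puts("task on socket 0"); }});
    queues[1].push({[] { std::puts("task on socket 1"); }});
    workerLoop(0, queues);  // drains socket 0's queue, then steals from socket 1
}
```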
Task scheduling problems for DBMS
• OLTP tasks can block
  • Problem: underutilization of CPU resources
  • Solution: flexible concurrency level
• OLAP queries can issue an excessive number of tasks in highly concurrent situations
  • Problem: unnecessary scheduling overhead
  • Solution: concurrency hint
Outline Introduction Flexible concurrency level Concurrency hint Experimental evaluation with SAP HANA Conclusions
Fixed concurrency level
Typical task scheduling: a fixed concurrency level that bypasses the OS scheduler
• OLTP tasks may block → underutilization
A fixed concurrency level is not suitable for a DBMS
Flexible concurrency level
• Cooperate with the OS scheduler: the OS schedules the worker threads
• Concurrency level = # of worker threads
• Active concurrency level = # of active worker threads
• Target: active concurrency level = # of H/W contexts
• Issue additional workers when tasks block
Worker states
• Active workers vs. inactive workers (parked, or inactive by user)
• Workers may also be blocked in a syscall or waiting for a task
• Watchdog: monitoring, statistics, and taking actions
  • Keeps the active concurrency level ≈ # of H/W contexts
We dynamically re-adjust the scheduler's concurrency level
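To make the mechanism concrete, here is a minimal sketch (illustrative names and structure, not HANA's implementation) of a watchdog loop that monitors worker counts and parks or unparks workers so that the active concurrency level stays close to the number of H/W contexts:

```cpp
#include <atomic>
#include <chrono>
#include <thread>

// Possible worker states, roughly mirroring the slide.
enum class WorkerState { Active, WaitingForTask, BlockedInSyscall, Parked, InactiveByUser };

struct SchedulerStats {
    std::atomic<int> activeWorkers{0};  // workers currently counted as active
    std::atomic<int> parkedWorkers{0};  // workers kept in reserve
};

// Hypothetical hooks: a real scheduler would wake a parked thread or spawn a new one.
void unparkOneWorker(SchedulerStats& s) { --s.parkedWorkers; ++s.activeWorkers; }
void parkOneWorker(SchedulerStats& s)   { --s.activeWorkers; ++s.parkedWorkers; }

// The watchdog periodically re-adjusts the scheduler's concurrency level.
void watchdogLoop(SchedulerStats& stats, int hwContexts, std::atomic<bool>& stop) {
    while (!stop) {
        int active = stats.activeWorkers.load();
        if (active < hwContexts && stats.parkedWorkers.load() > 0)
            unparkOneWorker(stats);   // some workers blocked in syscalls: add capacity
        else if (active > hwContexts)
            parkOneWorker(stats);     // blocked workers returned: shed excess capacity
        std::this_thread::sleep_for(std::chrono::milliseconds(10));
    }
}
```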
Outline Introduction Flexible concurrency level Concurrency hint Experimental evaluation with SAP HANA Conclusions
Partitionable operations
• Can be split into a variable number of tasks: each task processes one partition, and the partial results are combined into the final result
• Each operation calculates its own task granularity: 1 ≤ # tasks ≤ # H/W contexts
• Problem: the calculation is independent of the system's concurrency situation
• Under high concurrency: excessive number of tasks → unnecessary scheduling overhead
We should restrict task granularity under high concurrency
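For concreteness, a naive granularity calculation could look like the sketch below (illustrative only; the row-range splitting strategy and names are our assumptions). It creates up to # H/W contexts tasks regardless of how many other queries are currently competing for the workers:

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

struct Range { std::size_t begin, end; };  // half-open row range handled by one task

// Split numRows into at most hwContexts roughly equal task ranges (hwContexts >= 1).
std::vector<Range> splitIntoTasks(std::size_t numRows, std::size_t hwContexts) {
    std::size_t numTasks = std::clamp<std::size_t>(numRows, 1, hwContexts);
    std::size_t chunk = (numRows + numTasks - 1) / numTasks;  // ceiling division
    std::vector<Range> ranges;
    for (std::size_t b = 0; b < numRows; b += chunk)
        ranges.push_back({b, std::min(b + chunk, numRows)});
    return ranges;
}
```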
Restricting task granularity
• Existing frameworks for data parallelism: not straightforward to adopt in a commercial DBMS. Simpler way?
• free worker threads = max(0, # of H/W contexts − # of active worker threads)
• concurrency hint = exponential moving average of the free worker threads
• The concurrency hint serves as an upper bound for # tasks
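A minimal sketch of these two formulas follows (the variable names and the smoothing factor are assumptions, not taken from HANA). A partitionable operation would then create no more tasks than the current hint allows:

```cpp
#include <algorithm>
#include <cstddef>

// Tracks the concurrency hint as an exponential moving average of free workers.
struct ConcurrencyHint {
    double ema = 0.0;
    double alpha = 0.25;  // EMA smoothing factor (assumed value)

    void update(std::size_t hwContexts, std::size_t activeWorkers) {
        // free worker threads = max(0, #H/W contexts - #active worker threads)
        double freeWorkers = hwContexts > activeWorkers
                                 ? static_cast<double>(hwContexts - activeWorkers)
                                 : 0.0;
        ema = alpha * freeWorkers + (1.0 - alpha) * ema;
    }

    // Upper bound on the number of tasks a partitionable operation should create.
    std::size_t maxTasks() const {
        return std::max<std::size_t>(1, static_cast<std::size_t>(ema));
    }
};
```

Combined with the earlier splitting sketch, an operation would create min(its own data-driven task count, hint.maxTasks()) tasks.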
Concurrency hint
• High concurrency situations: concurrency hint → 1; higher latency per query, but low scheduling overhead and higher throughput
• Low concurrency situations: concurrency hint → # H/W contexts; low latency
A lightweight way to restrict task granularity under high concurrency
Outline Introduction Flexible concurrency level Concurrency hint Experimental evaluation with SAP HANA Conclusions
Experimental evaluation with SAP HANA
• Workloads: TPC-H SF=10, and TPC-H SF=10 + TPC-C WH=200
• Configuration: 8×10-core Intel Xeon E7-8870 at 2.40 GHz with hyperthreading, 1 TB RAM, 64-bit SMP Linux (SuSE) with a 2.6.32 kernel
• Several iterations. No caching. No think times.
• We compare:
  • Fixed (fixed concurrency level)
  • Flexible (flexible concurrency level)
  • Hint (flexible concurrency level + concurrency hint)
TPC-H – Response time
Task granularity can affect OLAP performance by 11%
TPC-H – Measurements
Unnecessary overhead caused by too many tasks under high concurrency
TPC-H – Timelines
[Timeline figures comparing the Hint and Fixed configurations]
TPC-H and TPC-C – Throughput experiment
• Variable TPC-H clients = 16–64. TPC-C clients = 200.
Conclusions
• Task scheduling enables efficient management of CPU resources
• For a DBMS:
  • Handle tasks that block. Solution: flexible concurrency level
  • Correlate the task granularity of analytical queries with the concurrency situation to avoid unnecessary scheduling overhead. Solution: concurrency hint
Thank you! Questions?