Active Flash: Towards Energy-Efficient, In-Situ Data Analytics on Extreme-Scale Machines. Devesh Tiwari, Sudharshan S. Vazhkudai, Youngjae Kim, Xiaosong Ma, Simona Boboila, and Peter J. Desnoyers. Kilmo Choi, rlfah926@naver.com
Contents • Background • Problems and Challenges • Active Flash Approach for In-situ • Active Computation Feasibility • Evaluation • Active Flash Prototype based on OpenSSD Platform • Conclusion
Background • Scientific discovery is a two-step process: scientific simulation, then data analysis and visualization
Background • Large-scale leadership computing applications produce big data • GTC produces ~30 TB of output data per hour at scale
Problems and Challenges • The offline approach suffers from both performance and energy inefficiencies • Redundant I/O (simulations write, analyses read) • Excessive data movement • Extra energy cost • Energy efficiency will become the primary metric for system design: compute power is expected to increase 1000x in the next decade with only a 10x increase in the power envelope, which implies roughly a 100x gain in performance per watt • Using simulation nodes for data analysis is not acceptable
Active Flash Approach for In-situ • SSDs are now being adopted in supercomputers (e.g. Tsubame, Gordon) • higher I/O throughput and storage capability • SSD controllers are becoming increasingly powerful • multi-core low-power processors • Idle cycles at SSD controllers • In-situ analysis • analysis on in-transit output data, before it is written to the PFS • eliminates redundant I/O, but it uses expensive compute nodes
Active Flash Approach for In-situ • Active Flash • In-situ analysis on SSDs • Exploits idle cycles of the SSD controller for computation • Reduces transfer costs • High performance and energy savings
Active Flash Approach for In-situ • Three approaches to data analysis • offline • Active Flash • analysis node
Active Computation Feasibility • Modeling SSD Deployment • Multiple constraints • Capacity: enough SSDs to sustain the output burst • Performance: high I/O bandwidth to SSD space; fast restart from application checkpoints • Write durability: SSD write endurance limits
Active Computation Feasibility • Staging Ratio • How many simulation nodes share one common SSD?
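The staging-ratio question can be made concrete with a minimal sketch. It assumes, purely for illustration, that the SSD count is set by whichever deployment constraint (capacity, bandwidth, or write endurance) is the most demanding, and that the staging ratio is then the number of simulation nodes divided by that count. The function names, parameters, and all numbers in the usage note are hypothetical and are not taken from the slides.

```python
import math


def ssds_required(burst_tb, ssd_capacity_tb,
                  burst_bw_gbps, ssd_write_bw_gbps,
                  daily_writes_tb, ssd_daily_write_budget_tb):
    """Smallest SSD count that meets all three constraints (illustrative model)."""
    # Capacity: the SSDs together must hold one output burst.
    by_capacity = math.ceil(burst_tb / ssd_capacity_tb)
    # Performance: aggregate write bandwidth must absorb the burst rate.
    by_bandwidth = math.ceil(burst_bw_gbps / ssd_write_bw_gbps)
    # Write durability: daily writes must fit the per-SSD endurance budget.
    by_endurance = math.ceil(daily_writes_tb / ssd_daily_write_budget_tb)
    return max(by_capacity, by_bandwidth, by_endurance)


def staging_ratio(num_sim_nodes, num_ssds):
    """Number of simulation nodes sharing one SSD (rounded down)."""
    return num_sim_nodes // num_ssds
```

For example, with made-up inputs of a 10 TB burst on 0.5 TB SSDs, a 100 GB/s burst rate against 0.5 GB/s per SSD, and 240 TB of daily writes against a 1 TB per-SSD daily budget, `ssds_required(10, 0.5, 100, 0.5, 240, 1.0)` is dominated by the endurance term; dividing 2400 simulation nodes by that count gives the staging ratio.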
Active Computation Feasibility • Modeling active computation feasibility • Less compute-intensive kernels are better suited for active computation (e.g. regex matching) • Feasibility depends on multiple factors: simulation data production rate, staging ratio, I/O bandwidth, etc.
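How these factors interact can be illustrated with a simplified feasibility check. This is a sketch under stated assumptions, not the model from the paper: it assumes each SSD first ingests the output of all nodes that share it and then analyzes it on the controller, and that both steps must finish before the next output burst arrives. All parameter names and values are hypothetical.

```python
def active_flash_feasible(data_per_node_gb, staging_ratio,
                          ssd_write_bw_gbps, kernel_throughput_gbps,
                          output_interval_s):
    """Return True if one SSD can ingest and analyze a burst before the next one.

    Illustrative assumption: ingest and analysis do not overlap.
    """
    # Data landing on one SSD per burst grows with the staging ratio.
    data_gb = data_per_node_gb * staging_ratio
    ingest_time_s = data_gb / ssd_write_bw_gbps
    # Slower (more compute-intensive) kernels have lower throughput.
    analysis_time_s = data_gb / kernel_throughput_gbps
    return ingest_time_s + analysis_time_s <= output_interval_s
```

With hypothetical values of 4 GB per node, a staging ratio of 10, and a 600 s output interval, a kernel that processes 0.1 GB/s on the controller fits within the interval, while one that processes only 0.05 GB/s does not. This mirrors the point that lighter kernels are better candidates for active computation.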
Evaluation • Cray XT5 Jaguar supercomputer • Samsung PM830 SSD • Intel Core i7 processors
Evaluation • Feasibility of the analysis node approach • Most data analysis kernels can be placed on SSD controllers without degrading simulation performance • Additional SSDs are not required for supporting in-situ data analysis on SSDs • Analysis node approach is feasible at higher staging ratios, but at additional infrastructure cost
Evaluation • Energy and cost saving analysis • Staging ratio = 10 • Plot axes: Active Flash and offline approach on y1, analysis node on y2 • The offline model consumes more energy due to the I/O wait time
Conclusion • Extant approaches to scientific data analysis (e.g. offline and analysis nodes) are stymied by several inefficiencies in data movement and energy consumption that result in sub-optimal performance • Active Flash is better than either approach on all of the aforementioned metrics