
A Workflow-Aware Storage System

Emalayan Vairavanathan. A Workflow-Aware Storage System. Samer Al-Kiswany, Lauro Beltrão Costa, Zhao Zhang, Daniel S. Katz, Michael Wilde, Matei Ripeanu. Workflow Example - ModFTDock. Protein docking application. Simulates a more complex protein model from two known proteins.


Presentation Transcript


  1. A Workflow-Aware Storage System • Emalayan Vairavanathan • Samer Al-Kiswany, Lauro Beltrão Costa, Zhao Zhang, Daniel S. Katz, Michael Wilde, Matei Ripeanu

  2. Workflow Example - ModFTDock • Protein docking application • Simulates a more complex protein model from two known proteins • Applications: drug design, protein interaction prediction

  3. Background – ModFTDock in Argonne BG/P • 1.2 M docking tasks • File-based communication • Large IO volume • Scale: 40960 compute nodes • IO rate: 8 GBps = 51 KBps / core • [Diagram: application tasks, each with local storage, run under the Workflow Runtime Engine above the backend file system (e.g., GPFS, NFS)]
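
The per-core figure on this slide can be checked with a short calculation. The sketch below assumes four cores per compute node (BG/P nodes are quad-core, but the slide does not state this) and binary units; the function and constant names are illustrative only.

```python
# Figures taken from the slide, except CORES_PER_NODE, which is an assumption.
COMPUTE_NODES = 40960
CORES_PER_NODE = 4        # assumption: quad-core BG/P nodes, not stated on the slide
AGGREGATE_IO_GBPS = 8     # aggregate IO rate to the backend file system, in GB/s


def per_core_io_kbps(aggregate_gbps, nodes, cores_per_node):
    """Return the share of aggregate backend bandwidth per core, in KB/s."""
    total_cores = nodes * cores_per_node
    aggregate_kbps = aggregate_gbps * 1024 * 1024  # GB/s -> KB/s (binary units)
    return aggregate_kbps / total_cores


share = per_core_io_kbps(AGGREGATE_IO_GBPS, COMPUTE_NODES, CORES_PER_NODE)
print(f"{share:.1f} KB/s per core")  # prints "51.2 KB/s per core"
```

Under these assumptions the result matches the slide's rounded 51 KBps per core, which is why file-based communication through the backend alone becomes a bottleneck at this scale.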

  4. Background – Backend Storage Bottleneck • Storage is one of the main bottlenecks for workflows • [Chart: Montage workflow (512 BG/P cores, GPFS backend file system); labeled segment: Scheduling and Idle 40%] • Source: [Zhao et al.]

  5. Intermediate Storage Approach • Scale: 40960 compute nodes • [Diagram: application tasks, each with local storage, run under the Workflow Runtime Engine and reach the intermediate storage through the POSIX API; data is staged in from and staged out to the backend file system (e.g., GPFS, NFS)] • Source: [Zhao et al.], MTAGS 2008

  6. Research Question How can we improve the storage performance for workflow applications?

  7. IO-Patterns in Workflow Applications – by Justin Wozniak et al., PDSW'09 • Pipeline: locality and location-aware scheduling • Broadcast: replication • Reduce: collocation and location-aware scheduling • Scatter and Gather: block-level data placement

  8. IO-Patterns in ModFTDock • Stage 1: Broadcast pattern • Stage 2: Reduce pattern • Stage 3: Pipeline pattern • 1.2 M Dock, 12000 Merge and Score instances at large run • Average file size: 100 KB – 75 MB

  9. Research Question • How can we improve the storage performance for workflow applications? • Our Answer • Workflow-aware storage: optimizing the storage for IO patterns • Traditional approach: one size fits all • Our approach: file / block-level optimizations

  10. Integrating with the workflow runtime engine • Application hints (e.g., indicating access patterns) • Storage hints (e.g., location information) • [Diagram: application tasks, each with local storage, on the compute nodes under the Workflow Runtime Engine; the shared workflow-aware storage is reached through the POSIX API and stages data in from and out to the backend file system (e.g., GPFS, NFS)]
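
One way to picture how storage hints could feed back into scheduling is the sketch below. It is only an illustration of location-aware scheduling in general: the function `schedule`, the `file_locations` mapping, and the node names are all hypothetical and do not come from the presentation or from any real runtime engine API.

```python
def schedule(task_inputs, file_locations, idle_nodes):
    """Pick the idle node that already holds the most of a task's input files.

    task_inputs:    list of input file paths for the task
    file_locations: dict mapping file path -> nodes storing it (the "storage hint")
    idle_nodes:     non-empty list of nodes currently free to run a task
    """
    local_counts = {node: 0 for node in idle_nodes}
    for path in task_inputs:
        for node in file_locations.get(path, ()):
            if node in local_counts:
                local_counts[node] += 1
    # Ties (including "no input is local anywhere") go to the first idle node.
    return max(idle_nodes, key=lambda node: local_counts[node])


# Hypothetical location information exposed by the storage layer.
locations = {"/tmp/a.dat": ["n1"], "/tmp/b.dat": ["n1", "n2"]}
chosen = schedule(["/tmp/a.dat", "/tmp/b.dat"], locations, ["n0", "n1", "n2"])
print(chosen)
```

In this toy run the task lands on `n1`, the node holding both inputs, so neither file has to cross the network.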

  11. Outline • Background • IO Patterns • Workflow-aware storage system: Implementation • Evaluation

  12. Implementation: MosaStore • File is divided into fixed-size chunks • Chunks are stored on the storage nodes • Manager maintains a block-map for each file • POSIX interface for accessing the system • [Figure: MosaStore distributed storage architecture]
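
The chunking and block-map ideas on this slide can be sketched in a few lines. This is a toy model, not MosaStore code: the `Manager` class, the 1 MiB chunk size, and the round-robin choice of storage node are all assumptions made for illustration.

```python
import itertools

CHUNK_SIZE = 1024 * 1024  # hypothetical chunk size; the slide does not give one


class Manager:
    """Toy metadata manager that keeps one block map per file."""

    def __init__(self, storage_nodes):
        self.block_map = {}  # file path -> list of (chunk index, storage node)
        # Round-robin placement is an assumption, used here as a simple default.
        self._next_node = itertools.cycle(list(storage_nodes))

    def allocate(self, path, file_size):
        """Split a file into fixed-size chunks and record where each one goes."""
        n_chunks = max(1, -(-file_size // CHUNK_SIZE))  # ceiling division
        placement = [(i, next(self._next_node)) for i in range(n_chunks)]
        self.block_map[path] = placement
        return placement

    def locate(self, path, offset):
        """Return the storage node holding the chunk that contains `offset`."""
        return self.block_map[path][offset // CHUNK_SIZE][1]


manager = Manager(["node0", "node1", "node2"])
placement = manager.allocate("/out/dock_001.dat", 2_500_000)
print(placement)
print(manager.locate("/out/dock_001.dat", 1_500_000))
```

A client reading at a given offset only needs the block map to find the right storage node, which is why the manager sits on the metadata path rather than the data path in this kind of design.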

  13. Implementation: Workflow-aware Storage System • [Figure: Workflow-aware storage architecture]

  14. Implementation: Workflow-aware Storage System • Optimized data placement for the pipeline pattern: priority to local writes and reads • Optimized data placement for the reduce pattern: collocating files in a single storage node • Replication mechanism optimized for the broadcast pattern: parallel replication • Exposing file location to workflow runtime engine
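
The three placement optimizations listed above can be summarized as a single dispatch on the access-pattern hint. The sketch below is a simplified illustration under stated assumptions: `place_file`, its parameters, the pattern strings, and the fallback for unhinted files are hypothetical and are not the WASS implementation.

```python
def place_file(pattern, writer_node, storage_nodes, reduce_target=None, replicas=2):
    """Return the storage nodes that should hold a newly written file.

    pattern:       access-pattern hint ("pipeline", "reduce", "broadcast", or other)
    writer_node:   node running the task that writes the file
    storage_nodes: all available storage nodes
    reduce_target: node chosen to collect reduce inputs (hypothetical parameter)
    replicas:      total number of copies for broadcast files (hypothetical)
    """
    if pattern == "pipeline":
        # Priority to local writes, so the next stage can read locally.
        return [writer_node]
    if pattern == "reduce":
        # Collocate all inputs of the reduce task on a single storage node.
        return [reduce_target if reduce_target is not None else storage_nodes[0]]
    if pattern == "broadcast":
        # Keep several copies so many readers do not hit one node.
        others = [node for node in storage_nodes if node != writer_node]
        return [writer_node] + others[: max(0, replicas - 1)]
    # No hint: stand-in for whatever default placement the system uses.
    return [storage_nodes[0]]


nodes = ["n0", "n1", "n2", "n3"]
print(place_file("pipeline", "n2", nodes))
print(place_file("reduce", "n2", nodes, reduce_target="n1"))
print(place_file("broadcast", "n0", nodes, replicas=3))
```

The sketch only decides where copies go; creating the broadcast replicas in parallel, as the slide describes, would be handled by the replication mechanism itself.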

  15. Outline • Background • IO Patterns • Workflow-aware storage system: Implementation • Evaluation

  16. Evaluation - Baselines • MosaStore, NFS and node-local storage vs. workflow-aware storage • [Diagram: application tasks, each with local storage, on the compute nodes; shared intermediate storage (MosaStore or workflow-aware storage) with stage in/out to the backend file system (e.g., GPFS, NFS)]

  17. Evaluation - Platform • Cluster of 20 machines: Intel Xeon 4-core, 2.33-GHz CPU, 4-GB RAM, 1-Gbps NIC, and RAID-1 on two 300-GB 7200-rpm SATA disks • Backend storage NFS server: Intel Xeon E5345 8-core, 2.33-GHz CPU, 8-GB RAM, 1-Gbps NIC, and six SATA disks in a RAID 5 configuration • NFS server is better provisioned

  18. Evaluation – Benchmarks and Application • Synthetic benchmark • Application and workflow run-time engine • ModFTDock

  19. Synthetic Benchmark - Pipeline • Optimization: Locality and location-aware scheduling • [Chart: Average runtime for medium workload]

  20. Synthetic Benchmarks - Reduce • Optimization: Collocation and location-aware scheduling • [Chart: Average runtime for medium workload]

  21. Synthetic Benchmarks - Broadcast • Optimization: Replication • [Chart: Average runtime for medium workload]

  22. Not everything is perfect! • [Chart: Average runtime for small workload (pipeline, broadcast and reduce benchmarks)]

  23. Evaluation – ModFTDock • [Chart: Total application time on three different systems] • [Figure: ModFTDock workflow]

  24. Evaluation – Highlights • WASS shows considerable performance gains on all benchmarks with medium and large workloads (up to 18x faster than NFS and up to 2x faster than MosaStore). • ModFTDock is 20% faster on WASS than on MosaStore, and more than 2x faster than on NFS. • WASS performs worse on the small benchmarks due to metadata overheads and manager latency.

  25. Summary • Problem: How can we improve the storage performance for workflow applications? • Approach: Workflow-aware storage system (WASS) • From backend storage to intermediate storage • Bi-directional communication using hints • Future work: integrating more applications, large-scale evaluation

  26. Thank you • MosaStore: netsyslab.ece.ubc.ca/wiki/index.php/MosaStore • Networked Systems Laboratory: netsyslab.ece.ubc.ca
