
Application Focus

Dealing with Data: Choosing a Good Storage Technology for Your Application Rick Wagner HPC Systems Manager July 1st, 2014. Storage choices should be driven by application need, not just what’s available. Application Focus. But, applications need to adapt as they scale.


Presentation Transcript


  1. Dealing with Data: Choosing a Good Storage Technology for Your Application. Rick Wagner, HPC Systems Manager. July 1st, 2014

  2. Application Focus: Storage choices should be driven by application need, not just what’s available. But applications need to adapt as they scale. Writing a few small files to an NFS server is fine… writing thousands simultaneously will wipe out the server. If you use binary files, don’t invent your own format. Consider HDF5.
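One way to read the small-files warning on slide 2: rather than having every task write its own tiny file to a shared server, collect the outputs and write a single archive. The sketch below is only an illustration of that idea using Python's standard `tarfile` module; the function names `bundle_results` and `read_result` and the `task_NNNN.txt` naming are hypothetical and not from the talk.

```python
import io
import tarfile
import tempfile
from pathlib import Path


def bundle_results(results, archive_path):
    """Write many small payloads into one tar archive instead of one file each.

    `results` maps a member name to its bytes payload (hypothetical layout).
    """
    with tarfile.open(archive_path, "w") as tar:
        for name, payload in results.items():
            info = tarfile.TarInfo(name=name)
            info.size = len(payload)  # size must be set before adding the member
            tar.addfile(info, io.BytesIO(payload))
    return archive_path


def read_result(archive_path, name):
    """Read a single member back out of the archive."""
    with tarfile.open(archive_path, "r") as tar:
        return tar.extractfile(name).read()


if __name__ == "__main__":
    # 1000 tiny outputs that would otherwise be 1000 separate files.
    results = {f"task_{i:04d}.txt": f"result {i}\n".encode() for i in range(1000)}
    with tempfile.TemporaryDirectory() as scratch:
        archive = bundle_results(results, Path(scratch) / "results.tar")
        print(read_result(archive, "task_0042.txt"))
```

The shared file system then sees one file create and one large sequential write instead of a thousand metadata operations. Whether that is the right trade-off depends on how the results are read back later.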

  3. Storage Technologies: Devices (memory, block); Services (Cloud, MySQL, CouchDB); File Systems (ext4, NFS, Lustre, PVFS, FUSE)

  4. Storage Technologies: Devices (memory, block); Services (Cloud, MySQL, CouchDB); File Systems (ext4, NFS, Lustre, PVFS, FUSE). Each has its own performance characteristics. Not all are available everywhere.

  5. File Systems: Classic access (POSIX, Windows). Most relevant: local; remote (NFS, CIFS); parallel (Lustre, GPFS). Local file systems are good for small and temporary files. Network file systems are very convenient for sharing data between systems.

  6. Parallel File Systems

  7. Parallel File Systems [diagram]: 3 distinct network architectures: TRESTLES (IB cluster), GORDON (IB cluster), TRITON (Myrinet cluster). Labeled links: Mellanox 5020 Bridge, 12 GB/s; 64 Lustre LNET Routers, 100 GB/s; Myrinet 10G Switch, 25 GB/s. Redundant switches for reliability and performance (2 × Arista 7508 10G). Metadata Servers (MDS). 32 OSS (Object Storage Servers) provide 100 GB/s performance and >4 PB raw capacity (72 TB per OSS).

  8. A Cautionary Tale http://www.youtube.com/watch?v=gDfLXAtRJfY&feature=youtu.be

  9. Devices: Raw block device (/dev/sdb) or RAM FS (/dev/shm). Useful in specific cases, like fast scratch. Can be very good for small I/O.
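As a rough sketch of the "fast scratch" use on slide 9, the following picks a RAM-backed directory when one is available and falls back to the system temporary directory otherwise. It assumes a Linux-style `/dev/shm` mount; the helper names `pick_scratch_dir` and `small_io_roundtrip` are made up for illustration.

```python
import os
import tempfile


def pick_scratch_dir(candidates=("/dev/shm",)):
    """Return the first writable candidate directory, else the system temp dir."""
    for path in candidates:
        if os.path.isdir(path) and os.access(path, os.W_OK):
            return path
    return tempfile.gettempdir()


def small_io_roundtrip(scratch_dir, n_records=1000, record=b"x" * 64):
    """Write many small records to a scratch file, read them back, return byte count."""
    # The file is deleted automatically when the context manager exits.
    with tempfile.NamedTemporaryFile(dir=scratch_dir) as f:
        for _ in range(n_records):
            f.write(record)
        f.flush()
        f.seek(0)
        return len(f.read())


if __name__ == "__main__":
    scratch = pick_scratch_dir()
    print(scratch, small_io_roundtrip(scratch))
```

Anything written under a RAM FS counts against the node's memory and disappears on reboot, so this only suits data that is temporary and fits comfortably in RAM.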

  10. Services: Things accessed programmatically. Frequently the last thought for HPC applications: A MISTAKE. Databases; cloud storage (Amazon S3); document storage (MongoDB, CouchDB).

  11. Know What You Need http://www.youtube.com/watch?v=F4OIDszDA9E

  12. Order of Magnitude Guide

  13. Choosing. My application needs to: write a checkpoint dump from memory from a large parallel simulation. I should consider: a parallel file system and a binary file format like HDF5.

  14. Choosing. My application needs to: run analysis on remote systems and return the results to a web portal for users. I should consider: cloud storage for results and input, and local scratch space for the job.

  15. Choosing. My application needs to: randomly access many small files, or read and write small blocks from large files. I should consider: a database, RAM FS, or local scratch space.
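To make the "database" option on slide 15 concrete, here is a minimal sketch that keeps many small records in one SQLite file and looks them up by key, instead of storing one small file per record. SQLite is a stand-in chosen here because it ships with Python; the talk does not name it, and the table layout and function names are assumptions.

```python
import sqlite3


def open_store(path=":memory:"):
    """Open (or create) a key/value store; ':memory:' keeps it entirely in RAM."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS records (key TEXT PRIMARY KEY, value BLOB)"
    )
    return conn


def put_many(conn, items):
    """Insert or overwrite (key, bytes) pairs in a single transaction."""
    with conn:  # commits on success, rolls back on error
        conn.executemany(
            "INSERT OR REPLACE INTO records (key, value) VALUES (?, ?)", items
        )


def get(conn, key):
    """Random access by key; returns None if the key is absent."""
    row = conn.execute(
        "SELECT value FROM records WHERE key = ?", (key,)
    ).fetchone()
    return None if row is None else row[0]


if __name__ == "__main__":
    store = open_store()
    put_many(store, ((f"rec{i}", bytes([i % 256]) * 16) for i in range(10000)))
    print(len(get(store, "rec1234")))
```

Passing a path on local scratch instead of `":memory:"` gives the same interface with on-disk persistence. Batching the inserts into one transaction matters, since committing each small write separately is usually far slower.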

  16. Many Boxes Make a Sad Panda http://www.youtube.com/watch?v=N2zK3sAtr-4 Database logos courtesy of RRZEicons http://commons.wikimedia.org/
