FiVO/QStorMan: toolkit for supporting data-oriented applications in PL-Grid

Explore FiVO/QStorMan toolkit for managing data in Grid applications, defining non-functional requirements, testing scenarios, and implementation goals. Enhance data-intensive applications in PL-Grid with efficient storage solutions.

Presentation Transcript


  1. FiVO/QStorMan: toolkit for supporting data-oriented applications in PL-Grid
     R. Slota (1,2), D. Król (1), K. Skalkowski (1), B. Kryza (1), D. Nikolow (1,2), and J. Kitowski (1,2)
     (1) ACC Cyfronet AGH, Kraków, Poland
     (2) Institute of Computer Science AGH-UST, Kraków, Poland
     KU KDM 2011, Zakopane, 9-11.03.2011

  2. Agenda
     • Data intensive applications
     • Research and implementation goals
     • Non-functional requirements in data management
     • FiVO/QStorMan toolkit components and architecture
     • FiVO/QStorMan usage
     • Testing scenarios
     • Results
     • Conclusions

  3. Data intensive applications
     Main features:
     • Generate gigabytes (or more) of data per day.
     • Produce different types of data, which require different types of storage.
     • Make heavy use of read/write operations.
     • Run time depends more on storage access time and transfer speed than on computation time.
     Scientific examples (from Wikipedia):
     • The LHC experiment produces 15 PB/year = ~42 TB/day = ~0.5 GB/s.
     • The German Climate Computing Center (DKRZ) has a storage capacity of 60 petabytes of climate data.
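The LHC figures can be checked with a quick unit conversion. The sketch below is illustrative only: the function names are not from the presentation, and it assumes binary prefixes (1 PB = 1024 TB) and a 365-day year. Under those assumptions, 15 PB/year works out to about 42 TB/day, or roughly half a gigabyte per second averaged over the year.

```cpp
#include <cassert>

// Average daily volume, in TB, implied by a yearly volume given in PB.
// Assumes binary prefixes (1 PB = 1024 TB).
double dailyVolumeTb(double petabytesPerYear, double daysPerYear = 365.0) {
    assert(daysPerYear > 0.0);
    return petabytesPerYear * 1024.0 / daysPerYear;
}

// Average sustained rate, in GB/s, implied by a yearly volume given in PB.
// Assumes binary prefixes (1 PB = 1024 * 1024 GB).
double sustainedRateGbPerS(double petabytesPerYear, double daysPerYear = 365.0) {
    assert(daysPerYear > 0.0);
    const double gigabytesPerYear = petabytesPerYear * 1024.0 * 1024.0;
    const double secondsPerYear = daysPerYear * 24.0 * 3600.0;
    return gigabytesPerYear / secondsPerYear;
}
```

Calling `dailyVolumeTb(15.0)` and `sustainedRateGbPerS(15.0)` reproduces the two conversions; these are yearly averages and say nothing about peak rates.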

  4. Research and implementation goals
     The main objective of the presented research is to manage the data coming from Grid applications using the following concepts:
     • allowing users to define non-functional requirements for storage devices explicitly,
     • exploiting a knowledge base of the VO extended with descriptions of storage elements,
     • exploiting information from storage monitoring systems and the VO knowledge base to find the most suitable storage device compliant with the defined requirements.

  5. Non-functional requirements in data management
     • Data intensive applications may have different requirements, e.g. important data should be replicated.
     • Abstraction of storage elements prevents users from influencing the actual location of data.
     • Data is distributed among the available storage elements according to the defined requirements.
     Sample non-functional requirements:
     • freeCapacity
     • currentReadTransferRate
     • averageWriteTransferRate
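The idea of matching declared requirements against monitored storage metrics can be sketched as follows. This is a minimal illustration, not QStorMan's actual algorithm or API: the `StorageElement` and `Requirements` types, the units, and the rule of preferring the highest average write rate among compliant elements are all assumptions made for the example. Only the three attribute names come from the slide.

```cpp
#include <string>
#include <vector>

// Hypothetical monitoring snapshot of one storage element.
// Field names follow the sample requirements; units are assumed.
struct StorageElement {
    std::string name;
    double freeCapacity;              // GB (assumed unit)
    double currentReadTransferRate;   // MB/s (assumed unit)
    double averageWriteTransferRate;  // MB/s (assumed unit)
};

// Hypothetical requirement set: each field is a minimum, 0 means "no constraint".
struct Requirements {
    double freeCapacity;
    double currentReadTransferRate;
    double averageWriteTransferRate;
};

// True if the element meets every declared minimum.
bool satisfies(const StorageElement& se, const Requirements& req) {
    return se.freeCapacity >= req.freeCapacity &&
           se.currentReadTransferRate >= req.currentReadTransferRate &&
           se.averageWriteTransferRate >= req.averageWriteTransferRate;
}

// Returns the compliant element with the highest average write rate
// (an assumed tie-breaking rule), or nullptr if no element complies.
const StorageElement* selectStorage(const std::vector<StorageElement>& elements,
                                    const Requirements& req) {
    const StorageElement* best = nullptr;
    for (const StorageElement& se : elements) {
        if (!satisfies(se, req)) {
            continue;
        }
        if (best == nullptr ||
            se.averageWriteTransferRate > best->averageWriteTransferRate) {
            best = &se;
        }
    }
    return best;
}
```

The returned pointer refers to an element of the vector passed in, so it stays valid only as long as that vector is alive and unmodified.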

  6. FiVO/QStorMan toolkit

  7. FiVO/QStorMan usage
     1. Using the QStorMan portal:
        • Declare your non-functional requirements in the QStorMan portlet.
        • Copy and paste the text returned by the portlet into your JDL file.
     2. Using the C++ programming library (libses):

        #include <LustreManager.h>
        #include <StoragePolicyFactory.h>

        using namespace lustre_api_library;

        LustreManager manager;
        StoragePolicy policy;
        // Declare the non-functional requirements for the new file.
        policy.setAverageReadTransferRate(50);
        policy.setCapacity(100);
        // Create the file on storage that satisfies the policy.
        int descriptor = manager.createFile("file_name.dat", &policy);

     3. Using the system C library:
        • Declare your non-functional requirements in the GOM knowledge base.
        • export LD_PRELOAD=<path_to_libses_wrapper_library>

  8. FiVO/QStorMan testing environment
     ACC Cyfronet AGH (Cracow):
     • Scientific Linux SL release 5.5 (Boron)
     • 2x Intel(R) Xeon(R) CPU L5420 @ 2.50GHz (4 cores, 1 thread per core)
     • 16056 MB RAM
     • ~12 TB storage capacity, ~150 MB/s read transfer rate, ~70 MB/s write transfer rate
     PCSS (Poznan):
     • Scientific Linux CERN SLC release 5.5 (Boron)
     • Intel(R) Xeon(R) CPU 5160 @ 3.00GHz (2 cores, 1 thread per core)
     • 1000 MB RAM
     • ~14 TB storage capacity, ~55 MB/s read transfer rate, ~46 MB/s write transfer rate
     ICM (Warsaw):
     • CentOS release 5.5 (Final)
     • Intel(R) Xeon(R) CPU X3430 @ 2.40GHz (4 cores, 1 thread per core)
     • 7975 MB RAM
     • ~5 TB storage capacity, ~50 MB/s read transfer rate, ~27 MB/s write transfer rate

  9. Testing scenario
     Scenario: simulates a Grid job that is scheduled to run in the most suitable data center. The job performs computation and then writes data.
     Scenario parameters:
     • Number of users: 3 (2 of them using QStorMan) and 4
     • File size: 512 MB
     • Number of files to write: 20, 30, 40
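To get a feel for the scale of the scenario, the volume written per job and an idealised write time can be estimated from the parameters above and the write rates listed for the testing environment. The sketch below is a back-of-the-envelope calculation with illustrative function names. It assumes a constant write rate with no contention, caching, or network effects, so it is not a prediction of the measured results.

```cpp
#include <cassert>

// Total volume written by one job, in MB.
double totalVolumeMb(int fileCount, double fileSizeMb) {
    assert(fileCount >= 0 && fileSizeMb >= 0.0);
    return fileCount * fileSizeMb;
}

// Idealised time, in seconds, to write that volume at a constant rate (MB/s).
double idealWriteSeconds(int fileCount, double fileSizeMb, double writeRateMbPerS) {
    assert(writeRateMbPerS > 0.0);
    return totalVolumeMb(fileCount, fileSizeMb) / writeRateMbPerS;
}
```

For 20 files of 512 MB (10240 MB in total), this gives roughly 146 s at ~70 MB/s and roughly 379 s at ~27 MB/s, which illustrates why the choice of storage can dominate the run time of a write-heavy job.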

  10. Results

  11. Conclusions and Future work
      • The goal of the presented research is to develop new approaches to storage management in the Grid environment.
      • Explicit definitions of non-functional requirements are necessary in data intensive applications.
      • The toolkit allows data-oriented Grid applications to be accelerated by ~45% without any modification of their source code.
      Future work:
      • Integration with Grid Queuing Systems
      • Integration with Virtual Organizations

  12. Do you want to know more?
      www.plgrid.pl or dkrol@agh.edu.pl
