EGEE is a project funded by the European Union under contract IST-2003-508833

LFC and Fireman Performance Measurements. Caitriana Nicholson / IT-GD, CERN / Glasgow University; Craig Munro / IT-GD, CERN / Brunel University. LCG Storage Management Workshop, CERN, 7th April 2005.

Presentation Transcript


  1. LFC and Fireman Performance Measurements
     Caitriana Nicholson / IT-GD, CERN / Glasgow University
     Craig Munro / IT-GD, CERN / Brunel University
     LCG Storage Management Workshop, CERN, 7th April 2005

  2. Overview
     • Aims of Testing
     • Test Methodology & Setup
     • LFC Performance Results
     • FiReMan Performance Results
     • Conclusions

  3. Aims of Testing
     • Data Challenges of 2004 exposed limitations in LCG Data Management tools
     • LCG File Catalog (LFC) developed to address problems with the RLS
     • Suite of tests developed to check the functionality and performance of the LFC
     • Comparison required of performance of: EDG RLS, Globus RLS, LFC, FiReMan

  4. Test Methodology
     • Multi-threaded C client program written to test each type of operation (insert, query, delete, etc.):
       ./create_files -d /grid/dteam/caitriana/insert/ -f $num_files -t $num_threads
     • C programs wrapped by Perl scripts
     • Typically, each operation performed several thousand times in the client program ($num_files=3000) and mean result returned
     • Client program called several times from Perl script and mean result taken
     • Any entries removed before next test run
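The two-level averaging described on this slide can be sketched as follows. This is a minimal illustration in Python, not the authors' C client or Perl wrapper: `run_client`, `run_test`, `operation` and `cleanup` are hypothetical names, and `operation` stands in for a single catalogue call such as an insert.

```python
import threading
import time
from statistics import mean


def run_client(operation, num_ops, num_threads):
    """Run `operation` num_ops times, split across num_threads threads.

    Returns the mean wall-clock time per operation, in seconds.
    """
    timings = []                 # per-operation durations from all threads
    lock = threading.Lock()      # protects `timings`

    def worker(count):
        local = []
        for _ in range(count):
            start = time.perf_counter()
            operation()
            local.append(time.perf_counter() - start)
        with lock:
            timings.extend(local)

    # Spread the operations as evenly as possible over the threads.
    base, extra = divmod(num_ops, num_threads)
    counts = [base + (1 if i < extra else 0) for i in range(num_threads)]
    threads = [threading.Thread(target=worker, args=(c,)) for c in counts]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return mean(timings)


def run_test(operation, num_ops, num_threads, repeats, cleanup=None):
    """Call the client several times and average the per-run means."""
    run_means = []
    for _ in range(repeats):
        run_means.append(run_client(operation, num_ops, num_threads))
        if cleanup is not None:
            cleanup()            # remove entries before the next run
    return mean(run_means)
```

A call such as `run_test(op, num_ops=3000, num_threads=20, repeats=5, cleanup=remove_entries)` would mirror the slide's setup, with `op` and `remove_entries` supplied by the caller.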

  5. LFC Test Setup
     • Oracle DB on Xeon 2.4GHz
     • PIII 1GHz, 512MB server running 20 threads
     • PIII 853MHz, 128MB client with configurable number of threads (single client tests)
     • 10 x PIII 1GHz, 512MB clients with configurable number of threads (multiple clients)
     • 100 Mb/s LAN
     • Insecure LFC
     • Comparison against published results for insecure catalogues
     • Security overheads now being tested
     • Quality of machines used should be noted when comparing LFC and RLS results!
     • SPEC CINT2000 values:

                          LFC   Globus RLS   EDG RLS
       Server             420   810          420
       Single Client      420   220          420
       Multiple Clients   400   220          420

  6. LFC Performance (i) - Inserts
     • Mean insert time as number of entries increased up to 40M remains below 30 ms
     • EDG mean insert time was ~40 ms with 500,000 entries
     • Insert rate, with increasing number of client threads, for ~1M entries
     • Increases up to ~200 adds/sec up to server thread limit
     • Globus RLS gave ~84 adds/sec when run with consistency

  7. LFC Performance (ii) - Queries
     • Rate of querying for a single LFN, increasing number of client threads, ~1M entries
     • Not really comparable with Globus results: RLS does 1-to-1 lookup; LFC stat() returns system metadata, checks permissions…
     • EDG RLS rate ~63 queries/sec for 1 thread; LFC with 1 thread gives ~90 queries/sec

  8. LFC Performance (ii) - Queries
     • Time to list and stat all replicas of a file proportional to number of replicas
     • Time to read a directory is directly proportional to directory size

  9. LFC Performance (ii) - Queries
     • Default buffer size in LFC is small (4 KB)
     • Tuning the buffer size leads to improved performance for readdir()
     • Time to read directory of 100000 entries measured with varying buffer sizes
     • If bulk queries implemented, they should show similar behaviour
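One plausible reason a larger buffer helps readdir() is that fewer client-server exchanges are needed to fetch the whole directory. The sketch below counts those exchanges under two assumptions that are not stated in the slides: every directory entry occupies a fixed number of bytes (100 is a made-up figure), and each buffer fill costs exactly one round trip. The function name `round_trips` is hypothetical.

```python
import math


def round_trips(num_entries, entry_bytes, buffer_bytes):
    """Estimate buffer fills needed to read a directory.

    Assumes fixed-size entries and one round trip per buffer fill
    (both are simplifying assumptions, not measured LFC behaviour).
    """
    entries_per_buffer = max(1, buffer_bytes // entry_bytes)
    return math.ceil(num_entries / entries_per_buffer)


# 100000 entries, hypothetical 100-byte entries:
small = round_trips(100000, 100, 4 * 1024)    # default 4 KB buffer
large = round_trips(100000, 100, 64 * 1024)   # hypothetical 64 KB buffer
```

Under these assumptions the 4 KB buffer needs 2500 fills and the 64 KB buffer needs 153, which is the kind of reduction that buffer tuning would be expected to exploit.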

  10. LFC Performance (iii) – Delete Rates
     • Delete rate, with increasing number of client threads, for ~1M entries
     • With 1 thread, time per delete ~16 ms
     • EDG RLS with 1 thread gave ~30 ms per delete
     • No comparable results for Globus RLS
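The slides quote some results as a time per operation and others as a rate. For a single thread issuing operations back to back (an assumption), the two are reciprocals, as this small sketch shows; `ops_per_second` is a hypothetical helper name.

```python
def ops_per_second(ms_per_op):
    """Convert a per-operation latency in milliseconds to a single-thread rate."""
    return 1000.0 / ms_per_op


lfc_delete_rate = ops_per_second(16)   # LFC, 1 thread: 62.5 deletes/sec
edg_delete_rate = ops_per_second(30)   # EDG RLS, 1 thread: about 33.3 deletes/sec
```

This puts the single-thread delete figures on the same footing as the queries/sec and adds/sec figures quoted on the neighbouring slides.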

  11. LFC Performance (iv) – chdir & transactions
     • Performing a chdir() before many simultaneous operations in the same directory improves performance significantly
     • Using transactions gave loss of performance with single client
     • Extra time being spent in DB
     • Requires further investigation

  12. LFC Performance (v) - Multiple Client Tests
     • Tests run simultaneously on varying number of client machines
     • Clients all running with 10 threads
     • Insert and query rates measured with: single operations; transactions (100 operations per transaction)
     • No reduction in operation rate as number of clients increases
     [Figures: two panels, "Single Operations" and "Transactions"]
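Grouping 100 operations per transaction, as in these tests, amounts to committing once per batch rather than once per operation. The sketch below shows the batching pattern only; `FakeCatalog` is a hypothetical in-memory stand-in and does not reproduce the LFC client API.

```python
class FakeCatalog:
    """Hypothetical in-memory stand-in for a catalogue client (not the LFC API)."""

    def __init__(self):
        self.entries = set()
        self.commits = 0
        self._pending = []

    def begin(self):
        self._pending = []

    def insert(self, name):
        self._pending.append(name)

    def commit(self):
        self.entries.update(self._pending)
        self._pending = []
        self.commits += 1


def insert_in_transactions(catalog, names, ops_per_transaction=100):
    """Insert all names, committing once per batch of ops_per_transaction."""
    for start in range(0, len(names), ops_per_transaction):
        catalog.begin()
        for name in names[start:start + ops_per_transaction]:
            catalog.insert(name)
        catalog.commit()
```

With 250 names and the default batch size, the loop issues three commits (100, 100 and 50 entries).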

  13. LFC Summary
     • LFC has been tested and shown to be scalable to at least: 40 million entries; 100 client threads
     • Performance improved in comparison with the RLSs
     • Stable: continuous running at high load for extended periods of time with no crashes
     • Based on code which has been in production for > 4 years
     • Tuning required to improve bulk performance

  14. FiReMan Test Setup
     • LFC and FiReMan tests performed on identical hardware
     • Catalogue Server and Oracle DB running on same machine: Dual Xeon 2.4GHz with 2048 MB RAM
     • PIII 800 MHz, 512 MB RAM client with configurable number of threads
     • 100 Mb/s LAN
     • Insecure FiReMan and LFC

  15. Limitations of the Setup
     • Limited time -> limited scope of tests
     • Only one multi-threaded client used to test server
     • Difficult to explore all limits of server
     • All tests performed over LAN, need to look at WAN
     • Influence of round trip time
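To see why round trip time matters for a WAN test, consider a simple upper bound: if each client thread sends one synchronous request and waits for the reply before sending the next (an assumption about the client, not a measured property), then a thread cannot complete more than one request per round trip. The helper name `max_rate` and the 100 ms round trip time used in the example are hypothetical.

```python
def max_rate(num_threads, rtt_seconds, entries_per_request=1):
    """Upper bound on entries/sec for synchronous clients.

    Ignores server processing time, so real rates would be lower.
    """
    return num_threads * entries_per_request / rtt_seconds


# Hypothetical WAN with a 100 ms round trip, 10 client threads:
single_bound = max_rate(10, 0.1)                          # one entry per request
bulk_bound = max_rate(10, 0.1, entries_per_request=100)   # 100 entries per request
```

Under this model the single-entry bound is about 100 operations/sec while the bulk bound is 100 times higher, which suggests that bulk operations would matter more over a WAN than over the LAN used here.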

  16. LFC Comparison
     • LFC tests repeated on identical hardware to give fair comparison
     • LFC tests performed with relative paths (chdir) and without transactions
     [Figures: Inserts / Second and Queries / Second vs. Number of Threads (1 to 100); series: LFC - New, LFC - Old]

  17. FiReMan Performance - Inserts
     • Inserted ~1M entries in bulk with insert time ~5 ms
     • Insert rate for different bulk sizes
     [Figure: Inserts / Second vs. Number of Threads (1 to 50); series: Single, Bulk 1, Bulk 10, Bulk 100, Bulk 500, Bulk 1000, Bulk 5000]

  18. FiReMan Performance - Insert
     • Comparison with LFC
     [Figure: Inserts / Second vs. Number of Threads (1 to 100); series: Fireman - Single Entry, Fireman - Bulk 100, LFC]

  19. FiReMan Performance - Queries
     • Query rate for an LFN
     [Figure: Entries Returned / Second vs. Number of Threads (up to 50); series: Fireman Single, Fireman Bulk 1, Fireman Bulk 10, Fireman Bulk 100, Fireman Bulk 500, Fireman Bulk 1000, Fireman Bulk 5000]

  20. FiReMan Performance - Queries
     • Comparison with LFC
     [Figure: Entries Returned / Second vs. Number of Threads (1 to 100); series: Fireman - Single Entry, Fireman - Bulk 100, LFC]

  21. FiReMan Performance - Delete
     • Rate at which LFNs can be deleted from catalogue
     [Figure: Deletes / Second vs. Number of Threads (1 to 50); series: Single, Bulk 1, Bulk 10, Bulk 100, Bulk 500, Bulk 1000, Bulk 5000]

  22. FiReMan Performance - Delete
     • Comparison with LFC
     [Figure: Deletes / Second vs. Number of Threads (1 to 50); series: Fireman Single, Fireman Bulk 100, LFC]

  23. Conclusions
     • Both LFC and FiReMan offer large improvements over RLS
     • Still some issues remaining: scalability of FiReMan; bulk entry for LFC
     • More work needed to understand performance and bottlenecks
     • Need to test some real use cases

  24. Questions?
