
Peer-to-Peer Technology in Grid Computing – The Igor File System and Beyond

Peer-to-Peer Technology in Grid Computing – The Igor File System and Beyond. April 2008. Kendy Kutzner and Thomas Fuhrmann. Faculty of Informatics, System Architecture Group, Universität Karlsruhe (TH), Germany


Presentation Transcript


  1. Peer-to-Peer Technology in Grid Computing – The Igor File System and Beyond. April 2008. Kendy Kutzner and Thomas Fuhrmann. Faculty of Informatics, System Architecture Group, Universität Karlsruhe (TH), Germany; Department of Informatics, Chair of Network Architectures, Technical University Munich, Germany.

  2. The Pharma Challenge (1). [Diagram labels: ULB, GSK, EMBL, etc.] Various sites produce public and private data that is stored and exchanged as flat files, typically by FTP. To work with the data, a site must first download all the external data, even though it needs only a small fraction of it.

  3. The Pharma Challenge (2). [Timeline diagram labels: Update at B, Update at A, Update at A; Download from A, Preprocess data A, Download from B, Preprocess data B; "Local data is up to date"; "Result relates to outdated version of B".] This becomes increasingly difficult: up to 1000 sources, weekly changes, preprocessing time about a day.

  4. The IGOR-FS Solution. Today, sites use only a few sources in practice. Ideally, their system could handle cross-references between these sites. But in fact they use only a small part of that data, and they reload all of it even though it did not change much. [Diagram labels: Data downloaded so far … with IGOR-FS; Data from site A; Data that is actually used; Data that actually changed.]

  5. Agenda 1. Motivation 2. Peer-to-Peer Overlays 3. The Igor File System 4. Outlook & Summary

  6. What's Peer-to-Peer? (1) [Diagram labels: Station A and Station B, each with Layer A, Layer B, and Layer C; Data at each layer; Application; TCP/IP; Physical Medium.]

  7. What's Peer-to-Peer? (2) Client-Server: planned and administered. Peer-to-Peer: self-organizing. Organic growth: • Each machine contributes. • Joining devices provide the resources they consume. Robust and fault-tolerant: • No single point of failure. Research question: • Which rules make the peer-to-peer system behave as desired?

  8. Distributed Hash Tables. [Diagram labels: (Application) Key; Hash Function; Hash Space [0, 2^128).]

  9. Distributed Hash Tables. Hash Space [0, 2^128).
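
As a rough illustration of slides 8 and 9, the sketch below maps an application key into the hash space [0, 2^128) and picks the node responsible for it. MD5 is used only because its digest is 128 bits wide, and the successor rule is borrowed from Chord; the slides name neither, so both are assumptions, as are the function names.

```python
import hashlib
from bisect import bisect_left

HASH_SPACE = 2 ** 128  # size of the hash space shown on the slides


def key_to_id(key: str) -> int:
    """Map an application key to a point in [0, 2^128).

    Assumption: MD5 is chosen only for its 128-bit digest.
    """
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest, "big")


def responsible_node(key_id: int, node_ids: list[int]) -> int:
    """Return the first node ID at or after key_id, wrapping around the ring.

    Assumption: Chord-style successor rule.
    """
    ring = sorted(node_ids)
    index = bisect_left(ring, key_id)
    return ring[index % len(ring)]
```

Any node can compute `responsible_node(key_to_id("some/file"), known_nodes)` locally; the harder problem, addressed by the routing overlays on the following slides, is doing so without knowing every node.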

  10. Trade-Off with Structured Routing Overlays
      Overlay     Per-node state   Per-message steps
      Full mesh   O(N)             O(1)
      Chord       O(log N)         O(log N)
      Ring        O(1)             O(N)

  11. Chord – A Distributed Hash Table. Hash Space [0, 2^128).

  12. Chord – A Structured Routing Overlay

  13. Chord – A Structured Routing Overlay (2)
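
The Chord slides survive only as titles, so the following is a toy simulation of Chord-style finger-table routing rather than anything taken from the presentation. It uses a 6-bit identifier ring instead of 128 bits, and every node sees the global node list to build its fingers, which a real overlay would not have. All names are hypothetical.

```python
from bisect import bisect_left

M = 6          # identifier bits (toy value; the slides use 128)
RING = 2 ** M  # size of the identifier ring


def successor(ident: int, nodes: list[int]) -> int:
    """First node at or after ident on the ring (nodes must be sorted)."""
    index = bisect_left(nodes, ident % RING)
    return nodes[index % len(nodes)]


def in_open_interval(x: int, a: int, b: int) -> bool:
    """True if x lies strictly between a and b going clockwise."""
    if a < b:
        return a < x < b
    return x > a or x < b


def finger_table(node: int, nodes: list[int]) -> list[int]:
    """Finger k points to the successor of node + 2^k."""
    return [successor(node + 2 ** k, nodes) for k in range(M)]


def lookup(start: int, key: int, nodes: list[int]) -> tuple[int, int]:
    """Route from node `start` to the node responsible for `key`.

    Returns (responsible node, number of forwarding hops).
    """
    node, hops = start, 0
    while True:
        if key == node:
            return node, hops
        succ = successor(node + 1, nodes)
        if key == succ or in_open_interval(key, node, succ):
            return succ, hops
        # Forward to the farthest finger that still precedes the key.
        for finger in reversed(finger_table(node, nodes)):
            if in_open_interval(finger, node, key):
                node, hops = finger, hops + 1
                break
```

With the ten-node ring `[1, 8, 14, 21, 32, 38, 42, 48, 51, 56]`, `lookup(8, 54, nodes)` forwards via nodes 42 and 51 and returns node 56. Each hop roughly halves the remaining distance, which is where the O(log N) steps in the trade-off table come from.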

  14. Proximity Awareness with Chord. [Diagram labels: far, Internet, close.]

  15. Network Coordinates. Create a map of nodes. [Diagram labels: 1 ms, 15 ms, 2 ms, TCP connection.]

  16. Agenda 1. Motivation 2. Peer-to-Peer Overlays 3. The Igor File System 4. Outlook & Summary

  17. Igor File System. Igor FS sits on top of FUSE. Thus it appears as a normal file system to the operating system.

  18. Chopping up Files. Files stored in Igor FS are chopped into chunks. Chunk boundaries are chosen according to content, not file position. Inserting data does not necessarily break chunk boundaries.
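
The slide does not say how Igor FS picks its chunk boundaries. The sketch below shows one common way to get content-defined boundaries, a Gear-style rolling hash, purely as an assumption; the table seed, the cut rule, and the function name are all hypothetical.

```python
import random

# Hypothetical parameters, chosen for a small demo only.
_rng = random.Random(42)
GEAR = [_rng.getrandbits(32) for _ in range(256)]  # one random value per byte
CUT_BITS = 6  # cut when the top 6 hash bits are zero (about 1 in 64 positions)


def chunk(data: bytes) -> list[bytes]:
    """Split data at positions determined by the preceding 32 bytes only."""
    chunks, start, h = [], 0, 0
    for i, byte in enumerate(data):
        # Gear rolling hash: a byte's influence is shifted out after 32 steps,
        # so the hash depends on local content, not on the file offset.
        h = ((h << 1) + GEAR[byte]) & 0xFFFFFFFF
        if h >> (32 - CUT_BITS) == 0:
            chunks.append(data[start:i + 1])
            start = i + 1
    if start < len(data):
        chunks.append(data[start:])  # trailing bytes after the last cut
    return chunks
```

Because each cut depends only on nearby bytes, inserting data near the start of a file changes the chunks around the insertion, while later chunks typically come out byte-for-byte identical and need not be stored or transferred again. A production chunker would also enforce minimum and maximum chunk sizes, which this sketch omits.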

  19. Encryption of Data. Hash 1 → Identifier. Hash 2 → Crypto key. The reader needs the ID to find the block and the key to decrypt it.
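
The slide does not state what the two hashes are computed over. One plausible reading, assumed here, is that the key is a hash of the plaintext and the identifier is a hash of the resulting ciphertext. The XOR keystream below is a stand-in so that the sketch needs only the standard library; it is not a vetted cipher and not what Igor FS uses.

```python
import hashlib


def _keystream(key: bytes, length: int) -> bytes:
    """Illustrative keystream (SHA-256 in counter mode). Not a vetted cipher."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])


def _xor(data: bytes, key: bytes) -> bytes:
    """XOR data with the keystream; the same call encrypts and decrypts."""
    return bytes(a ^ b for a, b in zip(data, _keystream(key, len(data))))


def store_block(store: dict, plaintext: bytes) -> tuple[bytes, bytes]:
    """Encrypt a block, store it under its ID, and return (ID, key)."""
    key = hashlib.sha256(plaintext).digest()        # assumed "Hash 2"
    ciphertext = _xor(plaintext, key)
    block_id = hashlib.sha256(ciphertext).digest()  # assumed "Hash 1"
    store[block_id] = ciphertext
    return block_id, key


def load_block(store: dict, block_id: bytes, key: bytes) -> bytes:
    """Fetch a block by ID, verify it, and decrypt it with the key."""
    ciphertext = store[block_id]
    if hashlib.sha256(ciphertext).digest() != block_id:
        raise ValueError("block does not match its identifier")
    return _xor(ciphertext, key)
```

Under this reading, a node that stores the block sees only the ID and the ciphertext, and identical plaintext blocks always map to the same ID, so they are stored once.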

  20. File System Structure. Root block: mount via (ID, key) tuple. From there, all folders and files can be read recursively.

  21. Backup and Versioning • Blocks are identified by their hashed content. • Thus, modifying a block changes its ID. • Thus, blocks are immutable. • Writing to Igor FS creates a new root block. • This means that nothing gets lost as long as you have the (ID, key) tuple.
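
The versioning behaviour described on slide 21 can be sketched with a tiny content-addressed store. Encryption is left out for brevity, the directory is a flat JSON map, and the class and function names are hypothetical; none of this is the actual Igor FS block format.

```python
import hashlib
import json
from typing import Optional


class BlockStore:
    """Immutable blocks addressed by the SHA-256 hash of their content."""

    def __init__(self) -> None:
        self._blocks: dict[str, bytes] = {}

    def put(self, data: bytes) -> str:
        block_id = hashlib.sha256(data).hexdigest()
        self._blocks.setdefault(block_id, data)  # never overwritten
        return block_id

    def get(self, block_id: str) -> bytes:
        return self._blocks[block_id]


def write_file(store: BlockStore, root_id: Optional[str],
               name: str, content: bytes) -> str:
    """Store content under name and return the ID of the NEW root block."""
    entries = json.loads(store.get(root_id)) if root_id else {}
    entries[name] = store.put(content)
    return store.put(json.dumps(entries, sort_keys=True).encode("utf-8"))


def read_file(store: BlockStore, root_id: str, name: str) -> bytes:
    """Read a file as it existed in the version identified by root_id."""
    entries = json.loads(store.get(root_id))
    return store.get(entries[name])
```

Every write yields a new root ID while the old root and its blocks stay readable, so keeping an old root ID is enough to read that version later.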

  22. Root Key Distribution Root owner distributes new (ID,key) tuples periodically.

  23. IgorFS Performance • Caveat: • So far, Igor FS is a prototype system only (PhD thesis project). • It is being used in an EU project by several pharma research institutions. • Reading (block-wise): 23.5 MByte/s • Writing (block-wise): 87.6 MByte/s • Note that actual persistence on hard disk and distribution in the network are asynchronous. Thus data rates drop only when the IgorFS cache memory of the machine is exhausted. • Note that • NFS in the same setting (local LAN) has read/write performance of about 20 MByte/s • EXT3 on a local hard disk reaches 100 MByte/s and above • Conclusion: • IgorFS is already on par with other distributed file systems. • But it cannot and will not outperform local file systems.

  24. Agenda 1. Motivation 2. Peer-to-Peer Overlays 3. The Igor File System 4. Outlook & Summary

  25. Outlook to Future Enhancements • So far, only one writer. • In the future: Multiple writers! This requires solving consistency issues. • So far, data is pulled by the reader. • In the future: Important data is pushed by the writer to ensure that redundant copies exist as soon as possible. • So far, all data is kept forever. • In the future: Only data that readers will request in the future is kept.

  26. Outlook to Future Enhancements • Compute processes are just data: • Code, heap, stack, execution frames, … • Our Igor system moves data to the place where it is needed. • Can it move threads around, too? • We believe it can! • Ongoing project to use our technology in sensor-actuator networks. • New project to use this technology with the IBM Cell processor. (Grant currently under negotiation)

  27. IGOR-FS Feature Summary. IGOR-FS works automatically and fully transparently as a distributed file system – as if you had magically already downloaded everything you need. Features: distributed storage, strong cryptography, backup & versioning.

  28. Thank you! Questions? Thomas Fuhrmann, CS VIII – Network Architectures, Technical University Munich, Germany; IBDS System Architecture, University of Karlsruhe, Germany; fuhrmann@net.in.tum.de
