Peer-to-Peer Technology in Grid Computing – The Igor File System and Beyond. April 2008. Kendy Kutzner, and Thomas Fuhrmann. Faculty of Informatics, System Architecture Group, Universität Karlsruhe (TH), Germany
Peer-to-Peer Technology in Grid Computing – The Igor File System and Beyond. April 2008. Kendy Kutzner and Thomas Fuhrmann. Faculty of Informatics, System Architecture Group, Universität Karlsruhe (TH), Germany; Department of Informatics, Chair of Network Architectures, Technical University Munich, Germany
The Pharma Challenge (1) [Diagram: data-producing sites such as ULB, GSK, and EMBL.] Various sites produce public and private data that is stored and exchanged as flat files, typically by FTP. To work with the data, a site must first download all the external data, even though it needs only a small fraction of it.
The Pharma Challenge (2) [Timeline diagram. Events over time: update at B, update at A, update at A. Site activity: download from A, preprocess data A (repeated), download from B, preprocess data B. Annotations: "Local data is up to date"; "Result relates to outdated version of B".] This becomes increasingly difficult: up to 1000 sources, weekly changes, and a preprocessing time of about a day.
The IGOR-FS Solution Today, sites use only a few sources in practice. Ideally, their system could handle cross-references between these sites. But in fact they use only a small part of that data, and they reload the whole data even though it did not change much. [Diagram: data from site A, contrasting "data downloaded so far …" with "… with IGOR-FS"; the highlighted regions are the data that is actually used and the data that actually changed.]
Agenda 1. Motivation 2. Peer-to-Peer Overlays 3. The Igor File System 4. Outlook & Summary
What's Peer-to-Peer? (1) [Diagram: Station A and Station B each run a stack of Layer A, Layer B, and Layer C above a shared physical medium; each layer exchanges data with its peer layer on the other station. Example labels: application data, TCP/IP.]
What's Peer-to-Peer? (2) Client-Server: planned and administered. Peer-to-Peer: self-organizing. Organic Growth • Each machine contributes. • Joining devices provide the resources they consume. Robust and Fault-Tolerant • No single point of failure. Research Question: • Which rules make the peer-to-peer system behave as desired?
Distributed Hash Tables (Application) [Diagram: Key → Hash Function → Hash Space [0, 2^128)]
Distributed Hash Tables Hash Space [0, 2^128)
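The mapping from keys to nodes on these two slides can be sketched as follows. This is a minimal illustration, not Igor's implementation: the slides do not name the hash function, so truncated SHA-256 is an assumption, as are the helper names `hash_id` and `responsible_node` and the rule that a key belongs to the first node at or after it on the ring.

```python
import hashlib
from bisect import bisect_left

HASH_BITS = 128
HASH_SPACE = 2 ** HASH_BITS  # identifiers live in [0, 2^128)

def hash_id(name: str) -> int:
    """Map a string to a point in the hash space (assumption: SHA-256 cut to 128 bits)."""
    digest = hashlib.sha256(name.encode("utf-8")).digest()
    return int.from_bytes(digest[:HASH_BITS // 8], "big")

def responsible_node(key_id: int, node_ids: list[int]) -> int:
    """Return the first node ID at or after key_id, wrapping around the ring."""
    ring = sorted(node_ids)
    index = bisect_left(ring, key_id)
    return ring[index % len(ring)]

# Hypothetical node and key names, for illustration only.
nodes = [hash_id(f"node-{i}") for i in range(8)]
key = hash_id("genome/chr1.fasta")
owner = responsible_node(key, nodes)
```

Because nodes and keys share one hash space, any participant can compute which node is responsible for a key without consulting a central directory.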
Trade-Off with Structured Routing Overlays
Overlay      Per-node state   Steps per message
Full mesh    O(N)             O(1)
Chord        O(log N)         O(log N)
Ring         O(1)             O(N)
Chord – A Distributed Hash Table Hash Space [0, 2^128)
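A Chord-style lookup can be sketched in a few lines. This is a simplified simulation under stated assumptions: the ring uses 16-bit IDs instead of 128-bit ones to keep the demo small, finger tables are built with global knowledge rather than by a join protocol, and all function names are illustrative.

```python
from bisect import bisect_left

M = 16            # demo ring with 16-bit IDs (the slides use 128-bit IDs)
RING = 2 ** M

def successor(ring, ident):
    """First node ID at or after ident on the sorted ring, wrapping around."""
    return ring[bisect_left(ring, ident % RING) % len(ring)]

def between(x, a, b):
    """True if x lies strictly between a and b when walking clockwise from a."""
    if a < b:
        return a < x < b
    return x > a or x < b      # interval wraps past zero (a == b: all but a)

def build_fingers(ring):
    """Finger i of node n points to the node responsible for n + 2^i."""
    return {n: [successor(ring, n + 2 ** i) for i in range(M)] for n in ring}

def lookup(fingers, start, key):
    """Route from start to the node responsible for key; return (node, hops)."""
    node, hops = start, 0
    while True:
        succ = fingers[node][0]
        if key == succ or between(key, node, succ):
            return succ, hops
        # Forward to the farthest finger that still precedes the key.
        node = next(f for f in reversed(fingers[node]) if between(f, node, key))
        hops += 1

# 32 distinct demo node IDs (7919 is odd, so the products differ modulo 2^16).
ring = sorted({(i * 7919) % RING for i in range(1, 33)})
fingers = build_fingers(ring)
owner, hops = lookup(fingers, ring[0], 40000)
```

Each node stores only M fingers, and every forwarding step at least halves the remaining distance to the key's predecessor, which is where the O(log N) state and O(log N) steps in the trade-off table come from.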
Proximity Awareness with Chord [Diagram: overlay nodes spread across the Internet, with links labeled "close" and "far".]
Network Coordinates Create a map of nodes. [Diagram: TCP connections between nodes, annotated with latencies of 1 ms, 2 ms, and 15 ms.]
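One way to build such a map is spring relaxation in the style of Vivaldi. The slides do not say which algorithm Igor uses, so the sketch below is an assumption throughout: the 2-D coordinates, the step size `delta`, and the function names are chosen for illustration only.

```python
import math

def distance(a, b):
    """Euclidean distance between two 2-D coordinates, read as predicted latency in ms."""
    return math.hypot(a[0] - b[0], a[1] - b[1])

def update(own, other, rtt_ms, delta=0.25):
    """Move own coordinate so the predicted latency gets closer to the measured one."""
    dist = distance(own, other)
    if dist == 0.0:
        direction = (1.0, 0.0)   # arbitrary direction when both points coincide
    else:
        direction = ((own[0] - other[0]) / dist, (own[1] - other[1]) / dist)
    error = rtt_ms - dist        # positive: too close on the map, so move away
    return (own[0] + delta * error * direction[0],
            own[1] + delta * error * direction[1])

def closest(own, candidates):
    """Pick the candidate with the smallest predicted latency, without probing it."""
    return min(candidates, key=lambda name: distance(own, candidates[name]))

# A node at the origin measures 10 ms to a peer that the map places 5 ms away.
moved = update((0.0, 0.0), (3.0, 4.0), rtt_ms=10.0)
```

Once coordinates have settled, a node can rank overlay neighbors by predicted latency with `closest`, which is what makes proximity-aware routing possible without measuring every pair.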
Agenda 1. Motivation 2. Peer-to-Peer Overlays 3. The Igor File System 4. Outlook & Summary
Igor File System Igor FS sits on top of FUSE. Thus, it appears to the operating system as a normal file system.
Chopping up Files Files stored in Igor FS are chopped into chunks. Chunk boundaries are chosen according to content, not file position. Inserting data therefore does not necessarily break chunk boundaries.
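Content-defined chunking can be sketched with a small rolling hash. The slides do not describe Igor's actual chunking function, so the gear-style hash, the 6-bit boundary mask, and the name `chunk` are assumptions made for this illustration.

```python
import hashlib

# Per-byte pseudo-random values for the rolling hash (derived deterministically).
GEAR = [int.from_bytes(hashlib.sha256(bytes([b])).digest()[:4], "big")
        for b in range(256)]
MASK = (1 << 6) - 1   # boundary when the low 6 bits are zero: about 64-byte chunks

def chunk(data: bytes) -> list[bytes]:
    """Split data at positions decided by the content of the last few bytes."""
    chunks, start, h = [], 0, 0
    for i, byte in enumerate(data):
        # Shifting left ages out old bytes: the low 6 bits depend on the last 6 bytes only.
        h = ((h << 1) + GEAR[byte]) & 0xFFFFFFFF
        if (h & MASK) == 0:
            chunks.append(data[start:i + 1])
            start = i + 1
    if start < len(data):
        chunks.append(data[start:])   # trailing bytes form the final chunk
    return chunks

# Deterministic pseudo-random demo data, and an edited copy with 5 bytes inserted.
data = b"".join(hashlib.sha256(i.to_bytes(4, "big")).digest() for i in range(600))
edited = data[:5000] + b"HELLO" + data[5000:]
old_chunks, new_chunks = chunk(data), chunk(edited)
changed = [c for c in new_chunks if c not in set(old_chunks)]
```

Because each boundary depends only on nearby bytes, the insertion changes only the chunks around the edit; all other chunks keep their content and can be reused. A production chunker would typically also enforce minimum and maximum chunk sizes, which this sketch omits.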
Encryption of Data [Diagram: Hash 1 yields the identifier; Hash 2 yields the crypto key.] A reader needs the ID to find the block and the key to decrypt it.
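The two-hash scheme can be sketched as below. Several details are assumptions, since the slide shows only the two hashes and their outputs: the key is derived here from the plaintext and the identifier from the encrypted block, the hash is SHA-256, and the cipher is a toy SHA-256 keystream that stands in for a real cipher and must not be used to protect data.

```python
import hashlib

def keystream(key: bytes, length: int) -> bytes:
    """Toy keystream from SHA-256 in counter mode (illustration only, not vetted)."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])

def xor_cipher(key: bytes, data: bytes) -> bytes:
    """XOR with the keystream; the same call encrypts and decrypts."""
    return bytes(a ^ b for a, b in zip(data, keystream(key, len(data))))

def seal_block(plaintext: bytes):
    """Return (block_id, key, ciphertext) for one block."""
    key = hashlib.sha256(b"key:" + plaintext).digest()      # assumption: key from content
    ciphertext = xor_cipher(key, plaintext)
    block_id = hashlib.sha256(b"id:" + ciphertext).digest()  # assumption: ID from ciphertext
    return block_id, key, ciphertext

def open_block(key: bytes, ciphertext: bytes) -> bytes:
    """Decrypt a block that was fetched by its ID."""
    return xor_cipher(key, ciphertext)

block_id, key, ciphertext = seal_block(b"ATGCATGC" * 8)
```

Under these assumptions, a storage node sees only `block_id` and `ciphertext`, so it can locate and serve a block without being able to read it, while identical plaintext blocks still map to the same ID.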
File System Structure Root block: mount via an (ID, key) tuple. From there, all folders and files can be read recursively.
Backup and Versioning • Blocks are identified by the hash of their content. • Thus, modifying a block changes its ID. • Thus, blocks are immutable. • Writing to Igor FS creates a new root block. • This means that nothing gets lost as long as you have the (ID, key) tuple.
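The versioning behavior follows directly from content addressing, as this minimal sketch shows. It is an illustration under simplifying assumptions: encryption is left out, a single flat directory stored as JSON plays the role of the root block, and `BlockStore`, `write_file`, and `read_file` are hypothetical names.

```python
import hashlib
import json

class BlockStore:
    """In-memory stand-in for the distributed store: block ID = hash of content."""
    def __init__(self):
        self._blocks = {}

    def put(self, payload: bytes) -> str:
        block_id = hashlib.sha256(payload).hexdigest()
        self._blocks[block_id] = payload   # existing blocks are never modified
        return block_id

    def get(self, block_id: str) -> bytes:
        return self._blocks[block_id]

def write_file(store, root_id, name, content: bytes) -> str:
    """Store content and return the ID of a NEW root block that references it."""
    directory = json.loads(store.get(root_id)) if root_id else {}
    directory[name] = store.put(content)
    return store.put(json.dumps(directory, sort_keys=True).encode("utf-8"))

def read_file(store, root_id, name) -> bytes:
    """Follow the root block to the file's block."""
    directory = json.loads(store.get(root_id))
    return store.get(directory[name])

store = BlockStore()
v1 = write_file(store, None, "notes.txt", b"first version")
v2 = write_file(store, v1, "notes.txt", b"second version")
```

Holding on to `v1` is enough to read the old state at any time, since the write that produced `v2` added blocks but changed none.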
Root Key Distribution The root owner periodically distributes new (ID, key) tuples.
IgorFS Performance • Caveat: So far, Igor FS is a prototype system only (PhD thesis project). It is being used in an EU project by several pharma research institutions. • Reading (block-wise): 23.5 MByte/s • Writing (block-wise): 87.6 MByte/s • Note that actual persistence on hard disk and distribution in the network are asynchronous. Thus, data rates drop only when the IgorFS cache memory of the machine is exhausted. • For comparison: NFS in the same setting (local LAN) has a read/write performance of about 20 MByte/s; EXT3 on a local hard disk reaches 100 MByte/s and above. • Conclusion: IgorFS is already on par with other distributed file systems, but it cannot and will not outperform local file systems.
Agenda 1. Motivation 2. Peer-to-Peer Overlays 3. The Igor File System 4. Outlook & Summary
Outlook to Future Enhancements • So far, only one writer. • In the future: multiple writers! This requires solving consistency issues. • So far, data is pulled by the reader. • In the future: important data is pushed by the writer to ensure that redundant copies exist as soon as possible. • So far, all data is kept forever. • In the future: only the data that readers will still request is kept.
Outlook to Future Enhancements • Compute processes are just data: code, heap, stack, execution frames, … • Our Igor system moves data to the place where it is needed. • Can it move threads around, too? • We believe it can! • Ongoing project to use our technology in sensor-actuator networks. • New project to use this technology with the IBM Cell processor. (Grant currently under negotiation)
IGOR-FS Feature Summary IGOR-FS works automatically and fully transparently as a distributed file system, as if you had magically already downloaded everything you need. Features: distributed storage, strong cryptography, backup and versioning.
Thank you! Questions? Thomas Fuhrmann, CS VIII – Network Architectures, Technical University Munich, Germany; IBDS System Architecture, University of Karlsruhe, Germany. fuhrmann@net.in.tum.de