Contents – Large-Scale Distributed Systems
• Peer-to-Peer Data Mining
• DataMiningGrid project
• QosCosGrid project
• Distributed runtime for multithreaded Java
• Distributed Model Checking
Large-Scale Distributed Data Mining [SIGMOD’01, ICDM’03-1, ICDM’03-2, SDM’04, CCGRID’04, HPDC’04, KDD’04, DCOSS’05]
Data Mining Applications for P2P
• Current technology trends allow customers to collect huge amounts of data
  • e-economy, cheap storage, high-speed connectivity
• P2P technology enables customers to share data
• Customer-side data mining, mirroring what corporations do
  • Unbiased recommendations – e-Mule
  • Product Lifetime Cost (as opposed to CLV)
Data Mining a P2P Database
• Impossible to collect the data
  • Privacy, size, processing power
• Distributed algorithms
  • In-network processing: push processing to the peers
• Internet scale
  • No global operators, no synchronization, no global communication
• Ever-changing data and system
  • Failures, crashes, joins, departures
  • Data modified faster than it propagates
  • Incremental algorithms
• Ad hoc, anytime results
A P2P database (Jan 2004): 60M users, 5M simultaneously connected, 45M downloads/month, 900M shared files
How Not to Data Mine P2P
• Many data mining algorithms use decomposable statistics (Avg, Var, cross tables, etc.)
• Global statistics can be calculated using a (distributed) sum reduction … or can they?
  • Synchronization
  • Bandwidth requirements
  • Failures
  • Consistency
• Literature: old-style parallelism
  • rigid pattern, not data-adaptive
  • no failure tolerance
  • limited scalability
  • one-time computation
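A minimal sketch (not from the slides) of why statistics such as Avg and Var are decomposable: each peer condenses its local database into a small summary, and a sum reduction over the summaries recovers the exact global values. Class and method names are illustrative.

```java
// Illustrative: Avg and Var are "decomposable" statistics.
// Each peer reduces its local data to (count, sum, sum of squares);
// a sum reduction over these summaries yields the exact global statistics.
import java.util.List;

final class MomentSummary {
    long count;
    double sum;
    double sumSq;

    static MomentSummary ofLocalData(double[] values) {
        MomentSummary s = new MomentSummary();
        for (double v : values) {
            s.count++;
            s.sum += v;
            s.sumSq += v * v;
        }
        return s;
    }

    // The merge operator used by the sum reduction: component-wise addition.
    MomentSummary merge(MomentSummary other) {
        MomentSummary s = new MomentSummary();
        s.count = count + other.count;
        s.sum = sum + other.sum;
        s.sumSq = sumSq + other.sumSq;
        return s;
    }

    double mean()     { return sum / count; }
    double variance() { double m = mean(); return sumSq / count - m * m; }
}

// Aggregating the per-peer summaries. In a real P2P setting this reduction is exactly
// the hard part: it needs synchronization and breaks under churn, as the slide notes.
class SumReductionDemo {
    static MomentSummary reduce(List<MomentSummary> perPeer) {
        return perPeer.stream().reduce(new MomentSummary(), MomentSummary::merge);
    }
}
```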
An (immature) approach: Gossip, Sampling
• Inaccurate
• Hard to decompose
• Hard to employ in iterative data mining algorithms
• Assume global knowledge (N, mixing times, …)
• May work eventually
• Work in progress
Successful approach: Local Algorithms
• Every peer's result depends on the data gathered from a (small) environment of peers
• Size of the environment may depend on the problem/instance at hand
• Eventual correctness guaranteed (assuming stabilization)
Properties of Local Algorithms
• Scalability
• Robustness
• Incrementality
• Energy efficiency
• Asynchrony
What is a model? An incremental, ad hoc view of the data
Local Majority Voting
[Wolff & Schuster – ICDM'03. Requires a spanning tree]
(Figure: a spanning tree of peers X, Y, Z, W exchanging votes)
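The following is a rough, illustrative sketch of the local-majority idea on a spanning tree: a peer stays silent towards a neighbour as long as what the two have already agreed on their edge supports the peer's own conclusion, so messages flow only where the data is contentious. It is a simplified restatement, not the exact condition of the ICDM'03 algorithm, and all names (localExcess, sentTo, recvFrom) are mine.

```java
// Simplified local majority voting on a spanning tree (illustrative only).
import java.util.HashMap;
import java.util.Map;

class MajorityPeer {
    int localExcess;                                         // local (#yes - threshold-weighted #no)
    final Map<Integer, Integer> sentTo = new HashMap<>();    // last value reported to each neighbor
    final Map<Integer, Integer> recvFrom = new HashMap<>();  // last value received from each neighbor

    // The peer's current knowledge of the global excess.
    int knowledge() {
        return localExcess + recvFrom.values().stream().mapToInt(Integer::intValue).sum();
    }

    // Agreement on the edge to neighbor j: what both endpoints have told each other.
    int agreement(int j) {
        return sentTo.getOrDefault(j, 0) + recvFrom.getOrDefault(j, 0);
    }

    // Core locality rule (simplified): stay silent towards j as long as the edge agreement
    // lies between zero and the peer's own knowledge; otherwise j might reach a different
    // conclusion about the majority, so send an update that raises the agreement to knowledge().
    Integer maybeMessageFor(int j) {
        int delta = knowledge();
        int agree = agreement(j);
        boolean safe = (delta >= 0 && agree >= 0 && agree <= delta)
                    || (delta <= 0 && agree <= 0 && agree >= delta);
        if (safe) return null;                               // no message needed: locality in action
        int newReport = delta - recvFrom.getOrDefault(j, 0);
        sentTo.put(j, newReport);
        return newReport;                                    // deliver to neighbor j
    }

    void onReceive(int j, int value) {
        recvFrom.put(j, value);                              // then re-check maybeMessageFor(...) for all neighbors
    }
}
```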
Locality (experimental setup)
• 1,600 nodes
• All initiated at once
• Local DB of 10K transactions
• Lockstep execution
• Run until there are no further messages
Dynamic Behavior – 1M peers
• At every simulator step, 48% set input bits
• Noise levels: 1% and 0.1%
Local Majority Voting Variations – Private Votes [CCGrid’04, HPDC'04, KDD’04]
• K-Privacy
• Oblivious counters
• Homomorphic encryption
(Figure: two private inputs x and y combined into x + y)
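To convey the flavour of "private votes" without the specific K-privacy, oblivious-counter, or homomorphic constructions of the cited papers, here is a minimal additive-secret-sharing sketch: each vote is split into random shares, so only the final tally is ever reconstructed. Names and parameters are illustrative and this is not the papers' mechanism.

```java
// Illustrative flavor of "private votes": additive secret sharing over Z_p.
import java.math.BigInteger;
import java.security.SecureRandom;
import java.util.ArrayList;
import java.util.List;

class PrivateVoteDemo {
    static final BigInteger P = BigInteger.probablePrime(128, new SecureRandom());
    static final SecureRandom RNG = new SecureRandom();

    // Split a vote into n random shares that sum to the vote modulo P.
    static List<BigInteger> share(long vote, int n) {
        List<BigInteger> shares = new ArrayList<>();
        BigInteger acc = BigInteger.ZERO;
        for (int i = 0; i < n - 1; i++) {
            BigInteger r = new BigInteger(P.bitLength() - 1, RNG).mod(P);
            shares.add(r);
            acc = acc.add(r).mod(P);
        }
        shares.add(BigInteger.valueOf(vote).subtract(acc).mod(P));
        return shares;
    }

    // Each aggregator adds the shares it received; the sum of partial sums is the tally.
    static BigInteger combine(List<BigInteger> partialSums) {
        return partialSums.stream().reduce(BigInteger.ZERO, BigInteger::add).mod(P);
    }

    // Interpret the tally as a signed value (votes may be +1/-1), assuming |sum| < P/2.
    static long decode(BigInteger tally) {
        BigInteger half = P.shiftRight(1);
        return (tally.compareTo(half) > 0 ? tally.subtract(P) : tally).longValueExact();
    }
}
```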
A Decomposition Methodology
• Decompose the data mining process into primitives
  • Primitives are simpler
• Find local distributed algorithms for the primitives
  • Efficient in the “common case”
• Recompose the data mining process from the primitives
  • Maintain correctness
  • Maintain locality
  • Maintain asynchrony
Results: Association Rules [ICDM’03, IEEE Transactions on Systems, Man, and Cybernetics, Part B], Hill Climbing [DCOSS’05], Facility Location [Journal of Grid Computing].
Local Majority and Associations
• An itemset X is frequent if more than MinFreq of the transactions contain X; same for Y
• A rule X ⇒ Y is confident if more than MinConf of the transactions that contain X also contain Y
Local Majority and Associations
• Find that {Apples} is frequent (Apples: 80%)
• And that {Bananas} is frequent (Bananas: 80%)
• Then compare the frequencies of {Apples, Bananas} and {Apples} (Apples ∧ Bananas: 75%)
• Three votings!
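A sketch (with illustrative names and thresholds) of how the frequency and confidence tests reduce to majority votes over transactions, which is why testing the rule Apples ⇒ Bananas needs three votings:

```java
// Illustrative reduction of frequency / confidence tests to per-transaction majority votes.
class AssociationVotes {
    // "X is frequent": sum over all transactions of (contains(X) ? 1 - minFreq : -minFreq) > 0
    static double frequencyVote(boolean containsX, double minFreq) {
        return (containsX ? 1.0 : 0.0) - minFreq;
    }

    // "X => Y is confident": only transactions containing X participate;
    // sum over them of (contains(X and Y) ? 1 - minConf : -minConf) > 0
    static double confidenceVote(boolean containsX, boolean containsXY, double minConf) {
        if (!containsX) return 0.0;                          // transaction abstains
        return (containsXY ? 1.0 : 0.0) - minConf;
    }

    // A rule such as Apples => Bananas therefore needs three majority votes:
    // {Apples} frequent, {Bananas} frequent, and the confidence vote above.
    static boolean exampleDecision(double sumFreqApples, double sumFreqBananas, double sumConf) {
        return sumFreqApples > 0 && sumFreqBananas > 0 && sumConf > 0;
    }
}
```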
Performance
• Three scans of the local databases required for privately reaching 90% accuracy [HPDC’04, SIGKDD’04]
• Two scans required for the ‘honest’ algorithm [CCGRID’04]
• Only a single scan in the ‘plain’ algorithm [ICDM’03]
The Facility Location Problem
• A large network of motion sensors logs data
• Second-tier, high-powered relays assist in data collection
• Both resources are limited
• Which relays to use?
Facility Location Problem
• Choose which locations to activate
• Minimizing the sum of costs
  • Distance based
  • Data based
• Dynamic
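For intuition only, a centralized greedy heuristic for the (uncapacitated) facility location objective: repeatedly open the relay whose activation lowers the total opening-plus-assignment cost the most. The cited work solves this with local distributed algorithms; this sketch is not that algorithm, and the cost arrays are illustrative.

```java
// Centralized greedy facility location, for flavor only (not the papers' local algorithm).
// costOpen[f] = cost of activating relay f; dist[s][f] = cost of assigning sensor s to relay f.
import java.util.HashSet;
import java.util.Set;

class GreedyFacilityLocation {
    static Set<Integer> solve(double[] costOpen, double[][] dist) {
        int sensors = dist.length, facilities = costOpen.length;
        Set<Integer> open = new HashSet<>();
        double current = Double.POSITIVE_INFINITY;           // no relay open: infinite assignment cost
        while (true) {
            int best = -1;
            double bestTotal = current;
            for (int f = 0; f < facilities; f++) {
                if (open.contains(f)) continue;
                open.add(f);                                  // tentatively open f
                double total = totalCost(costOpen, dist, open, sensors);
                open.remove(f);
                if (total < bestTotal) { bestTotal = total; best = f; }
            }
            if (best < 0) return open;                        // no facility improves the total cost
            open.add(best);
            current = bestTotal;
        }
    }

    static double totalCost(double[] costOpen, double[][] dist, Set<Integer> open, int sensors) {
        double cost = open.stream().mapToDouble(fac -> costOpen[fac]).sum();
        for (int s = 0; s < sensors; s++) {
            double assign = Double.POSITIVE_INFINITY;
            for (int f : open) assign = Math.min(assign, dist[s][f]);
            cost += assign;                                   // each sensor uses its nearest open relay
        }
        return cost;
    }
}
```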
Facility Location Problem: Cross-Domain Clustering
• e-Mule users, each having a music database
• Search can improve if users are grouped by their preferences – clustering
• Natural to talk of preferences in terms of a music ontology – Country and Soul, Rock and 80's, Classics and Jazz, etc.
• The ontology provides important background knowledge
(Figure: genre lattice – Eighties, Rock, Classics and their intersections – with example artists such as R.E.M., King Crimson, Alanis Morissette, Kate Bush, Beethoven, Sting, Dire Straits)
Facility Location and P2P Search
• Choose representations
• Minimizing the sum of costs
  • Discrepancy based
  • Private
  • Dynamic
• Given the name of a song, search only those peers that show interest in the song’s category
(Figure: the same genre lattice as above)
Dynamic Experiments
• Switch databases among peers at the average edge delay
• 98% accuracy retained
Theoretical Foundations
• Veracity Radius [PODC’06, DISC’06]
(Figure: parts of the network computing the correct result vs. the wrong result)
Transparent and Portable Scalable Distributed Java [Clusters’03, OOPSLA’04]
JavaSplit: Highly Available Distributed Java
• A runtime for distributed environments that executes standard multithreaded Java
• Transparent – the programmer is completely unaware of the underlying distribution
• Scalable – suitable for very large (wide-area) clusters
• Portable/Heterogeneous – runs on different kinds of machines with different operating systems and various JVM brands
• Fault tolerant – preserves execution correctness despite node crashes
• Achieved through bytecode instrumentation
Bytecode Rewriting
(Figure: the Bytecode Rewriter turns a multithreaded Java application into a fault-tolerant distributed Java application, linking in the replication-based DSM module, the thread checkpointing module, and the speculative lock acquisition module)
JavaSplit Overview
• Rewriting intercepts synchronization, accesses to shared data, etc.
• Threads and objects are distributed among the machines
(Figure: the Bytecode Rewriter combines application bytecode with the runtime modules into instrumented bytecode, which runs on many standard JVMs over an IP-based network)
Benefits of the Bytecode Instrumentation Approach
• Cross-platform portability
• Each node can locally optimize its JVM
  • Can use a Just-in-Time (JIT) compiler
• No need to create and install a specialized JVM
• The result is a distributed JVM built on top of standard JVMs
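A source-level illustration of the kind of transformation the rewriter performs at the bytecode level (monitorenter/monitorexit, getfield/putfield): lock operations and shared-field accesses are redirected through the runtime. DsmRuntime and its methods are invented names for this sketch, not JavaSplit's actual API.

```java
// Conceptual, source-level view of the instrumentation; the real system rewrites bytecode.
class Account {
    int balance;

    // Original user code:
    // synchronized (this) { balance += amount; }

    // After instrumentation, every lock operation and shared-field access is redirected
    // through the runtime, which fetches/updates replicas via the DSM protocol.
    void deposit(int amount) {
        DsmRuntime.acquireLock(this);              // replaces monitorenter
        try {
            DsmRuntime.beforeRead(this);           // ensure an up-to-date replica of the object
            int b = this.balance;                  // original getfield
            DsmRuntime.beforeWrite(this);          // record a write notice for the DSM protocol
            this.balance = b + amount;             // original putfield
        } finally {
            DsmRuntime.releaseLock(this);          // replaces monitorexit
        }
    }
}

// Hypothetical runtime facade, standing in for the DSM / checkpointing / speculation modules.
class DsmRuntime {
    static void acquireLock(Object o)  { /* contact the lock's home node, apply write notices */ }
    static void releaseLock(Object o)  { /* publish write notices created in this critical section */ }
    static void beforeRead(Object o)   { /* fault in a fresh replica if the local copy is stale */ }
    static void beforeWrite(Object o)  { /* mark the object dirty for later propagation */ }
}
```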
Fault Tolerance – Piggybacking on the DSM/JMM Protocol
• Objects and thread snapshots are replicated
  • In volatile memory
  • No special stable-storage hardware is needed
• Participating nodes are divided into groups
  • Each group maintains a set of threads and objects despite failures
• High availability of computation
  • Non-failing nodes continue computation during failures and recovery
• Target benchmark: Java Business Benchmark
No speculation
• Acquirer sends Acquire(L) to the lock’s home node and waits for the reply
• Home forwards the request to the current owner
• Owner releases the lock and sends the write notices
• L is transferred with the write notices; the reply is received, the write notices are applied, and the acquire completes successfully
Successful speculation
• Acquirer sends Acquire(L) and immediately continues running speculatively
• Home forwards the request to the owner; the owner releases the lock and sends the write notices
• L is transferred with the write notices; the speculative acquire completes successfully (no rollback)
• A round-trip delay is saved for each lock acquisition that follows this scenario
Rollback
• Acquirer sends Acquire(L) and immediately continues running speculatively
• Home forwards the request to the owner; the owner releases the lock and sends the write notices
• L is transferred with the write notices; a read-write conflict occurred, so the acquirer rolls back
• When a conflict occurs, a rollback is required; the snapshot used is the already-taken snapshot of the fault tolerance feature
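A hedged sketch of the speculative-acquire-with-rollback idea described in the three scenarios above: run the critical section before the reply arrives, and roll back to the snapshot already taken for fault tolerance if the incoming write notices conflict with what was read speculatively. All interfaces and names here are illustrative, not JavaSplit's actual interfaces.

```java
// Illustrative speculation-with-rollback flow (simplified: the whole section runs before the reply).
import java.util.Set;

class SpeculativeAcquire {
    interface SpecRuntime {
        void sendAcquire(Object lock);                 // asynchronous request to the lock's home
        WriteNotices awaitReply(Object lock);          // blocks until the lock + write notices arrive
        Snapshot latestCheckpoint();                   // snapshot already taken for fault tolerance
        void restore(Snapshot s);                      // roll the thread back to that snapshot
        Set<Object> objectsReadSince(Snapshot s);      // objects read speculatively since the snapshot
    }
    interface WriteNotices { Set<Object> modifiedObjects(); void apply(); }
    interface Snapshot {}

    static void acquire(SpecRuntime rt, Object lock, Runnable criticalSection) {
        Snapshot safePoint = rt.latestCheckpoint();
        rt.sendAcquire(lock);
        criticalSection.run();                         // run speculatively, before the reply arrives
        WriteNotices notices = rt.awaitReply(lock);
        boolean conflict = !java.util.Collections.disjoint(
                notices.modifiedObjects(), rt.objectsReadSince(safePoint));
        if (conflict) {                                // read-write conflict: speculation was wrong
            rt.restore(safePoint);
            notices.apply();                           // apply the updates, then re-execute normally
            criticalSection.run();
        } else {
            notices.apply();                           // speculation succeeded: a round trip was hidden
        }
    }
}
```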
DataMiningGrid (EC FP6)
Collaboration with: U. Ulster, Fraunhofer, DaimlerChrysler, U. Ljubljana
Status: Packaged, open source – http://sourceforge.net/projects/datamininggrid/
Architecture
Simple Scenario
Web Application Enabler
Generic Job Template
Scope of Application
• The DataMiningGrid system is able to accommodate data mining applications from a wide range of platforms, technologies, application domains and sectors. In contrast, Weka4WS, for example, is restricted to Weka data mining applications.
SOA
• The DataMiningGrid system is designed around SOA principles. The system implements many independent services, each capable of carrying out a set of predefined tasks. The integration of all these services into one system, with a variety of general and designated clients, results in a highly modular, reusable, interoperable, scalable and maintainable system.
QosCosGrid (EC FP6 project)
Quasi-Opportunistic Supercomputing for Complex Systems in Grid Environments
Publications: Submitted to ICPP’07
Status: 2.5-year project, started Q4 2006
QosCosGrid
Objectives
• Exploit available/“opportunistic” grid resources and provide a service computationally equivalent to a supercomputer
• Provide the means for users to develop complex-systems applications/simulations with supercomputing requirements
• With the functionality for end-to-end provisioning, the simulation will perform better than under a pure opportunistic grid approach
Technion roles
• Develop an economic model to motivate resource sharing
  • Prove it to be stable even under irrational behaviour
• Develop negotiation, SLA, and reservation infrastructure
• Develop allocation/scheduling algorithms (see the sketch after this list) that
  • match the multiplicity of resources
  • take bandwidth requirements into account
  • take other requirements into account
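As a purely illustrative sketch of the kind of constraint such allocation algorithms must satisfy (not QosCosGrid's actual scheduler), the following checks whether a chosen set of clusters can co-host a job that needs several process groups plus a minimum inter-group bandwidth; all names and fields are invented for the example.

```java
// Invented co-allocation feasibility check: enough processors per group, enough bandwidth
// between groups placed on different clusters.
class CoAllocationSketch {
    static class Cluster {
        final int freeCpus;
        final double uplinkMbps;             // bandwidth available towards the other clusters
        Cluster(int freeCpus, double uplinkMbps) { this.freeCpus = freeCpus; this.uplinkMbps = uplinkMbps; }
    }

    // groupSizes[g] = CPUs required by process group g; minInterGroupMbps = bandwidth needed
    // between any two groups placed on different clusters.
    static boolean feasible(Cluster[] chosen, int[] groupSizes, double minInterGroupMbps) {
        if (chosen.length != groupSizes.length) return false;
        for (int g = 0; g < groupSizes.length; g++) {
            if (chosen[g].freeCpus < groupSizes[g]) return false;          // not enough processors
            if (chosen.length > 1 && chosen[g].uplinkMbps < minInterGroupMbps)
                return false;                                              // link too slow for the coupling
        }
        return true;
    }
}
```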
Grid Requirements
• Technology facilitating dynamic selection of the available grid resources suitable for complex-systems simulations, encompassing the large variety of domains in which complex systems are researched
• Functionality such as resource reservation, synchronization, and communication routing (end-to-end provisioning)
• A highly dependable grid architecture capable of tolerating system instabilities and of handling non-dedicated resources in the context of complex-systems simulations
Complex Systems are Dynamic
• A system is complex if it is characterized by multiple interactions between many different components
• Complex systems constantly evolve in time according to a set of rules
• Their evolution may be highly sensitive to the initial conditions and to small perturbations
• The rules are usually non-linear, making the systems difficult to understand and verify
Key notions: emergent behaviour, self-organisation, evolution, adaptation
QosCosGrid (Figure: participating clusters)
Scalable Distributed Model Checking [CAV’00, CAV’01, CAV’03, FMCAD’04, CHARME’05, ATVA’05 Best Paper Award]
Production tool at Intel