Data Mining in Ubiquitous Distributed Environments

Presentation Transcript


  1. Data Mining in Ubiquitous Distributed Environments. Assaf Schuster, Technion. SEBD Tutorial, June 06

  2. Purpose of this Tutorial
     • Convergence of distributed systems and data mining
     • Evolving field, no systematic coverage of all aspects
     • Will present: issues, challenges, examples of algorithmic approaches, ideas, accuracy vs. overhead tradeoffs
     • Will not present: formal treatment, proofs, details, technology, systems, hardware…

  3. Ubiquitous Computing Systems
     • Various systems: Grid, P2P, WSN, MANET
     • Several similar technological aspects
     • Scale, aim for at least 10K (10M in P2P)
     • partial failure, heterogeneity, dynamic state / data
     • Multi-user, a 10K system serves >= 1K users
     • resource sharing, caching, consistency
     • Lots of distributed data
     • streams, incremental, anytime, local filtering, locality filtering
     • Cooperation of self-motivated parties
     • trust management, security, privacy, competitive market, self vs. global optimizations
     • Stringently resource-limited
     • in-network computing, storage distribution
     • Non-similar technological aspects

  4. Ubiquitous Data Mining
     • For the community
     • E.g., P2P recommendations based on e-interaction
     • For security
     • E.g., identify and avert DoS attacks (Overpeer and P2P poisoning)
     • For administration
     • E.g., misconfiguration detection system (DataMiningGrid demo)
     • For data cleansing
     • E.g., in-network outlier detection (and removal) in WSN
     • DM using HPC
     • E.g., idle-cycle batch systems for high-complexity analysis tasks (Superlink-Online)

  5. Technological Challenges: Algorithms
     • Scalable and resource-limited distributed DM
     • Algorithms for 10K peers, algorithms limited to two messages per peer per hour, synchronization-less, iteration-less, bag-of-tasks, dynamic divisibility, etc.
     • Monitoring
     • Distributed, local filtering
     • Success, correctness, and consistency
     • Partial failure, message dropping, heterogeneity, etc. can cause all sorts of trouble
     • Reusability, incrementality
     • E.g., multi-class classifiers, multi-metric k-means clustering, etc.

  6. Technological Challenges: Systems
     • Exploitation & HCI
     • Lay user (parameterless) DM, interactive DM
     • DM-based autonomous ubiquitous systems
     • Security, fraud, and privacy
     • Authorization, public-key infrastructure, trust management, data pollution
     • Longevity of DM jobs
     • Resource sharing, non-dedicated resources
     • Communication patterns
     • Esp. reliability and addressability. Are these problems best solved by suitable algorithms?
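
Slide 5 lists "distributed, local filtering" under Monitoring without further detail. As a rough illustration only, the sketch below shows one common way such filtering can work; it is not taken from the tutorial. It assumes the monitored condition is "the sum of all peers' values exceeds a global threshold". Each peer holds a local limit and stays silent while its value is within that limit; only a local violation triggers communication, at which point a coordinator polls every peer and redistributes the remaining slack. The class name `ThresholdMonitor`, the coordinator role, and the message accounting are all hypothetical, and the whole system is simulated in a single process.

```python
class ThresholdMonitor:
    """Single-process simulation of local filtering (illustrative assumption).

    Monitors whether sum(values) > global_threshold while letting peers
    stay silent as long as their value is within a local limit.
    """

    def __init__(self, n_peers, global_threshold):
        self.values = [0.0] * n_peers      # values[i] conceptually lives on peer i
        self.limits = [0.0] * n_peers      # limits[i] is peer i's local limit
        self.global_threshold = global_threshold
        self.alert = False                 # last known global state
        self.messages = 0
        self._rebalance()
        self.messages = 0                  # do not count the initial setup

    def _rebalance(self):
        """Coordinator polls all peers and hands out new local limits."""
        n = len(self.values)
        total = sum(self.values)
        self.messages += 2 * n             # one poll reply + one new limit per peer
        self.alert = total > self.global_threshold
        if self.alert:
            # No slack left: every later update must be reported.
            self.limits = [float("-inf")] * n
        else:
            # Split the remaining slack evenly. The limits sum to the global
            # threshold, so "every peer within its limit" implies no alert.
            slack = (self.global_threshold - total) / n
            self.limits = [v + slack for v in self.values]

    def update(self, peer, value):
        """Apply a local update on one peer; return the known global state."""
        self.values[peer] = value
        if value > self.limits[peer]:      # local violation: must communicate
            self._rebalance()
        return self.alert                  # otherwise the peer stays silent


monitor = ThresholdMonitor(n_peers=4, global_threshold=100.0)
monitor.update(0, 10.0)   # within the local limit, no messages sent
monitor.update(2, 30.0)   # exceeds the local limit, triggers one poll round
```

The design choice here is to trade accuracy of the coordinator's view for communication: between polls the coordinator knows only that the sum is at most the threshold, not its exact value. While the alert is active this sketch falls back to reporting every update, which is simple but not communication-efficient.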
