Navigating and Sharing in a Decentralized World
Francisco Matias Cuenca-Acuna
http://www.panic-lab.rutgers.edu/

People
• Graduate students
  • Christopher Peery
  • Konstantinos Kleisouris
• Faculty
  • Richard Martin
  • Thu Nguyen

Federated computing
• The current trend toward ubiquitous Internet connectivity is driving a new model of federated computing
  • Computing systems that are geographically distributed and may span multiple organizations
• Concurrently, deep penetration of computer usage
  • 500 million PCs in operation worldwide (1 for every 12 people)
  • 80% of them are desktops
  • 40% annual growth
  • 600 million Internet users worldwide
• Federated computing is appearing at every level
  • Social group-based sharing
    • P2P: Gnutella, KaZaA, DirectConnect
    • Web-based: eBay, Google Groups, Yahoo Groups, DMOZ
  • Scientific computing
    • Many emerging research grids: http://www.gpds.org/
  • Business-to-business e-commerce
Source: http://news.com.com/2100-1040-940713.html

The challenge
• Federated computing provides the opportunity to harness vast amounts of resources
• Consider just data sharing
  • Users produce 740TB of information per year
  • Information per person is growing continuously
  • 80% annual growth in total disk capacity sold per year
  • Emergence of huge distributed data repositories
    • Local community of 3000 undergraduate students sharing 20TB of data
    • WWW: Google had indexed 1 billion pages (20TB of content) by 2000
    • The European Data Grid has only hundreds of nodes but petabytes of data
• Challenge: how to manage and actually use these resources
  • Decentralized control
  • Widely distributed
  • Heterogeneous components
Source: http://www.sims.berkeley.edu/research/projects/how-much-info/

The PlanetP Project
• Information and resource management for networked communities
• Data sharing
  • Provide content-based access & ranking of results
  • Allow users to cooperatively organize data
  • Provide predictable data availability
• Deployment, monitoring, and management of federated services
  • Provide a common runtime environment
  • Follow sysadmin guidelines for service deployment in a distributed fashion
  • Example: UDDI naming service for web services
• Dealing with decentralization
  • Self-management & self-configuration
  • Autonomous cooperation
  • Loosely synchronized global information
  • Randomized algorithms

Current state of the project
• Service management
• Global namespace and storage management
• Automatic replication for availability
• Content indexing and ranking
• Data propagation

Current state of the project: Data propagation
• Based on epidemic communication
  • Very resilient to node/network failures
• Membership management
  • Every node has a loosely synchronized view of the community

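To illustrate why epidemic communication tolerates failures and scales well, here is a minimal sketch of push-style gossip. It is a generic simulation, not PlanetP's actual protocol: the function name `gossip_rounds`, the uniform random peer choice, and the synchronous rounds are all assumptions made for illustration.

```python
import random


def gossip_rounds(num_nodes, seed=0, max_rounds=1000):
    """Simulate push gossip (illustrative model, not PlanetP's protocol).

    Node 0 starts with an update. In every round, each informed node
    pushes the update to one peer chosen uniformly at random.
    Returns (rounds_used, nodes_informed).
    """
    rng = random.Random(seed)
    informed = {0}
    rounds = 0
    while len(informed) < num_nodes and rounds < max_rounds:
        newly_informed = set()
        for node in informed:
            peer = rng.randrange(num_nodes)
            if peer != node:
                newly_informed.add(peer)
        informed |= newly_informed
        rounds += 1
    return rounds, len(informed)


if __name__ == "__main__":
    for n in (10, 100, 1000):
        rounds, reached = gossip_rounds(n)
        print(f"{n} nodes: {reached} informed after {rounds} rounds")
```

Under these assumptions the number of rounds grows roughly with the logarithm of the community size, and no single node is critical to spreading the update.
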
Current state of the project: Content indexing and ranking
• Distributed information ranking algorithm
  • Allows search-engine-like queries
  • 2-step search & rank to deal with outdated information

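The two-step idea can be sketched as follows: first rank peers using only the loosely synchronized (possibly stale) term summaries, then ask the best peers to rank their current documents. This is a hedged sketch; the `Peer` class, the `search` function, the TF x IDF-style weights, and the cutoffs `top_peers` and `top_docs` are hypothetical and not taken from the slides.

```python
import math
from collections import Counter


class Peer:
    """Hypothetical community member holding a few text documents."""

    def __init__(self, name, docs):
        self.name = name
        self.docs = docs  # doc id -> text

    def summary(self):
        """Terms this peer holds; stands in for what would be gossiped."""
        return {t for text in self.docs.values() for t in text.lower().split()}

    def local_search(self, terms, weights):
        """Step 2: score the peer's current documents."""
        hits = []
        for doc_id, text in self.docs.items():
            tf = Counter(text.lower().split())
            score = sum(tf[t] * weights.get(t, 0.0) for t in terms)
            if score > 0:
                hits.append((score, self.name, doc_id))
        return hits


def search(query, peers, summaries, top_peers=2, top_docs=3):
    terms = query.lower().split()
    # Terms advertised by fewer peers get a higher weight.
    weights = {}
    for t in terms:
        holders = sum(1 for s in summaries.values() if t in s)
        if holders:
            weights[t] = math.log(1 + len(peers) / holders)

    def peer_score(p):
        return sum(w for t, w in weights.items() if t in summaries[p.name])

    # Step 1: rank peers from the summaries alone (may be outdated).
    ranked = sorted(peers, key=lambda p: -peer_score(p))
    candidates = [p for p in ranked[:top_peers] if peer_score(p) > 0]
    # Step 2: contacted peers answer from their current local data.
    hits = []
    for p in candidates:
        hits.extend(p.local_search(terms, weights))
    hits.sort(key=lambda h: (-h[0], h[1], h[2]))
    return [(name, doc_id) for _, name, doc_id in hits[:top_docs]]
```

Because the second step consults each peer's live data, a document that was deleted after the summaries were exchanged simply drops out of the results instead of being returned as a dead hit.
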
Current state of the project: Automatic replication for availability
• Allow users to specify data availability
• Present a probabilistic availability model
• Monitor availability as the community changes

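One simple way to reason about a probabilistic availability model is sketched below. It assumes that peers are online independently of each other, which is an assumption of this sketch and not something the slides state; the function names `availability` and `replicas_needed` are hypothetical.

```python
def availability(peer_uptimes):
    """Probability that at least one replica holder is online.

    peer_uptimes: fraction of time each replica holder is online.
    Assumes peers go offline independently of each other.
    """
    p_all_offline = 1.0
    for p in peer_uptimes:
        p_all_offline *= 1.0 - p
    return 1.0 - p_all_offline


def replicas_needed(uptime, target):
    """Smallest k such that 1 - (1 - uptime)**k >= target.

    Assumes every replica holder has the same uptime.
    """
    if not 0.0 < uptime < 1.0:
        raise ValueError("uptime must be strictly between 0 and 1")
    if not 0.0 < target < 1.0:
        raise ValueError("target must be strictly between 0 and 1")
    k = 1
    while 1.0 - (1.0 - uptime) ** k < target:
        k += 1
    return k


if __name__ == "__main__":
    print(availability([0.5, 0.5]))
    print(replicas_needed(0.5, 0.99))
```

Under this model, a user-specified availability target translates directly into a replica count, and the count can be recomputed as observed peer uptimes change.
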
Work in progress: Global namespace and storage management
• File system interface over communal content
  • Unlike the Web, the namespace is writeable
  • Dynamic namespace management
• Automated local storage management
  • Remove content if we can recover it
  • Hoarding for disconnected operation

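The rule "remove content if we can recover it" could be approximated as: drop a local copy only when the remaining remote replicas still meet an availability target. The sketch below assumes independent peers with identical uptime; the function `evictable` and its parameters are hypothetical, not part of the project described here.

```python
def evictable(remote_replicas, uptime, target):
    """Return the local files that are safe to delete, sorted by name.

    remote_replicas: file name -> number of copies held by other peers.
    A file qualifies if those copies alone keep its availability at or
    above `target`, assuming independent peers with the same `uptime`.
    """
    safe = []
    for name, copies in remote_replicas.items():
        remaining = 1.0 - (1.0 - uptime) ** copies
        if remaining >= target:
            safe.append(name)
    return sorted(safe)


if __name__ == "__main__":
    files = {"a.pdf": 8, "b.txt": 2, "c.mp3": 0}
    print(evictable(files, uptime=0.5, target=0.99))
```

A file with no remote copies is never evicted under this rule, since its availability without the local copy would be zero.
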
Work in progress: Service management
• Distributed runtime for Web Services
  • Administrators just dictate the policy
  • They reason about
    • capacity
    • availability
    • privacy issues
  • Provide self-deployment and monitoring

The PlanetP Project
http://www.panic-lab.rutgers.edu/
Questions?