Highly Distributed Parallel Computing
Neil Skrypuch
COSC 3P93
3/21/2007
Overview
• a network of computers all working towards a similar goal
• network consists of many nodes, few servers
• nodes perform computing and send results to a server
• servers distribute jobs
• node machines do not communicate with each other
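A minimal sketch of the job flow these bullets describe, simulated in a single process; the names (Server, node, submit_result) are hypothetical, and a real deployment would talk to the server over the network rather than through a shared object:

```python
# Minimal sketch of the HDPC job flow: a central server hands out
# independent jobs, nodes compute them and send results back.
# Everything here runs in-process; names are illustrative only.

from queue import Queue

class Server:
    """Central server: distributes jobs and collects results."""
    def __init__(self, jobs):
        self.pending = Queue()
        for job in jobs:
            self.pending.put(job)
        self.results = {}

    def get_job(self):
        return None if self.pending.empty() else self.pending.get()

    def submit_result(self, job, result):
        self.results[job] = result

def node(server, compute):
    """A node: fetch a job, compute, report back; never talks to other nodes."""
    while (job := server.get_job()) is not None:
        server.submit_result(job, compute(job))

server = Server(jobs=range(10))
node(server, compute=lambda n: n * n)   # in practice, many nodes run in parallel
print(server.results)
```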
Relatively Simple
• don't need to worry about special interconnections
• don't need to worry about cluster booting

Non-Homogeneous Network
• can work across different computer architectures, OSes, etc.
• computers can be of varying speeds
• doesn't require the fastest or most expensive computers
• computers can be distributed anywhere in the world

Infrastructure
• infrastructure for HDPC already exists almost everywhere
• anyone with a network of computers is already ready for HDPC
• lots of programs already exist that take advantage of HDPC

Expansion
• expansion is painless
• there are no special constraints on the “shape” of the network
• not fast enough yet? keep adding more computers until it is
Resilience to Failure
• it doesn't matter if one or more nodes die
• only the reliability of the central server(s) matters

Suitability
• not all problems are suited to HDPC
• highly communication-bound problems are a poor fit for HDPC

Server Dependence
• central server dependence is a double-edged sword
• if the central server becomes unavailable, everything grinds to a halt

Network (In)security
• how do we verify that a client should be allowed to join the network?
• protecting data sent over the network
• verifying integrity and authenticity of data sent over the network

Network (Un)reliability
• nodes that temporarily lose connectivity are temporarily useless

Server Dependence
• the central server need not be a single server
• the server itself may be clustered
• there are countless ways to cluster servers
Clustering With a Database
• allow nodes to talk directly to the database
• cluster the database over multiple servers
• multi-master replication
• single master replication
• lots more...
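A rough sketch of nodes claiming jobs straight from a shared jobs table. SQLite stands in for the clustered database here, and the table layout plus the claim/complete helpers are assumptions; a real multi-server database would typically guard the claim with something like SELECT ... FOR UPDATE:

```python
# Sketch of nodes talking directly to a (possibly replicated) database.
# SQLite is only a stand-in; the claim/complete pattern is the point.

import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE jobs (id INTEGER PRIMARY KEY, payload TEXT, "
           "state TEXT DEFAULT 'pending', result TEXT)")
db.executemany("INSERT INTO jobs (payload) VALUES (?)",
               [(str(n),) for n in range(5)])

def claim_job(conn):
    """Mark one pending job as running and return it (single transaction).
    A clustered database would lock the row, e.g. SELECT ... FOR UPDATE."""
    with conn:
        row = conn.execute(
            "SELECT id, payload FROM jobs WHERE state = 'pending' LIMIT 1"
        ).fetchone()
        if row is None:
            return None
        conn.execute("UPDATE jobs SET state = 'running' WHERE id = ?", (row[0],))
        return row

def complete_job(conn, job_id, result):
    with conn:
        conn.execute("UPDATE jobs SET state = 'done', result = ? WHERE id = ?",
                     (result, job_id))

while (job := claim_job(db)) is not None:
    complete_job(db, job[0], str(int(job[1]) ** 2))   # trivial stand-in computation
```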
Server Hierarchy
• multiple tiers of servers may also be used
• could be considered recursive HDPC
• very similar to the tree architecture of supercomputers
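A toy illustration of the recursive idea, assuming a job that can simply be split in half at each tier; the splitting rule and tier count are made up. The point is that a mid-tier server behaves like a node to its parent and like a server to its children:

```python
# Sketch of recursive HDPC: each tier splits its job among children,
# leaves do the actual work, and results are merged on the way back up.

def solve(job, tiers):
    """Split a job down the server tree, compute at the leaves, merge upward."""
    if tiers == 0:                        # leaf node: do the actual work
        return sum(job)
    half = len(job) // 2
    left, right = job[:half], job[half:]  # mid-tier server splits the job
    return solve(left, tiers - 1) + solve(right, tiers - 1)

print(solve(list(range(1000)), tiers=3))  # 3 tiers of servers above the nodes
```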
Lost Nodes
• define a maximum amount of time to wait for a node's response
• use redundancy
• assume some nodes will always be lost
• send duplicate jobs to multiple nodes simultaneously
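A sketch of the redundancy idea with a made-up failure rate and redundancy factor; the per-job timeout from the first bullet is collapsed into a simple reissue loop here:

```python
# Sketch of coping with lost nodes: send each job to several nodes at once,
# keep the first answer, and reissue the job if every copy is lost.

import random

REDUNDANCY = 2   # how many nodes get a copy of each job (made-up value)

def flaky_node(job):
    """A node that sometimes silently disappears instead of answering."""
    return None if random.random() < 0.3 else job * job

def run(jobs):
    results = {}
    pending = set(jobs)
    while pending:                        # keep reissuing until everything is in
        for job in list(pending):
            answers = [flaky_node(job) for _ in range(REDUNDANCY)]
            first = next((a for a in answers if a is not None), None)
            if first is not None:
                results[job] = first      # first good answer wins
                pending.discard(job)
            # else: every copy was lost; in a real system the reissue
            # would happen only after the response timeout expires
    return results

print(run(range(5)))
```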
Network (In)security
• not as big an issue as one might think
• encryption and public key infrastructures mitigate most confidentiality and authenticity concerns
• redundancy is useful for both reliability and security
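One way the integrity/authenticity point could look in practice, sketched with a shared-key HMAC from the Python standard library; an actual deployment would more likely layer TLS plus per-node keys or a full PKI on top:

```python
# Sketch of result authentication: each result carries a message
# authentication code so the server can reject tampered or forged results.

import hashlib
import hmac

SHARED_KEY = b"not-a-real-key"   # hypothetical pre-shared key, for illustration only

def sign_result(job_id: int, result: str) -> str:
    msg = f"{job_id}:{result}".encode()
    return hmac.new(SHARED_KEY, msg, hashlib.sha256).hexdigest()

def verify_result(job_id: int, result: str, tag: str) -> bool:
    expected = sign_result(job_id, result)
    return hmac.compare_digest(expected, tag)   # constant-time comparison

tag = sign_result(42, "3.14159")
print(verify_result(42, "3.14159", tag))   # True
print(verify_result(42, "2.71828", tag))   # False: tampered result rejected
```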
Work Buffering
• taking larger portions of work at a time
• temporary connectivity issues pose less of a problem this way
• a node can continue working for longer without talking to a central server
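A sketch of work buffering with a made-up batch size: the node grabs a whole batch in one round trip and can keep computing even if it briefly loses the server:

```python
# Sketch of work buffering: fetch many jobs per request, report back rarely.

from queue import Queue

BATCH_SIZE = 50   # larger batches mean fewer round trips to the server

def fetch_batch(server_queue, n=BATCH_SIZE):
    """One request's worth of work, pulled from the server's queue."""
    batch = []
    while len(batch) < n and not server_queue.empty():
        batch.append(server_queue.get())
    return batch

server_queue = Queue()
for job in range(200):
    server_queue.put(job)

results = []
while (batch := fetch_batch(server_queue)):
    # the node can now lose connectivity and keep working through the batch
    results.extend(job * job for job in batch)
print(len(results), "results, fetched in batches of", BATCH_SIZE)
```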
Combinatorics
• search
• enumeration
• generation

Cryptography
• brute force cipher cracking
• gives a glimpse of the future, in terms of what the average person will be able to crack

Artificial Intelligence
• genetic algorithms
• genetic programming
• alpha-beta search

Graphics
• ray tracing
• animation
• fractal generation and calculation

Simulation
• weather and climate modeling
• particle physics
Guidelines for Suitability
• most problems involving a large search tree are well suited to HDPC
• anything that can be broken down into smaller, self-contained chunks is a good candidate for HDPC
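A sketch of the "self-contained chunks" guideline, using a toy brute-force key search; the key size, chunk count, and trivial match test are all stand-ins:

```python
# Sketch of chunking a search space: split a brute-force key search into
# independent, non-overlapping ranges, each of which could go to a node.

KEY_BITS = 20    # toy key space for illustration
CHUNKS = 8

def chunk_ranges(key_bits=KEY_BITS, chunks=CHUNKS):
    """Split the key space into equal, non-overlapping ranges."""
    total = 1 << key_bits
    step = total // chunks
    return [(i * step, min((i + 1) * step, total)) for i in range(chunks)]

def search_chunk(lo, hi, target):
    """A single node's job: scan one range, no communication needed.
    The equality test stands in for a real 'does this key decrypt?' check."""
    return next((k for k in range(lo, hi) if k == target), None)

target = 123456
hits = [search_chunk(lo, hi, target) for lo, hi in chunk_ranges()]
print([h for h in hits if h is not None])   # [123456]
```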
Folding@Home
• ~200,000 non-dedicated nodes
• 240 TFLOPS
• approximately 40 central servers, unknown speeds

SETI@Home
• ~200,000 non-dedicated nodes
• 288 TFLOPS
• 10 central servers, all relatively modest

Blue Gene/L
• currently the fastest supercomputer
• not HDPC
• 65,536 dedicated nodes
• 280 TFLOPS
• cost about $100,000,000 US
HDPC Works Well
• typical speedup is close to linear
• cost is substantially less than a comparable supercomputer
• nodes can also be general-purpose computers
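One rough way to see why the speedup stays close to linear: when jobs are independent, the only serial part is server-side bookkeeping, so an Amdahl's-law estimate with a tiny (made-up) serial fraction stays near N until N gets very large:

```python
# Back-of-the-envelope check on "speedup is close to linear".
# The serial fraction below is an assumed value, not a measured one.

def speedup(n_nodes, serial_fraction=1e-6):
    """Amdahl's law: 1 / (s + (1 - s) / N)."""
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / n_nodes)

for n in (10, 100, 1000, 10000, 100000):
    print(f"{n:>6} nodes -> speedup ~ {speedup(n):,.0f}")
```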
Infrastructure Reuse
• in general, new hardware investments are not necessary
• creating new infrastructure is expensive and time consuming
• it's easy to justify using things you already have for additional purposes
• there are tons of idle CPUs at any given time, why not use them?

Low Barrier to Entry
• anyone with a couple of networked computers can start experimenting

Painlessly Scalable
• smooth curve upwards for both cost and performance
Simpler to Program
• doesn't require as much “thinking in parallel” as other approaches
• thinking in parallel is hard and fundamentally different from thinking serially
• pushes the heavy lifting onto the database instead of the application programmer

Commodity Hardware is Fast
• a typical desktop machine today is more powerful than a supercomputer from 15 years ago
• and costs orders of magnitude less
• and outputs much less heat
• and takes up much less space
• and consumes much less power

The Future
• supercomputers will become faster
• HDPC will become even faster than supercomputers
• as both the number of computers and their speed increase
• both supercomputers and HDPC will fill their own separate niches
References
• http://fah-web.stanford.edu/cgi-bin/main.py?qtype=osstats
• http://www.boincstats.com/stats/project_graph.php?pr=sah
• http://www.boincstats.com/stats/project_graph.php?pr=bo
• http://www.itjungle.com/tlb/tlb033004-story04.html
• http://setiathome.berkeley.edu/sah_status.html
• http://fah-web.stanford.edu/serverstat.html
• http://top500.org/list/2006/11/100