1. COMP4300/COMP6430 Parallel Systems 2011 Richard Brent and Alistair Rendell
School of Computer Science
Australian National University
2. Concept and Rationale The idea
Split your program into bits that can be executed simultaneously
Motivation
Speed, Speed, Speed… at a cost-effective price
If we didn’t want it to go faster we would not be bothered with the hassles of parallel programming!
Reduce the time to solution to acceptable levels
No point waiting 1 week for tomorrow’s weather forecast
Simulations that take months to run are not useful in a design environment
3. Sample Application Areas Fluid flow problems
Weather forecasting/climate modeling
Aerodynamic modeling of cars, planes, rockets etc
Structural Mechanics
Strength analysis of buildings, bridges, cars, etc.
Car crash simulation
Speech and character recognition, image processing
Visualization, virtual reality
Semiconductor design, simulation of new chips
Structural biology, molecular level design of drugs
Human genome mapping
Financial market analysis and simulation
Data mining, machine learning
Games programming!
4. World Climate Modeling Atmosphere divided into 3D regions or cells
Complex mathematical equations describe conditions in each cell, e.g. pressure, temperature, velocity
Conditions in a cell change according to its neighbouring cells
Updates are repeated frequently as time passes
The longer the range of the forecast, the more distant the cells that affect each cell
Assume
Cells are 1 x 1 x 1 mile to a height of 10 miles, 5 x 10^8 cells
200 flops to update each cell per timestep
10 minute timesteps for total of 10 days
Roughly 100 days on a 100 Mflop/s machine
Roughly 10 minutes on a Tflop/s machine
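The estimate above can be checked with a few lines of arithmetic. The sketch below is only a back-of-envelope calculation; the function and variable names are illustrative and not part of the course material. Taken literally, 10-minute steps over 10 days give 1440 timesteps, which works out to roughly 17 days at 100 Mflop/s. The round figures on the slide are closer to what about 10^4 timesteps would give. Either way the point stands: a machine several orders of magnitude faster turns an unusable forecast into a usable one.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def total_flops(cells, flops_per_cell, timesteps):
    """Total floating-point operations for the whole forecast."""
    return cells * flops_per_cell * timesteps


def seconds_to_solution(flops, flops_per_second):
    """Run time assuming the machine sustains its full rate."""
    return flops / flops_per_second


cells = 5e8            # 1 x 1 x 1 mile cells up to 10 miles high
flops_per_cell = 200   # per cell, per timestep

# Two step counts: the literal one (10 days of 10-minute steps)
# and a round 10^4 steps, used here only for comparison.
literal_steps = 10 * 24 * 60 // 10
rounded_steps = 10_000

for label, steps in [("literal", literal_steps), ("rounded", rounded_steps)]:
    work = total_flops(cells, flops_per_cell, steps)
    days = seconds_to_solution(work, 100e6) / SECONDS_PER_DAY
    minutes = seconds_to_solution(work, 1e12) / 60
    print(f"{label}: {steps} steps, {work:.2e} flops, "
          f"{days:.1f} days at 100 Mflop/s, {minutes:.1f} min at 1 Tflop/s")
```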
5. ParallelSystems@ANU: NCI/NF NCI: National Computational Infrastructure/National Facility
http://nci.org.au
http://nf.nci.org.au
History
Establishment of APAC in 1998 with $19.5M grant from federal government, renewed in 2004 with a grant of about $29M
programs in grid services, education and technology diffusion
ANU currently hosts a 1492-node Sun X6275 Constellation Cluster; each node has two quad-core 2.93 GHz Intel Nehalem CPUs, giving a total of 11936 cores. The interconnect is QDR InfiniBand
6. ParallelSystems@DCS Bunyip: tsg.anu.edu.au/Projects/Bunyip
192 processor PC Cluster
winner of 2000 Gordon Bell prize for best price performance
9. Parallelisation Split program up and run parts simultaneously on different processors
On N computers the time to solution should (ideally!) be 1/N of the time on one computer
Parallel Programming: the art of writing the parallel code!
Parallel Computer: the hardware on which we run our parallel code!
COMP4300 will discuss both
Beyond raw compute power other motivations include
Enabling more accurate simulations in the same time (finer grids)
Providing access to huge aggregate memories
Providing more and/or better input/output capacity
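As a minimal sketch of "split the program up and run the parts simultaneously", the example below divides a sum of squares into near-equal chunks and hands each chunk to a separate worker process. It uses Python's standard `concurrent.futures` module purely for illustration; the function names and the choice of problem are assumptions, not the course's own code. In practice the speedup falls short of the ideal factor of N because of process start-up and communication costs.

```python
from concurrent.futures import ProcessPoolExecutor


def chunk_bounds(n, parts):
    """Split range(n) into `parts` contiguous [start, stop) pieces of near-equal size."""
    base, extra = divmod(n, parts)
    bounds, start = [], 0
    for i in range(parts):
        stop = start + base + (1 if i < extra else 0)
        bounds.append((start, stop))
        start = stop
    return bounds


def partial_sum(bounds):
    """Work done by one worker: sum of squares over its own chunk."""
    start, stop = bounds
    return sum(i * i for i in range(start, stop))


def parallel_sum_of_squares(n, workers):
    """Run one chunk per worker process, then combine the partial results."""
    with ProcessPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(partial_sum, chunk_bounds(n, workers)))


if __name__ == "__main__":
    print(parallel_sum_of_squares(1_000_000, 4))
```

The chunks share no data, so the only coordination needed is the final sum of the partial results. This is the easy case; much of parallel programming is about problems where the parts do need to communicate.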
10. Parallelism in a Single “CPU” Box Multiple instruction units:
Typical processors issue ~4 instructions per cycle
Instruction Pipelining:
Complicated operations are broken into simple operations that can be overlapped
Graphics Engines:
Use multiple rendering pipes and processing elements to render millions of polygons a second
Interleaved Memory:
Multiple paths to memory that can be used at the same time
Input/Output:
Disks are striped, with different blocks of data written to different disks at the same time
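Disk striping can be sketched as a simple round-robin mapping from logical block number to disk. The example below assumes the simplest possible layout (block i goes to disk i mod D); real storage systems use larger stripe units and may add redundancy. The function names are illustrative only. The same modulo mapping is also the idea behind low-order interleaved memory, with memory banks in place of disks.

```python
def stripe_location(block, num_disks):
    """Return (disk, offset_on_disk) for a logical block under round-robin striping."""
    return block % num_disks, block // num_disks


def blocks_per_disk(num_blocks, num_disks):
    """List which logical blocks end up on each disk."""
    layout = [[] for _ in range(num_disks)]
    for block in range(num_blocks):
        disk, _ = stripe_location(block, num_disks)
        layout[disk].append(block)
    return layout


# Consecutive blocks land on different disks, so a sequential
# read or write of several blocks can proceed on all disks at once.
for disk, blocks in enumerate(blocks_per_disk(8, 4)):
    print(f"disk {disk}: blocks {blocks}")
```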
11. Big Parallel Systems! www.top500.org
12. Health Warning! Course is run every other year
Drop out this year and it won’t be repeated until 2013
It’s a 4000/6000 level course, it’s supposed to:
Be more challenging than a 3000 level course!
Be less well structured
Have a greater expectation on you
Have more student participation
Be fun!
13. Learning Objectives Parallel Architecture:
Basic issues concerning design and likely performance of parallel systems
Specific Systems:
Will make extensive use of NCI facilities
Programming Paradigms:
Distributed and shared memory, things in between, data intensive computing
Parallel Algorithms:
Numeric and non-numeric
The future
14. Commitment and Assessment The pieces
2 lectures per week (30 core lecture hours)
6 Labs (not marked, solutions provided)
2 assignments (40%)
1 mid-semester exam (~2 hours, 20%)
1 final exam (3 hours, 40%)
Final mark is the sum of the assignment, mid-semester exam and final exam marks
15. Lectures Two slots
Mon 14:00-16:00 ENGN T
Tue 15:00-16:00 ENGN T
Exact schedule on web site
Partial notes will be posted on the web site
Bring a copy to the lecture
Attendance at lectures and labs is strongly recommended
Attendance at labs will be recorded
16. Course Web Site http://cs.anu.edu.au/student/comp4300
We will use Wattle only for lecture recordings
17. Laboratories Start in week 3 (March 7th)
See web page for detailed schedule
2 sessions available
Tue 09:00-11:00 N115/N116
Wed 16:00-18:00 N114
Register via streams now
Not assessed, but will be examined
18. People Course Convener
Richard Brent
2000 Moran Building
Richard.Brent@anu.edu.au
Phone 6125 3873
19. Course Communication Course web page
cs.anu.edu.au/student/comp4300
Bulletin board (forum – available from streams)
cs.anu.edu.au/streams
At lectures and in labs
Email
comp4300@cs.anu.edu.au
In person
Office hours (to be set – see web page)
Email for an appointment if you want a specific time
20. Useful Books Principles of Parallel Programming, Calvin Lin and Lawrence Snyder, Pearson International Edition, ISBN 978-0-321-54942-6
Introduction to Parallel Computing, 2nd Ed., Grama, Gupta, Karypis, Kumar, Addison-Wesley, ISBN 0201648652 (Electronic version accessible on line from ANU library – search for title)
Parallel Programming: techniques and applications using networked workstations and parallel computers, Barry Wilkinson and Michael Allen. Prentice Hall 2nd edition. ISBN 0131405632.
and others on web page
21. Questions so far!?