
Fall 2008 CS 668 Parallel Computing


Presentation Transcript


  1. Fall 2008 CS 668 Parallel Computing • Prof. Fred Annexstein • fred.annexstein@uc.edu • Office Hours: 11-1 MW or by appointment • Tel: 513-556-1807

  2. Lecture 1: Welcome • Goals of this course • Syllabus, policies, grading • Blackboard Resources • LINC Linux cluster • Introduction/Motivation for HPPC • Scope of the Problems in Parallel Computing

  3. Goals • Primary: • Provide an introduction to the computing systems, programming approaches, and common numerical and algorithmic methods used for high-performance parallel computing • Secondary: • Have a course meeting the competency requirements of RRSCS • Provide hands-on parallel programming experience

  4. Official Syllabus Available on Blackboard • Textbook: Parallel Programming in C with MPI and OpenMP, Michael J. Quinn • Other Recommended Texts • Parallel Programming with MPI, Peter Pacheco • Introduction to Parallel Computing: Design and Analysis of Algorithms, Ananth Grama, Anshul Gupta, George Karypis, Vipin Kumar • Using MPI, 2nd Edition: Portable Parallel Programming with the Message-Passing Interface, William Gropp

  5. Workload/Grading • Exams (1 or 2) • Graded, 30% of grade • Written exercises (3-4) • May or may not be graded • Programming Assignments (3-4) • May be done in groups of at most 2 • MPI programming, performance measurement • Research papers (1) • Discussion of research questions, strengths, weaknesses, interesting points, contemporary bibliography • Final project (1) • Individual or group programming project and report

  6. Policies • Missed Exams: • Missed exams cannot be made up unless pre-approved. Please see the instructor as soon as possible in the event of a conflict. • Academic Honesty: • Plagiarism on assignments, quizzes, or exams will not be tolerated. See your student code of conduct (http://www.uc.edu/conduct/Code_of_Conduct.html) for more on the consequences of academic misconduct. There are no “small” offenses.

  7. Blackboard • Syllabus and my contact info • Announcements • Lecture slides • Assignment handouts • Web resources relevant to the course • Discussion board • Grades

  8. What is the Ralph Regula School? • The Ralph Regula School of Computational Science is a statewide, virtual school focused on computational science. It is a collaborative effort of the Ohio Board of Regents, Ohio Supercomputer Center, Ohio Learning Network and Ohio's colleges and universities. With funding from NSF, the school acts as a coordinating entity for a variety of computational science education activities aimed at making education in computational science available to students across Ohio, as well as to workers seeking continuing education about this technology. • Website: http://www.rrscs.org

  9. CS LINC Cluster • Michal Kouril’s links • http://www.ececs.uc.edu/~kourilm/clusters/ • See README file for instructions on running MPI code on beowulf.linc.uc.edu • Accounts • ECE/CS students should already have an account • I can request accounts for the non-ECE/CS students • Access • Remote access only, the cluster is in the ECE/CS server/machine room on the 8th floor of Rhodes, visible through windows in the 890’s hallway

  10. Why HPPC? • Who needs a roomful of computers anyway? • My PC and XBOX run at GFLOP rates (billions of floating-point operations per second) • NCSA TeraGrid IA-64 Linux Cluster (http://www.ncsa.uiuc.edu/UserInfo/Resources/Hardware/TGIA64LinuxCluster/)

  11. Needed by People who solve Science and Engineering problems • Materials / Superconductivity • Fluid Flow • Weather/Climate • Structural Deformation • Genetics / Protein interactions • Seismic • Many Research Projects in Natural Sciences and Engineering cannot exist without HPPC

  12. Applications • Videos – Applications in Physics and Geology • Simulation of Large-Scale Structure of Universe http://www.youtube.com/watch?v=8C_dnP2fvxk • Stability Simulation – http://www.youtube.com/watch?v=ZCMiLJOXrpc • Super Volcano Movie - Show first 1:00 minute http://www.youtube.com/watch?v=unGODG7N1Bs

  13. Why are the problems so large? • 3-Dimensional • If you want to increase the level of resolution by a factor of 10, the problem size increases by 10^3 • Many Length Scales (both time and space) • If you want to observe the interactions between very small local phenomena and larger, more global phenomena • The number of relationships between data items grows quadratically. • Example: the human genome has 3.2 G base pairs, which means about 5 × 10^18 (5,000,000,000,000,000,000) pairwise relations
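To see where the 5 × 10^18 figure comes from, here is a minimal C sketch (illustrative only, not part of the course materials) that reproduces the two scaling arguments above: cubic growth of a refined 3-D grid and quadratic growth of pairwise relations among n = 3.2 billion items.

```c
/* Illustrative scaling arithmetic for slide 13 (not course code). */
#include <stdio.h>

int main(void)
{
    double refine = 10.0;                            /* 10x finer resolution          */
    double grid_growth = refine * refine * refine;   /* a 3-D problem grows by 10^3   */

    double n = 3.2e9;                                /* human genome base pairs       */
    double pairs = n * (n - 1.0) / 2.0;              /* unordered pairwise relations  */

    printf("3-D problem size grows by a factor of %.0f\n", grid_growth);
    printf("Pairwise relations among %.1e items: about %.1e\n", n, pairs);
    return 0;
}
```

Running it prints a growth factor of 1000 and roughly 5.1e18 pairwise relations, matching the numbers on the slide.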

  14. How can you solve these problems? • Take advantage of parallelism • Large problems generally have many operations which can be performed concurrently • Parallelism can be exploited at many levels by the computer hardware • Within the CPU core, multiple functional units, pipelining • Within the Chip, many cores • On a node, multiple chips • In a system, many nodes

  15. However… • Parallelism has overheads • At the core and chip level the cost is complexity and money • Most applications get only a fraction of peak performance (10%-20%) • At the chip and node level, the memory bus can get saturated if there are too many cores • Between nodes, the communication infrastructure is typically much slower than the CPU

  16. Necessity Yields Modest Success • Power of CPUs keeps growing exponentially • Parallel programming environments change very slowly – parallel programming is much harder than sequential programming • Two standards have emerged • MPI library, for processes that do not share memory • OpenMP directives, for processes that do share memory

  17. Why MPI? • MPI = “Message Passing Interface” • Standard specification for message-passing libraries • Very Portable • Libraries available on virtually all parallel computers • Free libraries also available for networks of workstations or commodity clusters
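To make the message-passing model concrete, the sketch below is a minimal MPI program in C (illustrative, not taken from the course handouts). It is typically compiled with mpicc and launched with mpirun or mpiexec; see the cluster README for the exact procedure on beowulf.linc.uc.edu.

```c
/* Minimal MPI "hello" sketch in C: each process reports its rank. */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char *argv[])
{
    int rank, size;

    MPI_Init(&argc, &argv);                 /* start the MPI runtime            */
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);   /* this process's id                */
    MPI_Comm_size(MPI_COMM_WORLD, &size);   /* total number of processes        */

    printf("Hello from process %d of %d\n", rank, size);

    MPI_Finalize();                         /* shut the runtime down            */
    return 0;
}
```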

  18. Why OpenMP? • OpenMP is an application programming interface (API) for shared-memory systems • Based on a model of creating and scheduling multi-threaded computations • Supports higher-performance parallel programming of symmetric multiprocessors
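For comparison with the MPI example, here is a minimal OpenMP sketch in C (again illustrative, not course code). A single compiler directive distributes the loop iterations across the threads of a shared-memory machine and combines the per-thread partial results; compile with an OpenMP-capable compiler (e.g. the -fopenmp flag in GCC).

```c
/* Minimal OpenMP sketch: a data-parallel loop with a reduction. */
#include <omp.h>
#include <stdio.h>

#define N 1000000

static double a[N], b[N];

int main(void)
{
    double sum = 0.0;

    /* the directive splits iterations across threads and sums the partial results */
    #pragma omp parallel for reduction(+:sum)
    for (int i = 0; i < N; i++) {
        a[i] = 0.5 * i;
        b[i] = 2.0 * i;
        sum += a[i] * b[i];
    }

    printf("dot product = %e (up to %d threads available)\n",
           sum, omp_get_max_threads());
    return 0;
}
```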

  19. What are the Costs? • Commercial Parallel Systems • Relatively costly per processor • Primitive programming environments • Scientists looked for alternatives • Beowulf Concept circa 1994 • NASA project (led by Sterling and Becker) • Commodity processors • Commodity interconnect • Linux operating system • Message Passing Interface (MPI) library • High performance/$ for certain applications

  20. How are they Programmed? • Task Dependence Graph • Begin with a directed graph • Vertices = tasks, Edges = dependences • Edges are removed as tasks complete • Data Parallelism • Independent tasks apply the same operation to different elements of a data set • Functional Parallelism • Independent tasks apply different operations to different data elements • Pipelining • Divide a process into stages • Produce and consume several items simultaneously
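As a small illustration of the contrast between data and functional parallelism (a sketch assuming an OpenMP-capable compiler, not course code): the data-parallel loop shown after slide 18 applies the same operation to every array element, whereas OpenMP sections let independent tasks apply different operations to the data.

```c
/* Functional parallelism sketch: two independent tasks run as OpenMP sections. */
#include <omp.h>
#include <stdio.h>
#include <math.h>

#define N 100000

static double x[N], minval, sumsq;

int main(void)
{
    for (int i = 0; i < N; i++)
        x[i] = sin((double)i);

    #pragma omp parallel sections
    {
        #pragma omp section          /* task 1: find the minimum   */
        {
            minval = x[0];
            for (int i = 1; i < N; i++)
                if (x[i] < minval) minval = x[i];
        }
        #pragma omp section          /* task 2: sum of squares     */
        {
            sumsq = 0.0;
            for (int i = 0; i < N; i++)
                sumsq += x[i] * x[i];
        }
    }

    printf("min = %f, sum of squares = %f\n", minval, sumsq);
    return 0;
}
```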

  21. Why not just use a Compiler? • Parallelizing compiler - Detect parallelism in a sequential program • Produce a parallel executable program • Advantages • Can leverage millions of lines of existing serial programs • Saves time and labor • Requires no retraining of programmers • Sequential programming is easier than parallel programming • Disadvantages • Parallelism may be irretrievably lost when programs are written in sequential languages • Simple example: Compute all partial sums in an array • Performance of parallelizing compilers on a broad range of applications is still up in the air
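The partial-sums example is worth writing out. In the straightforward sequential C version below (a sketch, not course code), each iteration depends on the result of the previous one; this loop-carried dependence can hide the parallelism from a compiler, even though logarithmic-depth parallel prefix-sum algorithms exist.

```c
/* Sequential partial sums: sum[i] = a[0] + a[1] + ... + a[i]. */
#include <stdio.h>

#define N 8

int main(void)
{
    double a[N] = {1, 2, 3, 4, 5, 6, 7, 8};
    double sum[N];

    sum[0] = a[0];
    for (int i = 1; i < N; i++)
        sum[i] = sum[i - 1] + a[i];   /* depends on the previous iteration */

    for (int i = 0; i < N; i++)
        printf("%g ", sum[i]);
    printf("\n");
    return 0;
}
```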

  22. Can we Extend Existing Languages? • Programmer can give directives or clues to the compiler about how to parallelize • Advantages • Easiest, quickest, and least expensive • Allows existing compiler technology to be leveraged • New libraries can be ready soon after new parallel computers are available • Disadvantages • Lack of compiler support to catch errors • Easy to write programs that are difficult to debug

  23. Or Create New Parallel Languages? • Advantages • Allows the programmer to communicate parallelism to the compiler directly • Improves the probability that the executable will achieve high performance • Disadvantages • Requires development of new compilers • New languages may not become standards • Programmer resistance

  24. Where are we in 2008? • Performance makes low-level approaches popular • Augment existing languages with low-level parallel constructs and directives • MPI and OpenMP are prime examples • Advantages • Efficiency • Portability • Disadvantages • More difficult to program and debug

  25. Programming Assignment #1 • Log into beowulf.linc.uc.edu and run some simple sample programs.

  26. Reading Assignment #1 on Blackboard
