
Welcome to CSE160: Introduction to Parallel Computing

This course covers parallel programming basics, parallel processing concepts, and performance optimization techniques. Topics include threading, synchronization, memory hierarchies, and more. Prepare for quizzes, exams, and programming assignments.


Presentation Transcript


  1. Welcome to CSE160! Introduction to parallel computation

  2. Reading
  • Two required texts: http://goo.gl/SH98DC
  • An Introduction to Parallel Programming, by Peter Pacheco, Morgan Kaufmann, 2011
  • C++ Concurrency in Action: Practical Multithreading, by Anthony Williams, Manning Publications, 2012
  • Lecture slides are no substitute for reading the texts!
  • Complete the assigned readings before class
  • readings → pre-class quizzes → in-class problems → exams
  • Pre-class quizzes will be on TritonEd
  • In-class quizzes: register your clicker on TritonEd today!

  3. Background
  • Pre-requisite: CSE 100
  • Comfortable with C/C++ programming
  • If you took Operating Systems (CSE 120), you should be familiar with threads, synchronization, mutexes
  • If you took Computer Architecture (CSE 141), you should be familiar with memory hierarchies, including caches
  • We will cover these topics sufficiently to level the playing field

  4. Course Requirements
  • 4-5 programming assignments (45%)
  • Multithreading with C++11 + performance programming
  • Assignments shall be done in teams of 2
  • Potential new 5th assignment done individually
  • Exams (35%)
  • 1 Midterm (15%) + Final (20%)
  • midterm = (final > midterm) ? final : midterm (i.e. the final score replaces the midterm score if it is higher)
  • On-line pre-class quizzes (10%)
  • Class participation
  • Respond to 75% of clicker questions and you've participated in a lecture
  • No cell phone usage unless previously authorized
  • Other devices may be used for note-taking only

  5. Cell phones?!? Not in class (Sorry)!

  6. Policies
  • Academic Integrity
  • Do your own work
  • Plagiarism and cheating will not be tolerated
  • You are required to complete an Academic Integrity Scholarship Agreement (part of A0)

  7. Programming Labs
  • Bang cluster
  • Ieng6
  • Make sure your accounts work (stand by for information)
  • Software: C++11 threads
  • We will use Gnu 4.8.4
  • Extension students: add CSE 160 to your list of courses
  • https://sdacs.ucsd.edu/~icc/exadd.php

  8. Class presentation technique
  • I will assume that you've read the assigned readings before class
  • Consider the slides as talking points; class discussions are driven by your interest
  • Learning is not a passive process
  • Class participation is important to keep the lecture active
  • Different lecture modalities
  • The 2 minute pause
  • In-class problem solving

  9. The 2 minute pause
  • An opportunity in class to improve your understanding, to make sure you "got" it, by trying to explain it to someone else
  • Getting your mind actively working on it
  • The process:
  • I pose a question
  • You discuss with 1-2 neighbors
  • Important goal: understand why the answer is correct
  • After most seem to be done, I'll ask for quiet
  • A few will share what their group talked about
  • Good answers are those where you were wrong, then realized... or ask a question!
  • Please pay attention and quickly return to "lecture mode" so we can keep moving!

  10. Group Discussion #1: What is your background?
  • C/C++, Java, Fortran?
  • TLB misses
  • Multithreading
  • MPI
  • RPC
  • C++11 async
  • CUDA, OpenCL, GPUs
  • Abstract base class
  • [The slide also shows sample formulas, e.g. a Taylor expansion: f(a) + f'(a)(x-a)/1! + f''(a)(x-a)^2/2! + ...]

  11. The rest of the lecture
  • Introduction to parallel computation

  12. What is parallel processing?
  • Compute on simultaneously executing physical resources
  • Improve some aspect of performance
  • Reduce time to solution: multiple cores are faster than 1
  • Capability: tackle a larger problem, more accurately
  • Multiple processor cores co-operate to process a related set of tasks (tightly coupled)
  • What about distributed processing?
  • Less tightly coupled; unreliable communication and computation, changing resource availability
  • Contrast concurrency with parallelism
  • Correctness is the goal, e.g. database transactions
  • Ensure that shared resources are used appropriately

  13. Group Discussion #2: Have you written a parallel program?
  • Threads
  • C++11 async
  • OpenCL
  • CUDA
  • RPC
  • MPI

  14. Why study parallel computation?
  • Because parallelism is everywhere: cell phones, laptops, automobiles, etc.
  • If you don't use parallelism, you lose it!
  • Processors generally can't run at peak speed on 1 core
  • Many applications are underserved because they fail to use available resources fully
  • But there are many details affecting performance
  • The choice of algorithm
  • The implementation
  • Performance tradeoffs
  • The courses you've taken generally talked about how to do these things on 1 processing core only
  • Much of this changes on multiple cores

  15. How does parallel computing relate to other branches of computer science?
  • Parallel processing generalizes problems we encounter on single processor computers
  • A parallel computer is just an extension of the traditional memory hierarchy
  • The need to preserve locality, which prevails in virtual memory, cache memory, and registers, also applies to a parallel computer

  16. What you will learn in this class
  • How to solve computationally intensive problems on multicore processors effectively using threads: theory and practice
  • Programming techniques, including performance programming
  • Performance tradeoffs, esp. the memory hierarchy
  • CSE 160 will build on what you learned earlier in your career about programming, algorithm design and analysis

  17. The age of the multi-core processor
  • On-chip parallel computer
  • IBM Power4 (2001), Intel, AMD...
  • First dual core laptops (2005-6)
  • GPUs (nVidia, ATI): desktop supercomputer
  • In smartphones, behind the dashboard
  • blog.laptopmag.com/nvidia-tegrak1-unveiled
  • Everyone has a parallel computer at their fingertips
  • realworldtech.com

  18. Why is parallel computation inevitable?
  • Physical limitations on heat dissipation prevent further increases in clock speed
  • To build a faster processor, we replicate the computational engine
  • [Images: http://www.neowin.net/; Christopher Dyken, SINTEF]

  19. The anatomy of a multi-core processor
  • MIMD: each core runs an independent instruction stream
  • All share the global memory
  • 2 types, depending on uniformity of memory access times
  • UMA: Uniform Memory Access time, also called a Symmetric Multiprocessor (SMP)
  • NUMA: Non-Uniform Memory Access time

  20. Multithreading
  • How do we explain how the program runs on the hardware?
  • On shared memory, a natural programming model is called multithreading
  • Programs execute as a set of threads
  • Threads are usually assigned to different physical cores
  • Each thread runs the same code as an independent instruction stream
  • Same Program Multiple Data programming model = "SPMD"
  • Threads communicate implicitly through shared memory (e.g. the heap), but have their own private stacks
  • They coordinate (synchronize) via shared variables
  • (A short SPMD sketch follows below.)
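A minimal sketch of the SPMD model described above (my own illustration, not from the slides; the names NT, N, shared_a, and body are assumptions): every thread runs the same function, uses its thread ID to pick its own elements of a shared heap-allocated array, and keeps its loop variable on its private stack.

    #include <iostream>
    #include <thread>
    #include <vector>

    const int NT = 4;
    const int N  = 16;
    std::vector<int> shared_a(N);     // heap storage: visible to all threads

    void body(int TID) {
        // i has automatic storage: each thread gets its own copy on its own stack
        for (int i = TID; i < N; i += NT)
            shared_a[i] = TID;        // threads write disjoint elements, so no race
    }

    int main() {
        std::vector<std::thread> thrds;
        for (int t = 0; t < NT; t++)
            thrds.push_back(std::thread(body, t));
        for (auto &th : thrds) th.join();
        for (int v : shared_a) std::cout << v << ' ';   // prints 0 1 2 3 0 1 2 3 ...
        std::cout << std::endl;
        return 0;
    }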

  21. What is a thread?
  • A thread is similar to a procedure call, with notable differences
  • The control flow changes
  • A procedure call is "synchronous"; return indicates completion
  • A spawned thread executes asynchronously until it completes, and hence a return doesn't indicate completion
  • A new storage class: shared data
  • Synchronization may be needed when updating shared state (thread safety)
  • [Figure: a shared variable s lives in shared memory and is visible to all threads; each of P0, P1, ..., Pn holds a private copy of i (e.g. 8, 5, 2) in its own private memory]
  • (A short thread-safety sketch follows below.)
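A minimal sketch of the thread-safety point above (my own illustration, not from the slides; counter, counter_mtx, and work are assumed names): counter++ is not an atomic operation, so two threads racing on it can lose updates, and a std::mutex makes the shared update safe.

    #include <iostream>
    #include <mutex>
    #include <thread>

    long counter = 0;        // shared state
    std::mutex counter_mtx;  // protects counter

    void work() {
        for (int i = 0; i < 100000; i++) {
            std::lock_guard<std::mutex> lock(counter_mtx);  // acquired here, released at scope exit
            counter++;
        }
    }

    int main() {
        std::thread t0(work), t1(work);
        t0.join();
        t1.join();
        // With the mutex the result is always 200000; without it, it usually isn't.
        std::cout << "counter = " << counter << std::endl;
        return 0;
    }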

  22. CLICKERS OUT

  23. Which of these storage classes can never be shared among threads?
  A. Globals declared outside any function
  B. Local automatic storage
  C. Heap storage
  D. Class members (variables)
  E. B & C

  24. Why threads?
  • Processes are "heavyweight" objects scheduled by the OS
  • Protected address space, open files, and other state
  • A thread, AKA a lightweight process (LWP)
  • Threads share the address space and open files of the parent, but have their own stack
  • Reduced management overheads, e.g. thread creation
  • Kernel scheduler multiplexes threads
  • [Figure: one process with a shared heap and a separate stack per thread, running on processors P, P, ..., P]

  25. Parallel control flow
  • Parallel program: start with a single root thread
  • Fork-join parallelism to create concurrently executing threads
  • Threads communicate via shared memory
  • A spawned thread executes asynchronously until it completes
  • Threads may or may not execute on different processors
  • [Figure: a shared heap with a private stack per thread, running on processors P, P, ..., P]

  26. What forms of control flow do we have in a serial program?
  A. Function call
  B. Iteration
  C. Conditionals (if-then-else)
  D. Switch statements
  E. All of the above

  27. Multithreading in Practice
  • C++11 threads
  • POSIX Threads "standard" (pthreads): IEEE POSIX 1003.1c-1995
  • Low level interface
  • Beware of non-standard features
  • OpenMP: program annotations
  • Java threads: not used in high performance computation
  • Parallel programming languages
  • Co-array FORTRAN
  • UPC
  • (A small OpenMP sketch follows below.)
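A minimal sketch of OpenMP's annotation style mentioned above (my own illustration, not from the slides; the array a and size N are assumptions): a single pragma asks an OpenMP-capable compiler (e.g. g++ -fopenmp) to divide the loop iterations among a team of threads.

    #include <cstdio>
    #include <vector>

    int main() {
        const int N = 1000000;
        std::vector<double> a(N);
        // The annotation below is the whole parallelization: the compiler and
        // runtime split the iterations across threads automatically.
        #pragma omp parallel for
        for (int i = 0; i < N; i++)
            a[i] = 2.0 * i;
        std::printf("a[42] = %g\n", a[42]);
        return 0;
    }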

  28. C++11 Threads
  • Via <thread>, C++ supports a threading interface similar to pthreads, though a bit more user friendly
  • Async is a higher level interface suitable for certain kinds of applications
  • New memory model
  • Atomic template
  • Requires a C++11 compliant compiler, gnu 4.7+, etc.
  • (A short async/atomic sketch follows below.)
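A minimal sketch of the two facilities named above (my own illustration, not from the slides; count_even and hits are assumed names): std::async runs a callable asynchronously and returns a std::future, and std::atomic<T> allows race-free updates to a shared counter without a mutex.

    #include <atomic>
    #include <future>
    #include <iostream>

    std::atomic<long> hits{0};   // shared counter, safe to update from many tasks

    long count_even(long lo, long hi) {
        long local = 0;
        for (long i = lo; i < hi; i++)
            if (i % 2 == 0) local++;
        hits += local;           // atomic read-modify-write
        return local;
    }

    int main() {
        // Launch two asynchronous tasks; get() waits for and returns each result.
        auto f1 = std::async(std::launch::async, count_even, 0L, 500000L);
        auto f2 = std::async(std::launch::async, count_even, 500000L, 1000000L);
        std::cout << "total = " << f1.get() + f2.get()
                  << " (atomic hits = " << hits << ")\n";
        return 0;
    }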

  29. Hello world with <thread>

    #include <iostream>
    #include <thread>
    using namespace std;

    const int NT = 3;   // number of threads (the original program presumably sets this at run time)

    void Hello(int TID) {
        cout << "Hello from thread " << TID << endl;
    }

    int main(int argc, char *argv[]) {
        thread *thrds = new thread[NT];
        // Spawn threads
        for (int t = 0; t < NT; t++)
            thrds[t] = thread(Hello, t);
        // Join threads
        for (int t = 0; t < NT; t++)
            thrds[t].join();
        delete [] thrds;
    }

  Sample runs (the ordering varies from run to run, and lines can interleave because the threads share cout):

    $ ./hello_th3
    Hello from thread 0
    Hello from thread 1
    Hello from thread 2
    $ ./hello_th3
    Hello from thread 1
    Hello from thread 0
    Hello from thread 2
    $ ./hello_th4
    Running with 4 threads
    Hello from thread 0
    Hello from thread 3
    Hello from thread Hello from thread 21

  30. Steps in writing multithreaded code
  • We write a thread function that gets called each time we spawn a new thread
  • Spawn threads by constructing objects of class thread (in the C++ library)
  • Each thread runs on a separate processing core (if more threads than cores, the threads share cores)
  • Threads share memory; declare shared variables outside the scope of any functions
  • Divide up the computation fairly among the threads
  • Join threads so we know when they are done
  • (A sketch putting these steps together follows below.)
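A minimal sketch following the steps above (my own illustration, not from the slides; NT, N, data, partial, and sumBlock are assumed names): a thread function, shared data declared outside any function, a fair block division of the work, spawn, then join.

    #include <iostream>
    #include <thread>
    #include <vector>

    const int NT = 4;                 // number of threads (assumed fixed here)
    const int N  = 1 << 20;
    std::vector<double> data(N, 1.0); // shared input, visible to every thread
    double partial[NT];               // one slot per thread, so no synchronization is needed

    void sumBlock(int TID) {
        // Each thread sums its own contiguous block of the shared array.
        int chunk = N / NT;
        int lo = TID * chunk;
        int hi = (TID == NT - 1) ? N : lo + chunk;   // last thread takes the remainder
        double s = 0.0;
        for (int i = lo; i < hi; i++) s += data[i];
        partial[TID] = s;
    }

    int main() {
        std::vector<std::thread> thrds;
        for (int t = 0; t < NT; t++)             // spawn
            thrds.push_back(std::thread(sumBlock, t));
        for (auto &th : thrds) th.join();        // join
        double total = 0.0;
        for (int t = 0; t < NT; t++) total += partial[t];
        std::cout << "sum = " << total << std::endl;   // expect N = 1048576
        return 0;
    }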

  31. Summary of today's lecture
  • The goal of parallel processing is to improve some aspect of performance
  • The multicore processor has multiple processing cores sharing memory, a consequence of technological factors
  • We will employ multithreading in this course to "parallelize" applications
  • We will use the C++ threads library to manage multithreading

  32. Next Time
  • Multithreading
  • Be sure your clicker is registered
  • By Friday at 6pm: do Assignment #0
  • cseweb.ucsd.edu/classes/wi17/cse160-a/HW/A0.html
  • Establish that you can log in to bang and ieng6
  • cseweb.ucsd.edu/classes/wi17/cse160-a/lab.html
