Introduction to Parallel Processing

Introduction to Parallel Processing, Lecture 4, 12.11.2001

Topics • MPE • Embarrassingly Parallel Computations • Fractals • Monte Carlo • Non-Interacting and Interacting Particle Simulation (homework) • Partitioning and Divide-and-Conquer • Homework: Assignment #2

Office hours • Mondays, immediately after class, between 11:00 and 12:00, in room 126 of the Electrical and Computer Engineering building (next to the department office).

Midterm quiz • Will be held on Friday, 7/12/2001 • The time has not been set yet • Duration: up to two hours

Final project • You are invited to propose project topics • A list of topics will be published in lecture 5 (next week) • Topic selection must be finalized within one week • Topics from the list are assigned on a first-come, first-served basis

Final project - continued • In the last two lessons the projects will be presented in class by their authors. • Please prepare a PowerPoint presentation. • Presentation length: about 10 minutes. • The project grade combines the grade given for the presentation and the grade for the final product, which must be submitted by 18/2/2002.

MPE/MPI routines • Online documentation of the routines: http://www-unix.mcs.anl.gov/mpi/www/ • The book MPI – The Complete Reference is also available online: http://www.netlib.org/utk/papers/mpi-book/mpi-book.html

MPE – Multi-Processing Environment • A set of graphics routines located in a library called MPE. • Interface to X windows. • Any program that uses MPE must include "mpe.h". • Compilation: mpicc fn.c -o fn -lmpe -lX11 -lm

MPE – Useful routines – 1/2
• MPE_Open_graphics - create a graphics window
• MPE_Close_graphics - destroy a graphics window
• MPE_Draw_point - draw a point in a window
• MPE_Draw_points - draw a series of points in a window (moderately faster than a series of MPE_Draw_point calls)
• MPE_Draw_line - draw a line in a window
• MPE_Fill_rectangle - draw a filled rectangle in a window
• MPE_Update - flush the buffer for a window

MPE – Useful routines – 2/2
• MPE_Get_mouse_press - wait until the user presses a mouse button and return the press point
• MPE_Get_mouse_status (in mouse_status.c) - get information about the mouse state
• MPE_Drag_square (in mouse_status.c) - let the user select a square on the screen
• MPE_Make_color_array - create a nice spectrum of colors

MPE Logging Routines
• MPE_Init_log must be called by all processes to initialize the MPE logging data structures.
• MPE_Finish_log collects the log data from all the processes, merges it, and aligns the timestamps with respect to the times at which MPE_Init_log and MPE_Finish_log were called. Then the process with rank 0 in MPI_COMM_WORLD writes the log into the file whose name is given as an argument to MPE_Finish_log.

MPE Logging – An Example
/usr/local/mpich/examples/basic/cpilog.c
...
if (myid == 0) {
    MPE_Describe_state(1, 2, "Broadcast", "red:vlines3");
    MPE_Describe_state(3, 4, "Compute", "blue:gray3");
    MPE_Describe_state(5, 6, "Reduce", "green:light_gray");
    MPE_Describe_state(7, 8, "Sync", "yellow:gray");
}
...

MPE Logging – An Example
Later in the program…
MPI_Barrier(MPI_COMM_WORLD);
MPE_Start_log();
MPE_Log_event(1, 0, "start broadcast");
MPI_Bcast(&n, 1, MPI_INT, 0, MPI_COMM_WORLD);
MPE_Log_event(2, 0, "end broadcast");
At the end of the program:
MPE_Stop_log();
MPE_Finish_log("cpilog.log");
MPI_Finalize();

Error Messages
• SIGABRT - Abnormal termination of the program (such as a call to abort).
• SIGFPE - An erroneous arithmetic operation, such as a divide-by-zero or an operation resulting in overflow.
• SIGILL - Detection of an illegal instruction.
• SIGINT - Receipt of an interactive attention signal.
• SIGSEGV - An invalid access to storage.
• SIGTERM - A termination request sent to the program.

More on the communication modes • The next 4 slides are taken from: http://www.rug.nl/hpc/people/arnold/MPI

Send/Recv

Collective communication

Barry Wilkinson and Michael Allen Transparencies (Figures and Text) PDF files: http://www.cs.uncc.edu/~abw/parallel/par_prog/resources/transnew.html

Embarrassingly Parallel Computations • Chapter 3 from: Parallel Programming: Techniques and Applications Using Networked Workstations and Parallel Computers, Barry Wilkinson and Michael Allen, © Prentice Hall, 1999. • There is a link on the course web site to the book web site.

Embarrassingly Parallel Computations • Go to PDF presentation

Exercise: Monte Carlo Calculation of π • Goals of the exercise: 1. Using different communicators 2. An example of client/server 3. Illustrating Embarrassingly Parallel Computations • Reference: "Using MPI", W. Gropp, Ewing Lusk, Anthony Skjellum, Chapter 3. • Source file: monte.c
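As a rough illustration of why this computation is embarrassingly parallel, the sketch below estimates π by counting random points that fall inside a quarter circle. It is a serial stand-in, not the monte.c program from the reference: the function names, the seeds, and the small generator `next_uniform` are all assumptions made for this example, and the loop over `rank` only imitates what independent MPI processes would do before their hit counts are summed (for instance with a reduction).

```c
#include <stdint.h>

/* Hypothetical helper: 64-bit linear congruential generator.
   Returns a double in [0, 1) built from the top 53 bits of the state. */
static double next_uniform(uint64_t *state)
{
    *state = *state * 6364136223846793005ULL + 1442695040888963407ULL;
    return (double)(*state >> 11) / 9007199254740992.0; /* divide by 2^53 */
}

/* Count how many of n random points in the unit square fall inside
   the quarter circle x^2 + y^2 <= 1. Each worker can run this alone. */
unsigned long count_hits(unsigned long n, uint64_t seed)
{
    uint64_t state = seed;
    unsigned long hits = 0;
    for (unsigned long i = 0; i < n; i++) {
        double x = next_uniform(&state);
        double y = next_uniform(&state);
        if (x * x + y * y <= 1.0)
            hits++;
    }
    return hits;
}

/* Combine the independent chunks. In an MPI program every rank would
   call count_hits once and the partial counts would be summed. */
double estimate_pi(int workers, unsigned long points_per_worker)
{
    unsigned long total_hits = 0;
    for (int rank = 0; rank < workers; rank++) {
        /* Assumption: a distinct seed per rank keeps the streams apart. */
        uint64_t seed = 12345u + 1000003u * (uint64_t)rank;
        total_hits += count_hits(points_per_worker, seed);
    }
    return 4.0 * (double)total_hits
               / ((double)workers * (double)points_per_worker);
}
```

Calling `estimate_pi(4, 250000)` uses one million points in total; the workers never exchange data until the final sum, which is the defining property of an embarrassingly parallel computation.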

MPI_GROUP_EXCL The MPI_GROUP_EXCL routine creates a new group from an existing group and specifies member processes (by exclusion).

Top 500 • Updated list (November 10th, 2001) • SC2001 Conference • www.top500.org

Installation Type   Count   Share    Rmax [GF/s]   Rpeak [GF/s]   Processors
Industry            261     52.2 %   38155.44      57191          41140
Research            112     22.4 %   57738.71      85150          78802
Academic            84      16.8 %   29159.18      40706          27936
Classified          29      5.8 %    6625.14       10308          9316
Vendor              12      2.4 %    3045.94       4408           4112
Government          2       0.4 %    253.10        372            368
Total               500     100 %    134977.51     198135         161674

Installation Area   Count   Share    Rmax [GF/s]   Rpeak [GF/s]   Processors
N/A                 242     48.4 %   75005.67      107946         92337
Telecomm            56      11.2 %   7494.04       10958          7646
Finance             33      6.6 %    5016.94       7193           5810
Weather             26      5.2 %    11627.00      16130          11656
Automotive          18      3.6 %    2290.10       3885           2668
Database            16      3.2 %    2794.60       4031           2674
Transportation      16      3.2 %    2195.96       3442           2110
Electronics         13      2.6 %    1550.70       2243           1452
Geophysics          13      2.6 %    2354.70       3864           2774
Aerospace           13      2.6 %    3840.40       5275           4775
Energy              10      2 %      12222.90      20447          17416
WWW                 9       1.8 %    1873.14       2909           2912
Infor. Service      6       1.2 %    740.40        1057           480
Chemistry           6       1.2 %    1020.10       1698           1996
In.Pr. Service      6       1.2 %    876.80        1307           768
Manufacturing       6       1.2 %    937.56        1356           982
Mechanics           5       1 %      1092.70       1578           1418
Pharmaceutics       2       0.4 %    701.00        1008           672
Benchmarking        2       0.4 %    690.60        875            536
Defense             1       0.2 %    553.00        792            528
Software            1       0.2 %    99.20         141            64
Total               500     100 %    134977.51     198135         161674

Computer Family      Count   Share    Rmax [GF/s]   Rpeak [GF/s]   Processors
HP SPP               153     30.6 %   19867.90      29325          13432
IBM SP               151     30.2 %   48294.01      74345          58886
SGI Origin           39      7.8 %    10288.00      15025          22748
T3E/T3D              39      7.8 %    14083.17      21015          22378
NOW                  29      5.8 %    8799.07       15227          14637
Sun UltraHPC         24      4.8 %    3050.16       4265           5352
Fujitsu VPP          16      3.2 %    5469.60       5932           1052
NEC Vector           15      3 %      3726.40       3936           668
Hitachi SR8xxx       14      2.8 %    8092.40       9750           1892
Compaq AlphaServer   13      2.6 %    9917.10       14079          7825
IBM S80s             2       0.4 %    306.80        723            804
pp2                  2       0.4 %    220.00        475            256
prim                 1       0.2 %    115.70        217            64
Hitachi SR2xxx       1       0.2 %    368.20        614            2048
Intel Paragon        1       0.2 %    2379.00       3207           9632
Total                500     100 %    134977.51     198135         161674

Countries            Count   Share    Rmax [GF/s]   Rpeak [GF/s]   Processors
USA                  230     46 %     77848.11      120172         109681
Germany              59      11.8 %   12054.08      18067          14734
Japan                57      11.4 %   18383.52      22957          11896
UK                   34      6.8 %    7236.12       9987           7038
France               23      4.6 %    3892.40       5225           3565
Korea                17      3.4 %    2478.60       3509           1740
Italy                11      2.2 %    1348.43       1884           1278
Canada               11      2.2 %    1622.40       2168           1681
Netherlands          10      2 %      1835.00       2503           2190
Sweden               5       1 %      897.34        1395           1480
Australia            4       0.8 %    1206.51       1573           800
Finland              4       0.8 %    809.40        1161           924
Taiwan               4       0.8 %    634.90        825            385
Austria              3       0.6 %    311.20        447            268
Venezuela            3       0.6 %    299.00        423            192
Saudi Middle East    2       0.4 %    772.00        1080           720
Norway               2       0.4 %    240.20        317            284
Switzerland          2       0.4 %    238.90        320            256
China                2       0.4 %    199.10        282            128
Spain                2       0.4 %    231.20        333            192
Brazil               2       0.4 %    392.50        564            256
Denmark              2       0.4 %    230.00        333            222
Mexico               2       0.4 %    218.00        294            256
Belgium              2       0.4 %    295.40        423            192
Singapore            1       0.2 %    99.20         141            64
Luxembourg           1       0.2 %    137.10        204            256
Puerto Rico          1       0.2 %    99.90         141            64
New Zealand          1       0.2 %    115.90        168            140
Portugal             1       0.2 %    301.00        444            296
Hong Kong            1       0.2 %    441.00        636            424
Russian Federation   1       0.2 %    109.10        159            72
Total                500     100 %    134977.51     198135         161674

Continents        Count   Share    Rmax [GF/s]   Rpeak [GF/s]   Processors
USA/Canada        241     48.2 %   79470.51      122340         111362
Europe            162     32.4 %   30166.87      43202          33247
Japan             57      11.4 %   18383.52      22957          11896
South-East Asia   25      5 %      3852.80       5393           2741
South America     8       1.6 %    1009.40       1422           768
Australia         5       1 %      1322.41       1741           940
Middle East       2       0.4 %    772.00        1080           720
Total             500     100 %    134977.51     198135         161674

Highlights from the Top 10 • ASCI White is again #1, with unchanged performance of 7.2 TFlop/s on the Linpack • Three systems are new in the TOP10: • The number of systems exceeding the 1 TFlop/s mark on the Linpack is up to 16 • 30 systems have peak performance above 1 TFlop/s, including two self-made clusters: • CPlant at Sandia at #30 and • Titan at NCSA at #31 • 1.29 TFlop/s is the entry point for the Top 10 • 5 of the TOP10 systems are from IBM, 2 from Compaq, and one each from Intel, Hitachi, and SGI. • 8 of the TOP10 systems are installed in the US, and one each in Japan and Germany.

General highlights from the Top 500 • The new entry level of 94.3 GF/s would have been enough to be listed at position 295 in the last TOP500 just 6 months ago • Total accumulated performance is 134.4 TFlop/s compared to 108.8 TFlop/s 6 months ago • Entry level is now 94.3 GF/s compared to 67.8 GF/s 6 months ago. In June 1993 no system exceeded this limit. From the Nov 1993 list only the "Numerical Wind Tunnel" at NAL in Japan exceeded this limit. This system is actually still on the list, with improved hardware and performance, at position 130. • The entry point for the top 100 moved from 241 GF/s to 300 GF/s. • The first Itanium cluster is at the NCSA at #39 with 678 GF/s. • The first Windows 2000 cluster is at the Cornell Theory Center at #320 with 121 GF/s.

Partitioning and Divide-and-Conquer Go to PDF presentation

Homework Assignment #2 • Due by 26.11.2001 (lecture 6), i.e., within two weeks. • It counts for about 15% of the final grade.

Homework Assignment #2 • Goals: • Practice the MPE graphics environment together with a parallel MPI program • Practice an Embarrassingly Parallel Computation • Practice simulating particle motion

Homework Assignment #2 - continued • Topic of the assignment: particle simulation • The assignment consists of 3 parts

Homework Assignment #2 - continued • Part 1: graphics work in the MPE environment • Run the pmandel program located at /usr/local/mpich/mpe/contrib/mandel/pmandel • You need to work in a graphical environment, i.e., with X-windows. • Read the Readme and look at the source files to learn the basic MPE routines. • Play with different inputs and zoom into the image with the mouse.

Homework Assignment #2 - continued

Homework Assignment #2 - continued • Part 2: simulation of particles in a box • The particles move inside the box and are reflected from the walls • At this stage there is no interaction between the particles • This simulates photons between mirrors • Collisions with the mirrors are elastic • Draw the balls, with each ball controlled by a different processor • Use the MPE graphics to draw the solution in an X window

Homework Assignment #2 - continued • The problem is two-dimensional • Use a small number of particles (~5) • Give each particle a different color
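The wall reflection in Part 2 can be sketched as a single time step per particle. The `Particle` struct, the function names, and the box spanning [0, width] × [0, height] are assumptions made for this example; the sketch also assumes the time step is small enough that a particle crosses at most one wall per axis in one step. Drawing with MPE is left out on purpose.

```c
/* Hypothetical particle record: position and velocity in 2D. */
typedef struct {
    double x, y;
    double vx, vy;
} Particle;

/* Fold one coordinate back into [0, limit] and flip its velocity
   component, which is what an elastic bounce off a wall does. */
static void reflect_axis(double *pos, double *vel, double limit)
{
    if (*pos < 0.0) {
        *pos = -*pos;
        *vel = -*vel;
    } else if (*pos > limit) {
        *pos = 2.0 * limit - *pos;
        *vel = -*vel;
    }
}

/* Advance one particle by dt inside a width x height box. */
void step_particle(Particle *p, double dt, double width, double height)
{
    p->x += p->vx * dt;
    p->y += p->vy * dt;
    reflect_axis(&p->x, &p->vx, width);
    reflect_axis(&p->y, &p->vy, height);
}
```

Because the particles do not interact at this stage, each processor can call `step_particle` on its own particle without any communication, so Part 2 is itself an embarrassingly parallel computation.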

Homework Assignment #2 - continued

Homework Assignment #2 - continued • Part 3: adding interaction between the particles • This simulates billiard balls • The collisions are elastic • Simplifying conditions: all particles have the same mass and the same speed (magnitude of velocity)

Homework Assignment #2 - continued • Further clarifications regarding the assignment

Homework Assignment #2 - continued • A different color for each ball • This time there is communication between the processors • Food for thought: partitioning space among the processors versus assigning processors to specific particles

Homework Assignment #2 - continued • Elastic scattering of particles with equal masses and equal speeds (figure panels: center-of-mass frame; laboratory frame)

Homework Assignment #2 - continued • In the center-of-mass frame the particles have velocity in the y direction only. After the collision they reverse direction.

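Homework Assignment #2 - continued • Notation: $V$ is the particle's velocity. $R$ is the rotation matrix from the laboratory frame to the center-of-mass (CM) frame. $R^{-1}$ is the rotation back to the laboratory frame. $C$ is the collision operation in the CM frame.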

Homework Assignment #2 - continued • To obtain the velocity vector after the collision, apply the three transformations in succession to the velocity vector before the collision: $v_{\text{after}} = R^{-1} C R \, v_{\text{before}}$
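One way to write the three transformations explicitly is given below. The angle $\alpha$ (the rotation of the CM axes relative to the laboratory axes) and the sign convention of $R$ are assumptions of this sketch, as is the choice of $C$ as the operation that reverses the $y$ component and leaves $x$ unchanged.

```latex
R = \begin{pmatrix} \cos\alpha & \sin\alpha \\ -\sin\alpha & \cos\alpha \end{pmatrix},
\qquad
C = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix},
\qquad
R^{-1} = \begin{pmatrix} \cos\alpha & -\sin\alpha \\ \sin\alpha & \cos\alpha \end{pmatrix}

R^{-1} C R =
\begin{pmatrix} \cos 2\alpha & \sin 2\alpha \\ \sin 2\alpha & -\cos 2\alpha \end{pmatrix}
```

Under these assumptions the product is a reflection about the line that makes an angle $\alpha$ with the laboratory $x$ axis, so a velocity pointing at angle $\theta$ leaves the collision pointing at angle $2\alpha - \theta$ with its magnitude unchanged.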

Homework Assignment #2 - continued • By what angle is the CM frame rotated relative to the laboratory frame?

Homework Assignment #2 - continued • Whoever carries out the computation $R^{-1} C R$ for equal masses and equal speeds will find that elastic scattering amounts to exchanging the direction angles of the two colliding particles! • 5 bonus points (counted only in your favor) for anyone who writes a computer program for general elastic scattering with different masses and different velocities!
