Parallel Computing Models & Techniques
About Me • Microsoft MVP • Intel Blogger • TechEd Israel, TechEd Europe • Expert C++ Book • http://AsyncOp.com • http://Asaf.Shelly.co.il
Parallel Computing • Multi-Core • Distributed Systems • SOA & WebServices • Transaction, Session, Queue, Event, Interrupt • User Experience over User Interface • Maximize performance: No Free Work Unit • Best performance: No I/O Wait
Advantages of Multi-Core • Low Power Consumption • Extended battery life • Less heat • Smaller and lighter devices • Software replaces custom hardware!
Signaling In Hardware [Diagram: CPU and RAM, with Error Detection] • Request: Read Address • Wait: Preparing Data • Response: Data
Signaling In Hardware [Diagram: CPU and RAM, with Error Detection] • Interrupt: Data Pending • Interrupt: Processing Complete
Software Locks? • Locks Are BAD! • By design, a lock forces serial work • The locked resource is used by a single core at a time • Use a lock only when you want to use 1 core at a time and eliminate parallel work • Locks can be used on single steps, for example the entrance to a queue
Locks Are BAD! • Can you find the bug?
    Lock( MUTEX_A )
    Buffer_A[ 12 ] = 23;
    // here Buffer_A[ 12 ] is 57 !!!!
    Unlock( MUTEX_A )
• Would you find it with a code review?
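The slide does not say what the bug is. One plausible reading, assumed here, is that some other code writes to the buffer without ever taking the mutex, so the lock protects nothing. The sketch below reproduces that reading in Python; `intruder`, `careful_writer`, and the two `Event` objects are hypothetical names, and the events exist only to force the unlucky timing on every run.

```python
import threading

buffer_a = [0] * 16
mutex_a = threading.Lock()
value_written = threading.Event()   # demo only: forces the interleaving
intruder_done = threading.Event()   # demo only: forces the interleaving
observed = []

def careful_writer():
    with mutex_a:                       # Lock( MUTEX_A )
        buffer_a[12] = 23
        value_written.set()
        intruder_done.wait()            # stand-in for an unlucky context switch
        observed.append(buffer_a[12])   # here buffer_a[12] is 57
    # Unlock( MUTEX_A ) happens on leaving the with-block

def intruder():
    value_written.wait()
    buffer_a[12] = 57                   # writes without ever taking mutex_a
    intruder_done.set()

threads = [threading.Thread(target=careful_writer),
           threading.Thread(target=intruder)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

The locked section looks correct in isolation, which is why a code review of that section alone would probably not catch the problem: the faulty line lives in whatever code skips the lock.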
Protecting A Resource • A lock is a way to share ownership • Using a single owner instead • Owner Thread • Owner Task • TPL Agent • Device Driver • Owner Service
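A minimal sketch of the "Owner Thread" idea, assuming the resource is a small buffer and that requests arrive as tuples on a queue. The message layout and the helper `ask` are hypothetical, not taken from the slides.

```python
import queue
import threading

def owner_thread(requests):
    """Sole owner of the buffer: no other thread ever touches it."""
    buffer_a = [0] * 16
    while True:
        msg = requests.get()
        if msg is None:                 # shutdown request
            break
        op, index, value, reply = msg
        if op == "write":
            buffer_a[index] = value
        reply.put(buffer_a[index])      # answer both reads and writes

requests = queue.Queue()
owner = threading.Thread(target=owner_thread, args=(requests,))
owner.start()

def ask(op, index, value=None):
    """Send one request to the owner and wait for its reply."""
    reply = queue.Queue(maxsize=1)
    requests.put((op, index, value, reply))
    return reply.get()

ask("write", 12, 23)
result = ask("read", 12)
requests.put(None)
owner.join()
```

Python's `queue.Queue` does use a lock internally, but it is confined to the entrance of the queue; the application code never locks the buffer itself.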
Asynchronous Work Without Locks • Phone as Synchronous System • Phone as Asynchronous System • Mail and Email System • Order Pizza
Unprotected Parallel Access To Data • Two Writers or Writer and Reader [Diagram: Writer, Writer, and Reader over a shared buffer of "A" cells]
Race Condition - Location • Two Writers or Writer and Reader [Diagram: Writer, Writer, and Reader over a shared buffer of "A" cells]
Race Condition – Timeframe • Collision over the same communication line [Diagram: Writer, Writer, and Reader sharing a single line carrying "A"]
Race Condition - Sequence • Bugs in Parallel Pipeline [Diagram: starting from the buffer "123", the steps Clear Buffer, Add ABCDE, Add X, and Add '1' To ASCII are applied in two different orders; values shown include 123ABCDE, ABCDE, ABCDEX, BCDEFY, and X] TCP, CJP, PF
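The four steps named on this slide can be replayed to show how much the result depends on their order. This is a sketch only: the "intended" order is an assumption inferred from the intermediate values ABCDE, ABCDEX, and BCDEFY, and the function names are hypothetical.

```python
def clear_buffer(buf):
    return ""

def add_abcde(buf):
    return buf + "ABCDE"

def add_x(buf):
    return buf + "X"

def add_one_to_ascii(buf):
    # Shift every character to the next code point: "A" -> "B", "X" -> "Y"
    return "".join(chr(ord(c) + 1) for c in buf)

def run(buf, steps):
    """Apply the steps to the buffer in the given order."""
    for step in steps:
        buf = step(buf)
    return buf

# Assumed intended order: "" -> ABCDE -> ABCDEX -> BCDEFY
intended = run("123", [clear_buffer, add_abcde, add_x, add_one_to_ascii])

# Same steps, but Add ABCDE overtakes Clear Buffer: 123ABCDE -> "" -> X -> Y
reordered = run("123", [add_abcde, clear_buffer, add_x, add_one_to_ascii])
```

Every step is correct on its own; only the sequence differs, which is what makes this kind of bug hard to see in a pipeline whose stages run in parallel.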
Race Condition Solutions • Wave-In Signal – Manager (ex. USB BUS) • Pass Ownership (Token Ring, MUTEX) • TDM • Burst Write, Retry Read (ex. SeqLock, Reader-Writer Lock, Network Layer 2) • Write and Verify • Queue • Transaction – A Sequence
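"Burst Write, Retry Read" can be sketched with a sequence counter, the idea behind a SeqLock. The class below is a single-threaded illustration under stated assumptions: exactly one writer, and no concern for the atomic operations and memory barriers a real implementation needs. Splitting the write into `begin_write` and `end_write` is done only so that a read can be attempted in the middle.

```python
class SeqLock:
    """Sequence counter: even means stable, odd means a write is in progress."""

    def __init__(self, value):
        self.seq = 0
        self.value = value

    def begin_write(self):
        self.seq += 1            # counter becomes odd

    def end_write(self, value):
        self.value = value
        self.seq += 1            # counter becomes even again

    def try_read(self):
        """Return (ok, value); the reader retries while ok is False."""
        before = self.seq
        value = self.value
        after = self.seq
        ok = before == after and before % 2 == 0
        return ok, value

cell = SeqLock("old")
cell.begin_write()
during = cell.try_read()         # a write is in progress: reader must retry
cell.end_write("new")
after = cell.try_read()          # stable again: the read is accepted
```

The writer never waits for readers; the cost of a collision is paid by the reader, who simply reads again.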
Serial Problem with Communication • Transaction based Ping-Pong [Diagram: Computer and USB Device exchange, in order: Packet Request, Packet Data, Acknowledge, Packet Request, Packet Data, Acknowledge]
Parallel Solution for Ping-Pong • Collected Transaction [Diagram: Computer and USB Device exchange, in order: Request List, Packet Data A, Packet Data B, Packet Data C, Retransmit B, Packet Data D, Ack List]
Cancel Operation • Search For File [Diagram: Computer and USB Device exchange, in order: Request List, Packet Data A, Packet Data B, Packet Data C, Acknowledge A, Packet Data D, Abort (Data found in B), Packet Data E, Acknowledge]
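The difference between the ping-pong exchange and the collected transaction can be made concrete by counting messages and line turnarounds. The sketch assumes the computer sends requests and acknowledges while the device sends data, and it leaves out the retransmission shown on the slide; all function names are hypothetical.

```python
def ping_pong(names):
    """One request, one data packet, one acknowledge per item."""
    log = []
    for name in names:
        log.append(("computer", "Packet Request " + name))
        log.append(("device", "Packet Data " + name))
        log.append(("computer", "Acknowledge " + name))
    return log

def collected(names):
    """One request list, a burst of data packets, one acknowledge list."""
    log = [("computer", "Request List")]
    log += [("device", "Packet Data " + name) for name in names]
    log.append(("computer", "Ack List"))
    return log

def turnarounds(log):
    """Count how often the sending side changes between adjacent messages."""
    return sum(1 for a, b in zip(log, log[1:]) if a[0] != b[0])

names = ["A", "B", "C", "D"]
serial = ping_pong(names)
batched = collected(names)
summary = {
    "serial": (len(serial), turnarounds(serial)),
    "batched": (len(batched), turnarounds(batched)),
}
```

For four packets the serial exchange uses 12 messages with 8 turnarounds, while the collected transaction uses 6 messages with 2 turnarounds. Each turnaround is a point where one side sits idle waiting for the other.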
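A sketch of the cancel flow, assuming the device is modeled as a thread that streams packets and checks an abort flag between sends. The packet payloads and the names `device` and `search` are hypothetical.

```python
import queue
import threading

def device(packets, outbox, cancel):
    """Stream packets until done or until the computer signals an abort."""
    for name, payload in packets:
        if cancel.is_set():
            break
        outbox.put((name, payload))
    outbox.put(None)                     # end of stream

def search(packets, wanted):
    outbox = queue.Queue()
    cancel = threading.Event()
    worker = threading.Thread(target=device, args=(packets, outbox, cancel))
    worker.start()
    found_in = None
    while True:
        item = outbox.get()
        if item is None:
            break
        name, payload = item
        if found_in is None and wanted in payload:
            found_in = name
            cancel.set()                 # Abort: data found
        # Packets sent before the abort took effect are drained and ignored.
    worker.join()
    return found_in

packets = [("A", "readme"), ("B", "needle.txt"), ("C", "notes"),
           ("D", "photos"), ("E", "music")]
hit = search(packets, "needle")
miss = search(packets, "missing")
```

As on the slide, the abort is asynchronous: packets after B may already be in flight when the abort arrives, so the computer has to tolerate and discard them.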
Object Oriented Design • Definition Of Objects • Object Relations • Object Reusability • Object Management • Object Oriented Block Diagram • Object Oriented System Design • Avoid “Spaghetti Code”
Procedural Design • Definition Of State • Procedure Relations • Procedure Reusability • Flow Control Management • Poor Block Diagram • Limited System Design • Avoid “Spaghetti Flow”
Good Application Design [Diagram: blocks User Interface, Business Logic, Infrastructure; labels Parallel, Serial, Parallel]
Good Application Design [Diagram: blocks Browser, Asp.Net, Web Server; labels Parallel, Serial, Parallel]
Queue • Pass Data Without Using Lock • Full Asynchronous Operation • Event With Data • Event With Priority • Event With Destination • Structured Event vs. Stream
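"Event With Data", "Event With Priority", and "Event With Destination" can be combined into one structured event. The sketch below assumes a lower number means higher priority and uses a sequence number to keep arrival order among equal priorities; the field names and destinations are hypothetical.

```python
import queue
from dataclasses import dataclass, field
from typing import Any

@dataclass(order=True)
class Event:
    priority: int                               # lower number is served first
    sequence: int                               # keeps arrival order on ties
    destination: str = field(compare=False)
    data: Any = field(compare=False)

events = queue.PriorityQueue()
events.put(Event(2, 0, "log", "started"))
events.put(Event(1, 1, "ui", "refresh"))
events.put(Event(2, 2, "log", "finished"))

delivered = {"log": [], "ui": []}
order = []
while not events.empty():
    ev = events.get()
    delivered[ev.destination].append(ev.data)   # route by destination
    order.append(ev.data)
```

Because each event carries its own boundaries, priority, and target, the receiver never has to parse a raw stream to find out where one message ends and the next begins.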
Flow Control • Keep Internal State • Object State • Execution Phase • Collection of State as System State • System State for Debug
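One way to read "Collection of State as System State" is that every object keeps its own execution phase and a snapshot of all phases is the system state. The sketch below assumes a three-phase lifecycle; `Phase`, `Worker`, and `system_state` are hypothetical names.

```python
from enum import Enum

class Phase(Enum):
    IDLE = "idle"
    RUNNING = "running"
    DONE = "done"

class Worker:
    # Allowed transitions; advancing past DONE raises KeyError
    TRANSITIONS = {Phase.IDLE: Phase.RUNNING, Phase.RUNNING: Phase.DONE}

    def __init__(self, name):
        self.name = name
        self.phase = Phase.IDLE          # object state

    def advance(self):
        self.phase = self.TRANSITIONS[self.phase]

def system_state(workers):
    """Collect every object's state into one snapshot for debugging."""
    return {w.name: w.phase.value for w in workers}

workers = [Worker("locate"), Worker("carry")]
workers[0].advance()
snapshot = system_state(workers)
```

A snapshot like this can be logged at any moment, which helps when debugging parallel code where a stack trace of one thread says little about the rest of the system.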
Task Management • Stack – Hardware Accelerated Management • Fork • Software Stack Management • Session • Task Groups
Software Dispatcher • .Net Parallel Extensions
Network Dispatcher • Load Balancing [Diagram: two Load Balance Front End blocks and three Firewall blocks]
Hardware Dispatcher • 10 Gbps Network Switch [Diagram: four 2.67 GHz COREs and a 10 GHz Network Dispatcher / MUX]
Cloud Dispatcher • Microsoft Server 2008 HPC [Diagram: a grid of Worker blocks; labels: GUI Dispatcher, Business Task Logic Interface, Application Dispatcher]
Task Oriented Design • Operation: Setting up a Tent • Task A: locate items in storage • Task B: carry items to build site • Task C: use items to build tent
Execution Timeline [Diagram: tasks A (Locate), B (Carry), C (Use) plotted over Time for the items Wires, Fabric, and Pole, ending in Output]
Horizontal Division [Diagram: tasks A, B, C plotted over Time]
Vertical Division [Diagram: tasks A, B, C plotted over Time]
Force Duplication • Entire Process • Sharing Resources • Flow Barriers • Simple to implement • Simple Affinity • Simple Priority • No Optimization
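A sketch of forced duplication applied to the tent example, assuming one worker runs the entire locate, carry, use process for one item. The task bodies are placeholders that only tag the item.

```python
from concurrent.futures import ThreadPoolExecutor

def locate(item):
    return item + ":located"

def carry(item):
    return item + ":carried"

def use(item):
    return item + ":used"

def set_up(item):
    """The entire process, duplicated on every worker."""
    return use(carry(locate(item)))

items = ["pole", "fabric", "wires"]
with ThreadPoolExecutor(max_workers=3) as pool:
    results = list(pool.map(set_up, items))   # map keeps the input order
```

Nothing in the process had to be redesigned, which is why this approach is simple to implement; the price is that all workers compete for the same storage and build site.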
Pipeline • Functional • Resources Ownership • Communication Barriers • Requires Design • Affinity Planning • Priority Planning • Optimization
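A sketch of the pipeline alternative for the same tent example, assuming one thread per function and a queue between neighbouring stages. The `STOP` marker and the `stage` helper are hypothetical, and the task bodies are placeholders that only tag the item.

```python
import queue
import threading

STOP = object()                      # end-of-stream marker

def stage(work, inbox, outbox):
    """One functional stage: it owns its step and talks only through queues."""
    while True:
        item = inbox.get()
        if item is STOP:
            outbox.put(STOP)         # pass the marker downstream and exit
            return
        outbox.put(work(item))

def locate(item):
    return item + ":located"

def carry(item):
    return item + ":carried"

def use(item):
    return item + ":used"

def run_pipeline(items):
    steps = [locate, carry, use]
    queues = [queue.Queue() for _ in range(len(steps) + 1)]
    threads = [threading.Thread(target=stage,
                                args=(work, queues[i], queues[i + 1]))
               for i, work in enumerate(steps)]
    for t in threads:
        t.start()
    for item in items:
        queues[0].put(item)
    queues[0].put(STOP)
    results = []
    while True:
        out = queues[-1].get()
        if out is STOP:
            break
        results.append(out)
    for t in threads:
        t.join()
    return results

results = run_pipeline(["pole", "fabric", "wires"])
```

With a single thread per stage the items stay in order, and each stage can be given its own affinity and priority. That is the planning effort the slide refers to.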
Super Networks and Grids • Multiple Reads • Multiple Writes • Replication Time • Replication Overhead • Network Consistency • Data Snapshot • Real Time