Explore the Task Parallel Library (TPL) API for concurrent processing in C#. Understand the concepts of data parallelism vs. task parallelism and their applications through examples such as password cracking using dictionary attacks.
Task Parallel Library (TPL) Higher level abstraction API for concurrency in C# Task parallel library
Data parallelism vs. task parallelism
Two ways to partition a “problem” into tasks
• Data parallelism (master-slave)
  • Partition the data, and give each task a part of the data set.
  • Works well if the data are independent
  • Master
    • Coordinates the work
    • ONE master
  • Slaves
    • Do the work
    • Many slaves
• Task parallelism (pipelining)
  • Partition the algorithm, and give each task a part of the algorithm.
  • Works well if each part of the algorithm takes about the same time to execute
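To make the pipelining idea concrete, here is a minimal sketch of a two-step pipeline. It is not taken from the slides: the class name `PipelineSketch`, the two steps (upper-casing a word, then measuring its length) and the queue capacity of 100 are all assumptions chosen for illustration. `BlockingCollection<T>` from `System.Collections.Concurrent` is used as the channel between the steps.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading.Tasks;

public static class PipelineSketch
{
    // Hypothetical two-step algorithm: step 1 upper-cases each word,
    // step 2 records the length of each word it receives.
    public static List<int> Run(IEnumerable<string> words)
    {
        var lengths = new List<int>(); // touched only by the step 2 task

        using (var queue = new BlockingCollection<string>(100))
        {
            // Step 1: does its part of the algorithm and sends data to step 2.
            Task step1 = Task.Run(() =>
            {
                try
                {
                    foreach (string word in words)
                        queue.Add(word.ToUpperInvariant());
                }
                finally
                {
                    queue.CompleteAdding(); // lets the loop in step 2 end
                }
            });

            // Step 2: receives data from step 1 as soon as it is available.
            Task step2 = Task.Run(() =>
            {
                foreach (string word in queue.GetConsumingEnumerable())
                    lengths.Add(word.Length);
            });

            Task.WaitAll(step1, step2);
        }
        return lengths;
    }
}
```

If one step is much slower than the other, the faster task mostly waits on the queue, which is why pipelining works best when the steps take about the same time.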
Example: Password cracking, dictionary attack
• A brute force algorithm
  • We need lots of computer resources to make it run fast
• We have a list of usernames + encrypted passwords
  • We know which encryption algorithm was used
  • There is no known decryption algorithm
• We want to find (some of) the passwords in clear text.
• We assume that (some) users have a password that is present in a dictionary
  • Or is a variation of a word from a dictionary
• General algorithm
  • Read a word from the dictionary
  • Make a lot of variations from the word, like 123word, word22, Word, etc.
  • Encrypt all the variations
  • Compare each encrypted word to all the encrypted passwords from the password file
  • If there is a match, we have found a password
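The general algorithm above can be sketched sequentially as follows. The slides do not say which encryption algorithm was used, so this sketch assumes unsalted SHA-256 stored as a hex string. The class `DictionaryAttackSketch`, its method names and the handful of variations are likewise hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Security.Cryptography;
using System.Text;

public static class DictionaryAttackSketch
{
    // Assumption: passwords were "encrypted" with unsalted SHA-256 (hex encoded).
    public static string Encrypt(string clearText)
    {
        using (SHA256 sha = SHA256.Create())
        {
            byte[] hash = sha.ComputeHash(Encoding.UTF8.GetBytes(clearText));
            return BitConverter.ToString(hash).Replace("-", "");
        }
    }

    // A few illustrative variations; a real cracker would generate many more.
    public static IEnumerable<string> Variations(string word)
    {
        yield return word;
        if (word.Length > 0)
            yield return char.ToUpperInvariant(word[0]) + word.Substring(1);
        yield return "123" + word;
        yield return word + "22";
    }

    // passwordFile maps user name -> encrypted password.
    // Returns user name -> clear-text password for every match found.
    public static Dictionary<string, string> Crack(
        IEnumerable<string> dictionary,
        IReadOnlyDictionary<string, string> passwordFile)
    {
        var found = new Dictionary<string, string>();
        foreach (string word in dictionary)                   // read a word
        {
            foreach (string candidate in Variations(word))    // make variations
            {
                string encrypted = Encrypt(candidate);        // encrypt the variation
                foreach (KeyValuePair<string, string> entry in passwordFile)
                {
                    if (entry.Value == encrypted)             // compare with password file
                        found[entry.Key] = candidate;         // match: password found
                }
            }
        }
        return found;
    }
}
```

Each dictionary word is handled independently of every other word, which is what makes this problem a good fit for the parallel versions discussed next.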
Example: Password cracking, dictionary attack
• Data parallelism
  • The dictionary is divided into a number of sub-sets.
  • Each sub-set is given to a task that performs all steps of the algorithm
  • No communication between tasks
• Task parallelism (pipelining)
  • The algorithm is divided into a number of steps.
  • Each task executes one step of the algorithm
  • Each step sends data to the next step
  • Lots of communication between tasks
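The data-parallel variant might look roughly like this. It is a sketch under assumptions: the class `DataParallelSketch`, the `taskCount` parameter and the caller-supplied `crackWord` delegate (standing in for "all steps of the algorithm" applied to one word) are hypothetical names, not part of the slides.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

public static class DataParallelSketch
{
    // Splits the dictionary into 'taskCount' sub-sets (taskCount must be >= 1).
    // Each task runs the whole per-word algorithm on its own sub-set.
    public static List<string> Run(
        IReadOnlyList<string> dictionary,
        int taskCount,
        Func<string, IEnumerable<string>> crackWord)
    {
        int chunkSize = (dictionary.Count + taskCount - 1) / taskCount; // round up
        var tasks = new List<Task<List<string>>>();

        for (int t = 0; t < taskCount; t++)
        {
            int start = t * chunkSize; // fresh variables per iteration, safe to capture
            int end = Math.Min(start + chunkSize, dictionary.Count);

            tasks.Add(Task.Run(() =>
            {
                var local = new List<string>(); // private to this task: nothing is shared
                for (int i = start; i < end; i++)
                    local.AddRange(crackWord(dictionary[i]));
                return local;
            }));
        }

        // The "master" only waits and merges; the tasks never talk to each other.
        Task.WaitAll(tasks.ToArray());
        return tasks.SelectMany(task => task.Result).ToList();
    }
}
```

Because every task collects matches in its own list, no locking is needed until the master merges the results at the end.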
Task Parallel Library (TPL)
• The “Task Parallel Library (TPL)” is part of the .NET class library used from C#
  • Namespace System.Threading.Tasks
• Some interesting TPL classes
  • Task
    • For task parallelism
    • We have used Task.Run(Action action) to ask the thread pool to run an action
    • But there is much more to the Task class …
  • Parallel
    • For data parallelism (and task parallelism)
Class Task, some methods and properties
• Task task = Task.Run(Action action)
  • Action is a void method with no parameters: void M() { … }
• Task task = new Task(Action action)
  • taskObject.Start()
  • Schedules the previously created taskObject to run on the thread pool
• Properties IsCompleted, IsFaulted, IsCanceled
• Property Id
• Task.CurrentId (static: the Id of the currently executing task)
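A minimal sketch of the members listed above. The class `TaskBasicsSketch` and its methods are hypothetical names used only for illustration.

```csharp
using System;
using System.Threading.Tasks;

public static class TaskBasicsSketch
{
    // An Action: a void method without parameters.
    public static void SayHello()
    {
        // Task.CurrentId is non-null only while code runs inside a task.
        Console.WriteLine("Hello from task " + Task.CurrentId);
    }

    public static bool Demo()
    {
        // Created and started in one step.
        Task first = Task.Run(() => SayHello());

        // Created now, started later.
        Task second = new Task(() => SayHello());
        bool completedBeforeStart = second.IsCompleted; // false: not started yet
        second.Start();

        Task.WaitAll(first, second);
        Console.WriteLine("Ids: " + first.Id + " and " + second.Id);

        // Both tasks ran to completion without exceptions or cancellation.
        return !completedBeforeStart
            && first.IsCompleted && second.IsCompleted
            && !first.IsFaulted && !second.IsFaulted
            && !first.IsCanceled && !second.IsCanceled;
    }
}
```

In most code `Task.Run` is the simpler choice. The separate constructor plus `Start()` is mainly useful when a task has to be created at one point and scheduled at another.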
Task with return values
• Task<TResult> task = Task.Run(Func<TResult> function)
  • Runs the function in the thread pool
  • Func (function) is a method that returns TResult and has no parameters
• Task<TResult> task = new Task<TResult>(Func<TResult> function)
  • Constructor
• The Result property
  • Contains the result of the task
  • Blocks until the result is ready
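As a small illustration of tasks that return values, consider the sketch below. `TaskResultSketch` and `SumTo` are hypothetical names invented for this example.

```csharp
using System;
using System.Threading.Tasks;

public static class TaskResultSketch
{
    // A Func<int> body: returns a value, takes no parameters once wrapped in a lambda.
    public static int SumTo(int n)
    {
        int sum = 0;
        for (int i = 1; i <= n; i++) sum += i;
        return sum;
    }

    public static int Demo()
    {
        // Created and started in one step.
        Task<int> running = Task.Run(() => SumTo(100));

        // Created with the constructor, then started explicitly.
        Task<int> created = new Task<int>(() => SumTo(10));
        created.Start();

        // Reading Result blocks the caller until each task has finished.
        return running.Result + created.Result; // 5050 + 55
    }
}
```

Because `Result` blocks, it is usually read as late as possible so that the caller can do other work while the tasks run.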
The class Parallel
• Very high level of abstraction
  • Level 1: Thread
  • Level 2: Task
  • Level 3: Parallel
• Parallel.Invoke(action, action, …, action)
  • Actions are invoked in parallel.
  • Returns when the last action has finished
  • Task parallelism: Different tasks run in parallel.
    • Tasks are generally not similar
    • Usually a few (large) tasks
  • Efficient when the actions need almost the same amount of time to complete
  • Example: Gaston Hillar: Professional Parallel Programming with C#, example 2_3
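The following sketch is not Hillar's example 2_3. It is a small stand-in with three unrelated, hypothetical computations that shows the shape of a Parallel.Invoke call.

```csharp
using System;
using System.Linq;
using System.Threading.Tasks;

public static class ParallelInvokeSketch
{
    private static bool IsPrime(int n)
    {
        if (n < 2) return false;
        for (int d = 2; d * d <= n; d++)
            if (n % d == 0) return false;
        return true;
    }

    public static int[] Demo()
    {
        var results = new int[3]; // each action writes only its own slot

        // Three different actions; Invoke returns when the last one has finished.
        Parallel.Invoke(
            () => results[0] = Enumerable.Range(2, 98).Count(n => IsPrime(n)),      // primes below 100
            () => results[1] = Enumerable.Range(1, 10).Sum(x => x * x),             // 1^2 + ... + 10^2
            () => results[2] = Enumerable.Range(1, 5).Aggregate(1, (a, b) => a * b) // 5!
        );

        return results; // { 25, 385, 120 }
    }
}
```

Since the call waits for the slowest action, the speed-up is best when the actions take roughly the same time.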
Parallel for loop
• Parallel.For(fromValue, toValue, Action<int>)
  • Similar to an ordinary for loop, but parallel.
  • Action is invoked for each value from fromValue (inclusive) up to toValue (exclusive)
  • Data parallelism: Each task is given a part of the data set. Tasks are similar.
    • Usually a lot of (small) tasks.
  • Efficient even if the tasks need different times to complete.
  • Example: Gaston Hillar: Professional Parallel Programming with C#, example 2_5
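A minimal Parallel.For sketch (again not Hillar's example 2_5). The class `ParallelForSketch` and the squares computation are assumptions chosen because every iteration is independent of the others.

```csharp
using System;
using System.Threading.Tasks;

public static class ParallelForSketch
{
    public static long[] Squares(int count)
    {
        var squares = new long[count];

        // The body runs once for each i in 0 (inclusive) .. count (exclusive).
        // Iterations may run in any order and on different threads.
        Parallel.For(0, count, i =>
        {
            squares[i] = (long)i * i; // each iteration writes only its own element
        });

        return squares;
    }
}
```

Calling `ParallelForSketch.Squares(5)` fills the array with 0, 1, 4, 9, 16. The array order is fixed by the index even though the execution order is not.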
Parallel foreach loop
• Parallel.ForEach(IEnumerable<T> list, Action<T> action)
  • Similar to an ordinary foreach loop, but parallel.
  • Action is invoked for each element in the IEnumerable (think “list” or “array”)
  • Example: Gaston Hillar: Professional Parallel Programming with C#, example 2_11
• Often used with a Partitioner
  • Partitioner divides a range of data into a number of sub-ranges
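The two usages above can be sketched as follows (not Hillar's example 2_11). The class and method names are hypothetical. `Partitioner.Create(int, int)` is the range partitioner from `System.Collections.Concurrent`, and `Interlocked.Add` is used because the running total is shared between threads.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

public static class ParallelForEachSketch
{
    // Plain Parallel.ForEach: the action runs once per element.
    public static int TotalLength(IEnumerable<string> words)
    {
        int total = 0;
        Parallel.ForEach(words, word =>
        {
            Interlocked.Add(ref total, word.Length); // shared variable: update atomically
        });
        return total;
    }

    // With a Partitioner: the action runs once per sub-range, not once per number.
    // Note: Partitioner.Create requires toExclusive > fromInclusive.
    public static long SumRange(int fromInclusive, int toExclusive)
    {
        long total = 0;
        Parallel.ForEach(Partitioner.Create(fromInclusive, toExclusive), range =>
        {
            long local = 0; // private to this sub-range
            for (int i = range.Item1; i < range.Item2; i++)
                local += i;
            Interlocked.Add(ref total, local); // one shared update per sub-range
        });
        return total;
    }
}
```

Working on sub-ranges keeps the per-element work inside an ordinary loop, so the overhead of invoking a delegate and touching shared state is paid once per sub-range instead of once per element.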
References and further reading
• MSDN: Task Parallel Library (TPL)
  • http://msdn.microsoft.com/en-us/library/dd460717(v=vs.110).aspx
• Joseph Albahari: Threading in C#, Part 5: Parallel Programming
  • The Parallel Class: http://www.albahari.com/threading/part5.aspx#_The_Parallel_Class
  • Task Parallelism: http://www.albahari.com/threading/part5.aspx#_Task_Parallelism
• Gaston C. Hillar: Professional Parallel Programming with C#, Master Parallel Extensions with .NET 4, Wrox/Wiley 2011
  • Chapter 2: Imperative Data Parallelism, pages 29-72
  • Chapter 3: Imperative Task Parallelism, pages 73-102