140 likes | 261 Views
Processing Framework. Sytse van Geldermalsen Masters Grid Computing – University of Amsterdam Internship at Amsterdam Medical Centre. Contents. OpenCL Concepts Problems Research and projects Processing Framework Example. OpenCL. Requires vendor support
E N D
Processing Framework Sytse van Geldermalsen Masters Grid Computing – University of Amsterdam Internship at Amsterdam Medical Centre
Contents • OpenCL • Concepts • Problems • Research and projects • Processing Framework • Example
OpenCL • Requires vendor support • ARM, AMD, Intel, Apple, Vivante Corporation, STMicroelectronics International NV, IBM Corporation, Imagination Technologies, Creative Labs, NVIDIA • Portable • Works on heterogeneous architecture • Provides great computational power http://www.khronos.org/opencl/
How much computational power? http://www.r-bloggers.com/cpu-and-gpu-trends-over-time/
Key Concepts • OpenCL - Runtime system • Kernel • Accelerated Device
OpenCL Kernel // Sequential c/c++ code for(intx=0;x<1024;x++) { for(inty=0;y<1024;y++) { matrix[x][y]=matrix[x][y]+1;// Code is run 1048576 times for this thread } } // Parallel kernel code kernelvoidMatrixIncrement(globalint**matrix) { intx=get_global_id(0); inty=get_global_id(1); matrix[x][y]=matrix[x][y]+1; // Code is run once for this thread }
Problems • Low level C/C++ Library • A lot of overhead code • Things can and will go wrong
Ease of OpenCL application development High Level Frameworks Increase ease of application development Tools: Debuggers, Profilers, Middleware/Library: Video, Imaging, Math/Sciences, Physics Wrappers C++, C#, Java, Python, Javascript OpenCL C Library Drivers and Hardware, CPU’s, GPU’s, Cell Processors, FPGA’s
High Level Frameworks • Research has been done in: • Scheduling multiple kernels on device • Overlapping memory transfers with kernel execution • Load balancing • Distribution over GPU’s on the grid • Task scheduling
Dataflow Processing Framework • In a nutshell: • Based on ideas of different research • Increase the ease of development • Uses the dataflow concept • Simplicity • Asynchronous overlapped data transfers and kernel executions
Conceptual Example Input B Input A Legend: Async Process Async memory xfer CPU Process GPU Process Data Dependency Data Output 1 2 3 4 5 6 Output A Output B
Programming with the Framework • Programmer defines a number of processes and data • The process uses a OpenCL kernel or a standard C/C++ function • User defines the arguments of the kernel with the defined data • These processes compute on user selected device: CPU/ GPU/FPGA… etc • Signal the framework to run
Programming Example Framework ProcessingFrameworkpf; ProcessingComponentone,two,three,four; DeviceMemoryArrayA,ArrayB,Output; ArrayA=pf.CreateInputMemory(mem_size); ArrayB=pf.CreateInputMemory(mem_size); Output=pf.CreateOutputMemory(mem_size); one=pf.CreateAPC(pf.GPUDevice(),"Sort"); two=pf.CreateAPC(pf.CPUDevice(),"Sort"); three=pf.CreateAPC(pf.GPUDevice(),"Filter"); four=pf.CreateAPC(pf.CPUDevice(),"Search"); one.SetArg(0,ArrayA); one.SetWorkSize(arr_size); two.SetArg(0,ArrayB); two.SetWorkSize(arr_size); three.SetDependency(one,0,ArrayA); three.SetDependency(two,1,ArrayB); three.SeWorkSize(arr_size); four.SetDependency(three,0,ArrayA); four.SetArg(1,Output); four.SetWorkSize(arr_size); pf.Run(); Array A Array B 1 Sort 2 Sort 3 Filter 4 Search Output