This presentation gives an overview of the Apache Singa AI project. It explains Apache Singa in terms of its architecture, distributed training and functionality.

Links for further information and connecting:
http://www.amazon.com/Michael-Frampton/e/B00NIQDOOM/
https://nz.linkedin.com/pub/mike-frampton/20/630/385
https://open-source-systems.blogspot.com/
What Is Apache Singa AI?
● It is a machine learning system
● Open source under the Apache 2.0 license
● It supports distributed processing
● It supports a wide range of hardware
● It scales well, for instance as its GPU cluster grows
● By using computational graphs it optimizes
  – Training speed / memory footprint
● It offers a simple software stack and Python interface
Singa Related Terms
● Cuda – Parallel computing platform and API model created by Nvidia
● Tensor – A generalized matrix, possibly in multiple dimensions
● ONNX – Open representation format for machine learning models
● MPI – Message Passing Interface
● NCCL – NVIDIA Collective Communications Library
● Autograd – Provides automatic differentiation for all operations on tensors
● DAG – Directed acyclic graph
● SWIG – Simplified Wrapper and Interface Generator
● OpenCL – A framework for writing programs for heterogeneous platforms
● SGD – Stochastic gradient descent
Singa Software Stack
● Divided into two levels
● The Python interface provides
  – Data structures
  – Algorithms for machine learning models
● The backend provides
  – Data structures for deep learning models
  – Hardware abstractions for scheduling/executing operations
  – Communication components for distributed training
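To make the Python level of the stack concrete, the minimal sketch below creates a tensor on a device and runs an operation on it. The module and function names (singa.device, singa.tensor, gaussian, to_numpy) follow the SINGA v3 documentation as I recall it and should be verified against your installed version.

```python
# Minimal sketch of the Singa Python interface (names per the SINGA v3
# docs as recalled; verify against your installed version).
from singa import device, tensor

dev = device.get_default_device()   # the default CppCPU device
x = tensor.Tensor((2, 3), dev)      # a 2x3 tensor placed on that device
x.gaussian(0.0, 1.0)                # fill with N(0, 1) samples in place
y = x + x                           # operations dispatch to the backend
print(tensor.to_numpy(y))           # copy back to the host as NumPy
```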
Singa Computational Graph
● Consider tensors to be edges and operators to be nodes
● Together they form a computational graph, or DAG
● Singa uses this DAG to optimize/schedule
  – Execution of operations
  – Memory allocation/release
● Define a neural net with the Singa Module API (see the sketch below)
● Singa then automatically
  – Constructs the computational graph
  – Optimizes the graph for efficiency
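Here is a hedged sketch of defining a small network with the Module API. The names used (module.Module, autograd.Linear, autograd.relu, opt.SGD, backward_and_update, and the graph() switch) follow the SINGA v3 documentation as I recall it, not a verified listing; check them against the current API.

```python
# Hedged sketch of a small network using the Singa Module API
# (names per the SINGA v3 docs as recalled; verify before use).
from singa import autograd, module, opt

class MLP(module.Module):
    def __init__(self):
        super().__init__()
        self.linear1 = autograd.Linear(784, 128)
        self.linear2 = autograd.Linear(128, 10)
        self.optimizer = opt.SGD(lr=0.05)

    def forward(self, x):                   # traced to build the DAG
        y = autograd.relu(self.linear1(x))
        return self.linear2(y)

    def loss(self, out, ty):
        return autograd.softmax_cross_entropy(out, ty)

    def optim(self, loss):
        self.optimizer.backward_and_update(loss)

model = MLP()
model.graph(True, False)  # enable graph mode (buffer ops, non-sequential)
```

Once graph mode is enabled, Singa buffers the operations issued by forward() and loss() into the DAG instead of executing them eagerly, which is what allows the scheduling and memory optimizations described above.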
Singa Computational Graph Optimization
● Current graph optimizations include
● Lazy allocation
  – Tensor/block memory is allocated at first read
● Automatic recycling
  – A memory block is recycled when its read count hits zero
  – Except for blocks that are used outside the graph
● Memory sharing
  – Allows sharing of memory among tensors
  – Optimizes memory operations
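The sketch below is a conceptual illustration only, not Singa's actual implementation: it shows how lazy allocation and read-count recycling can be expressed, with a trivial pool standing in for the device memory manager.

```python
# Conceptual sketch (not Singa's actual code): lazy allocation plus
# read-count recycling for memory blocks scheduled in a DAG.
class Pool:
    """Trivial memory pool that reuses released buffers."""
    def __init__(self):
        self.free = []

    def acquire(self, size):
        return self.free.pop() if self.free else bytearray(size)

    def release(self, buf):
        self.free.append(buf)

class Block:
    def __init__(self, size, reads, external=False):
        self.size = size
        self.reads = reads        # reads scheduled for this block
        self.external = external  # used outside the graph: never recycle
        self.data = None          # lazy: nothing allocated yet

    def read(self, pool):
        if self.data is None:     # lazy allocation at first read
            self.data = pool.acquire(self.size)
        self.reads -= 1
        if self.reads == 0 and not self.external:
            pool.release(self.data)  # recycle after the last read
            self.data = None
```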
Singa Device
● The Device abstraction represents a hardware device
  – It has memory and computing units
  – It schedules all tensor operations
● Singa has three Device implementations
  – CudaGPU for an Nvidia GPU card – runs Cuda code
  – CppCPU for a CPU – runs C++ code
  – OpenclGPU for a GPU card – runs OpenCL code
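A hedged sketch of selecting a device and placing a tensor on it follows. The constructor names (get_default_device, create_cuda_gpu) follow the SINGA docs as I recall them, the OpenCL constructor name is an assumption, and the GPU lines require a build with the matching backend.

```python
# Hedged sketch of Singa device selection (names as recalled from the
# SINGA docs; the OpenCL constructor name is an assumption).
from singa import device, tensor

cpu = device.get_default_device()     # CppCPU: runs C++ code on the host
# gpu = device.create_cuda_gpu()      # CudaGPU: needs a CUDA build
# ocl = device.create_opencl_device() # OpenclGPU (assumed name)

x = tensor.Tensor((4, 4), cpu)
x.set_value(1.0)
# x.to_device(gpu)                    # move the tensor between devices
```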
Singa Distributed Training
● Data-parallel training across multiple GPUs
● Each process/worker runs GPU-based training
● Each process has an individual communication rank
● Training data is partitioned among the workers
● The model is replicated on every worker; in each iteration
  – Each worker reads a mini-batch of data from its partition
  – It runs the back-propagation algorithm to compute the gradients
  – The gradients are averaged via all-reduce (provided by NCCL)
  – The weights are updated following the SGD algorithm
● A setup sketch follows below
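The sketch below shows how this setup can look in code. opt.DistOpt wrapping an SGD optimizer, and its local_rank/global_rank/world_size attributes, follow the SINGA v3 documentation as I recall it; load_partition is a hypothetical helper for reading this worker's share of the data.

```python
# Hedged sketch of Singa data-parallel setup (DistOpt and the rank
# attributes as recalled from the SINGA v3 docs; load_partition is a
# hypothetical data-loading helper).
from singa import device, opt

sgd = opt.SGD(lr=0.05)
sgd = opt.DistOpt(sgd)                # wraps SGD, sets up the NCCL communicator
dev = device.create_cuda_gpu_on(sgd.local_rank)   # one GPU per process

# each worker trains only on its own partition of the data
train_x, train_y = load_partition(sgd.global_rank, sgd.world_size)

# in each iteration the wrapped optimizer all-reduces (averages) the
# gradients across all ranks before applying the SGD weight update
```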
Singa Distributed Training – All-Reduce
● NCCL all-reduce takes the training gradients from each GPU
● It averages the gradients across all ranks
● The averaged result should match the gradient that would be computed
  if all training were carried out on a single GPU (see the NumPy
  illustration below)
● It should be noted that this type of distributed learning
  – Scales well and reduces processing time
  – But does not appear to be GPU failure tolerant
  – Does not provide GPU anonymity
  – i.e. we have to decide how many GPUs we use
  – It is not a subsystem like Mesos, distributing anonymous tasks to workers
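The pure-NumPy illustration below shows why all-reduce averaging reproduces the single-GPU result: for a loss that averages over samples (here least squares), the mean of per-worker mini-batch gradients equals the gradient over the combined batch.

```python
# Why averaged per-worker gradients match the single-GPU gradient,
# demonstrated with a least-squares loss in pure NumPy.
import numpy as np

rng = np.random.default_rng(0)
w = rng.normal(size=3)
X, y = rng.normal(size=(8, 3)), rng.normal(size=8)

def grad(Xb, yb, w):                      # mean-squared-error gradient
    return 2 * Xb.T @ (Xb @ w - yb) / len(yb)

# split the batch across 4 "workers" and average their gradients
parts = [grad(Xb, yb, w)
         for Xb, yb in zip(np.split(X, 4), np.split(y, 4))]
avg = np.mean(parts, axis=0)              # what NCCL all-reduce computes

assert np.allclose(avg, grad(X, y, w))    # matches the single-GPU gradient
```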
Available Books
● "Big Data Made Easy" – Apress, Jan 2015
● "Mastering Apache Spark" – Packt, Oct 2015
● "Complete Guide to Open Source Big Data Stack" – Apress, Jan 2018
● Find the author on Amazon
  – www.amazon.com/Michael-Frampton/e/B00NIQDOOM/
● Connect on LinkedIn
  – www.linkedin.com/in/mike-frampton-38563020
Connect
● Feel free to connect on LinkedIn
  – www.linkedin.com/in/mike-frampton-38563020
● See my open source blog at
  – open-source-systems.blogspot.com/
● I am always interested in
  – New technology
  – Opportunities
  – Technology-based issues
  – Big data integration