Opening Keynote at GTC 2015: Leaps in Visual Computing

NVIDIA CEO and co-founder Jen-Hsun Huang took the stage at the GPU Technology Conference in the San Jose Convention Center on March 17, 2015, to present several major announcements. The keynote covers how NVIDIA is innovating in deep learning, what NVIDIA DRIVE PX can do for automakers, and where Pascal, the next-generation GPU architecture, fits in the new performance roadmap.

Presentation Transcript


  1. LEAPS IN VISUAL COMPUTING. JEN-HSUN HUANG, CO-FOUNDER & CEO | GTC 2015

  2. FOUR ANNOUNCEMENTS: A Very Fast Box and Deep Learning; Self-Driving Cars and Deep Learning; A New GPU and Deep Learning; Roadmap Reveal and Deep Learning

  3. AMAZING YEAR IN VISUAL COMPUTING © 2015 Industrial Light & Magic. All Rights Reserved.

  4. 10X GROWTH IN GPU COMPUTING. 2008: 150,000 CUDA Downloads; 27 CUDA Apps; 60 Universities Teaching; 4,000 Academic Papers; 6,000 Tesla GPUs; 77 Supercomputing Teraflops

  10. 10X GROWTH IN GPU COMPUTING, 2008 to 2015. CUDA Downloads: 150,000 to 3 Million; CUDA Apps: 27 to 319; Universities Teaching: 60 to 800; Academic Papers: 4,000 to 60,000; Tesla GPUs: 6,000 to 450,000; Supercomputing Teraflops: 77 to 54,000

  11. TITAN X: THE WORLD’S FASTEST GPU. 8 Billion Transistors; 3,072 CUDA Cores; 7 TFLOPS SP / 0.2 TFLOPS DP; 12GB Memory

  12. TITAN X FOR DEEP LEARNING: Training AlexNet. [Bar chart: training time in Days (axis 0 to 7, with a break up to 43) for 16-core Xeon CPU; TITAN; TITAN Black cuDNN; TITAN X cuDNN]

  13. TITAN X: THE WORLD’S FASTEST GPU. 8 Billion Transistors; 3,072 CUDA Cores; 7 TFLOPS SP / 0.2 TFLOPS DP; 12GB Memory; $999

  14. FOUR ANNOUNCEMENTS: A Very Fast Box and Deep Learning; Self-Driving Cars and Deep Learning; A New GPU and Deep Learning; Roadmap Reveal and Deep Learning

  15. A SHORT HISTORY OF DEEP LEARNING. Timeline, 1995 to 2015: Convolutional Neural Networks for Handwritten Digit Recognition (LECUN, BOTTOU, BENGIO, HAFFNER, 1998); ImageNet Classification with NVIDIA GPUs (KRIZHEVSKY, HINTON, ET AL., 2012). [Inset chart: Accuracy %, 2010 to 2014; labels: DNN 84%; CV 74%, 72%]

  16. IMAGENET CHALLENGE. “Deep Image: Scaling up Image Recognition” (Baidu: 5.98%, Jan. 13, 2015); “Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification” (Microsoft: 4.94%, Feb. 6, 2015); “Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift” (Google: 4.82%, Feb. 11, 2015). [Chart: Accuracy %, 2010 to 2014; labels: DNN 84%; CV 74%, 72%]

  17. THE BIG BANG

  18. DEEP LEARNING VISUALIZED

  19. GPU-ACCELERATED DEEP LEARNING START-UPS

  20. DEEP LEARNING REVOLUTIONIZING MEDICAL RESEARCH. Predicting the Toxicity of New Drugs (Johannes Kepler University); Detecting Mitosis in Breast Cancer Cells (IDSIA); Understanding Gene Mutation to Prevent Disease (University of Toronto)

  21. “Automated Image Captioning with ConvNets and Recurrent Nets” (Andrej Karpathy, Fei-Fei Li)

  22. DIGITS: DEEP GPU TRAINING SYSTEM FOR DATA SCIENTISTS. Design DNNs; Visualize activations; Manage multiple trainings. [Stack diagram: USER INTERFACE (Process Data, Configure DNN, Monitor Progress, Visualize Layers); Theano, Torch, Caffe; cuDNN, cuBLAS; CUDA; GPU HW (GPU, Multi-GPU, GPU Cluster, Cloud)]

  23. DIGITS: Process Data; Configure DNN; Monitor Progress; Visualize Layers; Test Image

  24. DIGITS DEVBOX: World’s fastest GPU; Max GPU out of a plug; Multi-GPU training & inference

  25. DIGITS DEVBOX: EARLY RESULTS. “DIGITS makes it way easier to design the best network for the job” (Simon Osindero, A.I. Architect). “I’ve never seen AlexNet run this fast…TitanX is a monster, Crazy Fast” (Soumith Chintala, Research Engineer). [Chart: Multi-GPU scaling on Torch for AlexNet and VGG; y-axis 0x to 4x; x-axis 1, 2, 4]

  26. DIGITS DEVBOX: Available May 2015; $15,000

  27. FOUR ANNOUNCEMENTS: A Very Fast Box and Deep Learning; Self-Driving Cars and Deep Learning; A New GPU and Deep Learning; Roadmap Reveal and Deep Learning

  28. GPU ROADMAP: Pascal 2x SGEMM/W. [Chart: SGEMM / W (axis 0 to 72), 2008 to 2018, for Tesla, Fermi, Kepler, Maxwell, Pascal, Volta. Pascal: Mixed Precision, 3D Memory, NVLink]

  29. GPU ROADMAP: Pascal 2.7x Memory Capacity. [Chart: Frame Buffer Capacity (GB) (axis 0 to 60), 2008 to 2018, for Tesla, Fermi, Kepler, Maxwell, Pascal, Volta. Pascal: Mixed Precision, 3D Memory, NVLink]

  30. GPU ROADMAP: Pascal 4x Mixed Precision. [Chart: HGEMM / W (axis 0 to 144), 2008 to 2018, for Tesla, Fermi, Kepler, Maxwell, Pascal, Volta. Pascal: Mixed Precision, 3D Memory, NVLink]

  31. GPU ROADMAP: Pascal 3x Bandwidth. [Chart: STREAM GB/s (axis 0 to 900), 2008 to 2018, for Tesla, Fermi, Kepler, Maxwell, Pascal, Volta. Pascal: Mixed Precision, 3D Memory, NVLink]

  32. PASCAL 10X MAXWELL*. [Diagram of training stages. Forward: CONVOLUTION (compute), FULLY CONNECTED (bandwidth). Backward: FULLY CONNECTED (bandwidth), CONVOLUTION (compute). Then WEIGHT UPDATE (interconnect). Speedup labels as printed: 5x, 2x, 4x (FP16), 6x, 6x, 4x, 10x. Feature labels as printed: Mixed Precision, 3D Memory, 3D Memory, Mixed Precision, NVLINK] * Very rough estimates

  33. FOUR ANNOUNCEMENTS: A Very Fast Box and Deep Learning; Self-Driving Cars and Deep Learning; A New GPU and Deep Learning; Roadmap Reveal and Deep Learning

  34. TODAY’S ADAS: SENSE, PLAN, ACT (WARN, BRAKE). FPGA, CV ASIC, CPU

  35. NEXT-GENERATION ADAS: SENSE, PLAN, ACT (WARN, BRAKE, STEER, ACCELERATE). FPGA, CV ASIC, CPU

  36. NVIDIA DRIVE PX: SELF-DRIVING CAR COMPUTER. SENSE, PLAN, ACT (WARN, BRAKE, STEER, ACCELERATE). FPGA, CV ASIC, CPU, DNN. [Inset chart: IMAGENET CHALLENGE, Accuracy %, 2010 to 2014; labels: DNN 84%; CV 74%, 72%]

  42. PROJECT DAVE: DARPA AUTONOMOUS VEHICLE. DNN-based self-driving robot; Training data by human driver; No hand-coded CV algorithms. PROJECT LEADS: Urs Muller (Chief Architect, Autonomous Driving, NVIDIA); Yann LeCun (Director, AI Research, Facebook). [Inset chart: IMAGENET CHALLENGE, Accuracy %, 2010 to 2014; labels: DNN 84%; CV 74%, 72%]

  43. DAVE IN ACTION

  44. TRAINING DATA 225K Images

  45. TEST DRIVE No Training

  46. TEST DRIVE Partially Trained (52K images)

  47. TEST DRIVE Fully Trained (225K images)

  48. AlexNet on DRIVE PX vs. DAVE. Number of Connections: 630 Million vs. 3.1 Million; Frames / Second: 184 vs. 12; Connections / Second: 116 Billion vs. 38 Million; 3,000x Faster
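
The throughput figures on slide 48 appear to follow from simple arithmetic: connections per second is the number of connections in the network multiplied by the frames processed per second. The sketch below reproduces that calculation from the rounded numbers printed on the slide. The formula is an assumption about how the slide's figures were derived, and the names `SLIDE_48` and `connections_per_second` are hypothetical, introduced only for illustration.

```python
# Figures as printed on slide 48 (already rounded by the presenter).
SLIDE_48 = {
    "AlexNet on DRIVE PX": {"connections": 630_000_000, "frames_per_second": 184},
    "DAVE": {"connections": 3_100_000, "frames_per_second": 12},
}


def connections_per_second(connections, frames_per_second):
    """Assumed relation: every frame evaluates each connection once."""
    return connections * frames_per_second


# Throughput for each system, then the ratio between the two.
rates = {name: connections_per_second(**spec) for name, spec in SLIDE_48.items()}
speedup = rates["AlexNet on DRIVE PX"] / rates["DAVE"]

for name, rate in rates.items():
    print(f"{name}: {rate:,} connections / second")
print(f"Ratio: about {speedup:,.0f}x")
```

With the slide's rounded inputs, the products come to roughly 115.9 billion and 37.2 million connections per second, a ratio of a little over 3,100x. That is consistent with the slide's 116 Billion and "3,000x Faster" claims. The slide's 38 Million for DAVE is slightly higher than this product, presumably because the printed inputs were rounded.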
