
Introduction to Neural Networks

Immerse yourself in the world of Recurrent Neural Networks (RNNs) with this guide covering RNN fundamentals, Backpropagation Through Time (BPTT), gradient computation, and Bidirectional RNNs (BRNNs), along with notes on learning algorithms, expressive power, and 2D-RNNs.


Presentation Transcript


  1. Introduction to Neural Networks Gianluca Pollastri, Head of Lab School of Computer Science and Informatics and Complex and Adaptive Systems Labs University College Dublin gianluca.pollastri@ucd.ie

  2. Credits • Geoffrey Hinton, University of Toronto. • borrowed some of his slides for “Neural Networks” and “Computation in Neural Networks” courses. • Paolo Frasconi, University of Florence. • This guy taught me Neural Networks in the first place (*and* I borrowed some of his slides too!).

  3. Recurrent Neural Networks (RNN) • One of the earliest versions: Jeffrey Elman, 1990, Cognitive Science. • Problem: it isn’t easy to represent time with Feedforward Neural Nets: usually time is represented with space. • Attempt to design networks with memory.

  4. RNNs • The idea is having discrete time steps, and considering the hidden layer at time t-1 as an input at time t. • This effectively removes cycles: we can model the network using an FFNN, and model memory explicitly.
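A minimal sketch of this idea (illustrative NumPy, not code from the lecture): the hidden state computed at step t-1 is fed back in as just another input at step t, so each step is an ordinary feed-forward computation.

    import numpy as np

    def rnn_step(x_prev, i_t, W_x, W_i, b):
        # Hidden state at time t, computed from the hidden state at t-1
        # (the "delayed" input) and the current input i_t.
        return np.tanh(W_x @ x_prev + W_i @ i_t + b)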

  5. Figure: a recurrent network with input It, hidden (memory) state Xt and output Ot; d marks a delay element, which feeds Xt back into the network at the next time step.

  6. BPTT • BackPropagation Through Time. • If Ot is the output at time t, It the input at time t, and Xt the memory (hidden state) at time t, we can model the dependencies as follows: Xt = f( Xt-1 , It ), Ot = g( Xt , It ).

  7. BPTT • We can model both f() and g() with (possibly multilayered) networks. • We can transform the recurrent network by unrolling it in time. • Backpropagation works on any DAG. An RNN becomes one once it’s unrolled.
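As a rough illustration (same hypothetical NumPy setting as above, with the step function repeated so the snippet stands alone): unrolling means applying f once per time step, and the resulting computation is a plain feed-forward DAG to which standard backpropagation applies.

    import numpy as np

    def rnn_step(x_prev, i_t, W_x, W_i, b):
        # f(): new hidden state from the previous hidden state and current input
        return np.tanh(W_x @ x_prev + W_i @ i_t + b)

    def unroll(inputs, W_x, W_i, b, W_o, b_o):
        # Forward pass of the unrolled network; inputs is the list I_1..I_T.
        x = np.zeros(W_x.shape[0])          # X_0 = 0
        states, outputs = [x], []
        for i_t in inputs:                  # one copy of f() and g() per step
            x = rnn_step(x, i_t, W_x, W_i, b)
            states.append(x)
            outputs.append(W_o @ x + b_o)   # g(): output at time t
        return states, outputs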

  8. Figure: the same delay-element network as slide 5, shown again before unrolling.

  9. Figure: the network unrolled in time — copies of the input It, hidden state Xt and output Ot for steps t-2 through t+2.

  10. Gradient in BPTT
  • GRADIENT(I, T) {
  •   # I = inputs, T = targets
  •   n := size(I);
  •   X0 := 0;
  •   for t := 1..n
  •     Xt := f( Xt-1 , It );
  •   for t := 1..n {
  •     Ot := g( Xt , It );
  •     g.gradient( Ot - Tt );
  •     δt := g.deltas( Ot - Tt );
  •   }
  •   for t := n..1 {
  •     f.gradient( δt );
  •     δt-1 += f.deltas( δt );
  •   }
  • }
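A self-contained NumPy sketch of the GRADIENT procedure above, under assumptions not stated on the slide: f() is a single tanh layer, g() is a linear layer depending only on Xt (for brevity), and the loss is squared error; all weight names are illustrative.

    import numpy as np

    def bptt_gradient(I, D, W_x, W_i, b, W_o, b_o):
        # I: list of input vectors I_1..I_T; D: list of target vectors D_1..D_T
        T = len(I)
        n_h = W_x.shape[0]

        # forward pass: X_t = f(X_{t-1}, I_t), O_t = g(X_t)
        X = [np.zeros(n_h)]
        O = []
        for t in range(T):
            X.append(np.tanh(W_x @ X[t] + W_i @ I[t] + b))
            O.append(W_o @ X[t + 1] + b_o)

        # gradient accumulators
        gW_x = np.zeros_like(W_x); gW_i = np.zeros_like(W_i); gb = np.zeros_like(b)
        gW_o = np.zeros_like(W_o); gb_o = np.zeros_like(b_o)
        delta = [np.zeros(n_h) for _ in range(T + 1)]   # delta[t] = dE/dX_t

        # output pass: g's weight gradients, plus deltas pushed back to X_t
        for t in range(T):
            err = O[t] - D[t]                           # O_t - T_t
            gW_o += np.outer(err, X[t + 1]); gb_o += err
            delta[t + 1] += W_o.T @ err

        # backward pass through time: delta_{t-1} += f.deltas(delta_t)
        for t in range(T, 0, -1):
            d = delta[t] * (1.0 - X[t] ** 2)            # through tanh
            gW_x += np.outer(d, X[t - 1]); gW_i += np.outer(d, I[t - 1]); gb += d
            delta[t - 1] += W_x.T @ d

        return gW_x, gW_i, gb, gW_o, gb_o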

  11.–23. Figures: a sequence of animation frames showing the same labels (inputs It-2..It+2, hidden states Xt-2..Xt+2, outputs Ot-2..Ot+2) being progressively rearranged into the unrolled, feed-forward (DAG) form of the network.

  24. What I will talk about • Neurons • Multi-Layered Neural Networks: • Basic learning algorithm • Expressive power • Classification • How can we *actually* train Neural Networks: • Speeding up training • Learning just right (not too little, not too much) • Figuring out you got it right • Feed-back networks? • Anecdotes on real feed-back networks (Hopfield Nets, Boltzmann Machines) • Recurrent Neural Networks • Bidirectional RNN • 2D-RNN • Concluding remarks

  25. Bidirectional Recurrent Neural Networks (BRNN)

  26. BRNN
  • Ft = φ( Ft-1 , Ut )
  • Bt = β( Bt+1 , Ut )
  • Yt = η( Ft , Bt , Ut )
  • φ(), β() and η() are realised with NNs
  • φ(), β() and η() are independent of t: stationary

  27.–29. (The same BRNN equations, repeated as animation steps.)

  30. Inference in BRNNs
  • FORWARD(U) {
  •   T ← size(U);
  •   F0 ← BT+1 ← 0;
  •   for t ← 1..T
  •     Ft = φ( Ft-1 , Ut );
  •   for t ← T..1
  •     Bt = β( Bt+1 , Ut );
  •   for t ← 1..T
  •     Yt = η( Ft , Bt , Ut );
  •   return Y;
  • }
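A rough NumPy sketch of FORWARD() above, with φ(), β() and η() realised as single tanh and linear layers (an assumption for illustration; the slides only say they are neural networks) and all weight names invented here.

    import numpy as np

    def brnn_forward(U, Wf, Vf, Wb, Vb, Wy_f, Wy_b, Wy_u):
        # U: list of input vectors U_1..U_T
        T = len(U)
        nf, nb = Wf.shape[0], Wb.shape[0]

        F = [np.zeros(nf) for _ in range(T + 1)]   # F_0 = 0
        B = [np.zeros(nb) for _ in range(T + 2)]   # B_{T+1} = 0

        for t in range(1, T + 1):                  # forward chain, left to right
            F[t] = np.tanh(Wf @ F[t - 1] + Vf @ U[t - 1])
        for t in range(T, 0, -1):                  # backward chain, right to left
            B[t] = np.tanh(Wb @ B[t + 1] + Vb @ U[t - 1])

        # each output depends on the past (F), the future (B) and the input (U)
        Y = [Wy_f @ F[t] + Wy_b @ B[t] + Wy_u @ U[t - 1] for t in range(1, T + 1)]
        return Y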

  31. Learning in BRNNs
  • GRADIENT(U, D) {
  •   # U = inputs, D = targets
  •   T ← size(U);
  •   F0 ← BT+1 ← 0;
  •   for t ← 1..T
  •     Ft = φ( Ft-1 , Ut );
  •   for t ← T..1
  •     Bt = β( Bt+1 , Ut );
  •   for t ← 1..T {
  •     Yt = η( Ft , Bt , Ut );
  •     [δFt, δBt] = η.backprop&gradient( Yt - Dt );
  •   }
  •   for t ← T..1
  •     δFt-1 += φ.backprop&gradient( δFt );
  •   for t ← 1..T
  •     δBt+1 += β.backprop&gradient( δBt );
  • }
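A matching sketch of GRADIENT() in the same illustrative setting as the forward pass above (tanh chains, linear η, squared-error loss, invented weight names). The structural point it shows: output deltas are propagated along the forward chain in decreasing t and along the backward chain in increasing t.

    import numpy as np

    def brnn_gradient(U, D, Wf, Vf, Wb, Vb, Wy_f, Wy_b, Wy_u):
        # U: inputs U_1..U_T; D: targets D_1..D_T
        T = len(U)
        nf, nb = Wf.shape[0], Wb.shape[0]

        # forward inference, as in the previous sketch
        F = [np.zeros(nf) for _ in range(T + 1)]
        B = [np.zeros(nb) for _ in range(T + 2)]
        for t in range(1, T + 1):
            F[t] = np.tanh(Wf @ F[t - 1] + Vf @ U[t - 1])
        for t in range(T, 0, -1):
            B[t] = np.tanh(Wb @ B[t + 1] + Vb @ U[t - 1])

        gWy_f = np.zeros_like(Wy_f); gWy_b = np.zeros_like(Wy_b); gWy_u = np.zeros_like(Wy_u)
        gWf = np.zeros_like(Wf); gVf = np.zeros_like(Vf)
        gWb = np.zeros_like(Wb); gVb = np.zeros_like(Vb)
        dF = [np.zeros(nf) for _ in range(T + 1)]
        dB = [np.zeros(nb) for _ in range(T + 2)]

        # output errors: eta()'s weight gradients plus deltas sent into both chains
        for t in range(1, T + 1):
            err = (Wy_f @ F[t] + Wy_b @ B[t] + Wy_u @ U[t - 1]) - D[t - 1]
            gWy_f += np.outer(err, F[t]); gWy_b += np.outer(err, B[t]); gWy_u += np.outer(err, U[t - 1])
            dF[t] += Wy_f.T @ err
            dB[t] += Wy_b.T @ err

        # propagate along the forward chain, decreasing t
        for t in range(T, 0, -1):
            d = dF[t] * (1.0 - F[t] ** 2)
            gWf += np.outer(d, F[t - 1]); gVf += np.outer(d, U[t - 1])
            dF[t - 1] += Wf.T @ d

        # propagate along the backward chain, increasing t
        for t in range(1, T + 1):
            d = dB[t] * (1.0 - B[t] ** 2)
            gWb += np.outer(d, B[t + 1]); gVb += np.outer(d, U[t - 1])
            dB[t + 1] += Wb.T @ d

        return gWf, gVf, gWb, gVb, gWy_f, gWy_b, gWy_u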

  32. What I will talk about • Neurons • Multi-Layered Neural Networks: • Basic learning algorithm • Expressive power • Classification • How can we *actually* train Neural Networks: • Speeding up training • Learning just right (not too little, not too much) • Figuring out you got it right • Feed-back networks? • Anecdotes on real feed-back networks (Hopfield Nets, Boltzmann Machines) • Recurrent Neural Networks • Bidirectional RNN • 2D-RNN • Concluding remarks

  33. 2D RNNs • Pollastri & Baldi 2002, Bioinformatics • Baldi & Pollastri 2003, JMLR

  34.–39. Figures illustrating the 2D-RNN architecture (no slide text beyond the title).
