WK1 - Introduction

WK1 - Introduction. CS 476: Networks of Neural Computation. WK1 – Introduction. Dr. Stathis Kasderidis, Dept. of Computer Science, University of Crete, Spring Semester, 2009. Contents: Course structure and details; Basic ideas of Neural Networks; Historical development of Neural Networks


Presentation Transcript


  1. WK1 - Introduction CS 476: Networks of Neural Computation WK1 – Introduction Dr. Stathis Kasderidis Dept. of Computer Science University of Crete Spring Semester, 2009

  2. Contents • Course structure and details • Basic ideas of Neural Networks • Historical development of Neural Networks • Types of learning • Optimisation techniques and the LMS method • Conclusions Contents

  3. Course Details • Duration: 13 weeks (2 Feb – 15 May 2009) • Lecturer: Stathis Kasderidis • E-mail: stathis@ics.forth.gr • Meetings: By arrangement via e-mail. • Assistants: Farmaki, Fasoulakis • Hours: • Every Tue 11 am-1 pm and Wed 11 am-1 pm. • Laboratory on Fri 11 am-1 pm. Course

  4. Course Timetable • WK1 (3/5 Feb): Introduction • WK2 (10/12 Feb): Perceptron • WK3 (17/19 Feb): Multi-layer Perceptron • WK4 (24/26 Feb): Radial Basis Networks • WK5 (3/5 Mar): Recurrent Networks • WK6 (10/12 Mar): Self-Organising Networks • WK7 (17/19 Mar): Hebbian Learning • WK8 (24/26 Mar): Hopfield Networks • WK9 (31 Mar/2 Apr): Principal Component Analysis Course

  5. Course Timetable (Cont) • WK10 (7/9 Apr): Support Vector Machines • WK11 (28/30 Apr): Stochastic Networks • WK12 (5/7 May): Student Projects’ Presentation • WK13 (12/14 May): Exams Preparation • Every week: • 3hrs Theory • 1hr Demonstration • 19 Mar 2009: Written mid-term exams (optional) Course

  6. Course Timetable (Cont) • Lab sessions will take place every Friday 11 am-1 pm. In lab sessions, you will be examined on written assignments and you can get help between assignments. • There will be four assignments during the term on the following dates: • Fri 6 Mar (Ass1 – Perceptron / MLP / RBF) • Fri 20 Mar (Ass2 – Recurrent / Self-organising) • Fri 3 Apr (Ass3 – Hebbian / Hopfield) • Fri 8 May (Ass4 – PCA/SVM/Stochastic)

  7. Course Structure • Final grade is divided: • Laboratory attendance (20%) • Obligatory! • Course project (40%) • Starts at WK2. Presentation at WK12. • Teams of 2-4 people depending on class size. Selection from a set of offered projects. • Theory. Best of: • Final Theory Exams (40%) or • Final Theory Exams (25%) + Mid-term exams (15%) Course

  8. Project Problems • Problem categories: • Time Series Prediction (Financial Series?) • Color Segmentation with Self-Organising Networks • Robotic Arm Control with Self-Organising Networks • Pattern Classification (Geometric Shapes) • Cognitive Modeling (ALCOVE model) Course

  9. Suggested Tools • Tools: • MATLAB (+ Neural Networks Toolbox). Can be slow in large problems! • TLearn: http://crl.ucsd.edu/innate/tlearn.html • Any C/C++ compiler • Avoid Java and other interpreted languages! Too slow! Course

  10. What are Neural Networks? • Models inspired by real nervous systems • They have a mathematical and computational formulation • Very general modelling tools • A different approach from Symbolic AI (Connectionism) • Many paradigms exist, but they are based on common ideas • A type of graphical model • Used in many scientific and technological areas, e.g. Basic Ideas

  11. What are Neural Networks? (Cont.) Basic Ideas

  12. What are Neural Networks? (Cont. 2) • NNs & Physics: e.g. Spin Glasses • NNs & Mathematics: e.g. Random Fields • NNs & Philosophy: e.g. Theory of Mind, Consciousness • NNs & Cognitive Science: e.g. Connectionist Models of High-Level Functions (Memory, Language, etc) • NNs & Engineering: e.g. Control, Hybrid Systems, A-Life • NNs & Neuroscience: e.g. Channel dynamics, Compartmental models Basic Ideas

  13. What are Neural Networks? (Cont. 3) • NNs & Finance: e.g. Agent-based models of markets • NNs & Social Science: e.g. Artificial Societies Basic Ideas

  14. General Characteristics I • What do they look like? Basic Ideas

  15. General Characteristics II • Node details: • $Y = f(\mathrm{Act})$ • f is called the transfer function • $\mathrm{Act} = \sum_i X_i W_i - B$ • B is called the bias • $W_i$ are called the weights Basic Ideas
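The node computation on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the course's reference code: the function names `neuron_output` and `step` are hypothetical, and the hard-threshold transfer function is an assumed choice of f.

```python
def neuron_output(inputs, weights, bias, transfer):
    """Return Y = f(Act), where Act = sum_i X_i * W_i - B."""
    if len(inputs) != len(weights):
        raise ValueError("inputs and weights must have the same length")
    activation = sum(x * w for x, w in zip(inputs, weights)) - bias
    return transfer(activation)


def step(act):
    """Hard-threshold transfer function (an assumed choice of f)."""
    return 1.0 if act >= 0.0 else 0.0


# Example: with weights (1, 1) and bias 1.5 the node behaves like a logical AND.
y_both_on = neuron_output([1.0, 1.0], [1.0, 1.0], 1.5, step)  # Act = 0.5
y_one_on = neuron_output([1.0, 0.0], [1.0, 1.0], 1.5, step)   # Act = -0.5
```

Passing the transfer function as an argument keeps the weighted-sum step separate from the choice of f, which varies between network types.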

  16. General Characteristics III • Form of transfer function: Basic Ideas
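As an illustration of what a transfer function can look like, the sketch below defines four commonly used forms. Which forms the slide actually showed is an assumption here, and all function names are hypothetical.

```python
import math


def threshold(act):
    """Hard limiter: outputs 1 once the activation reaches zero."""
    return 1.0 if act >= 0.0 else 0.0


def piecewise_linear(act):
    """Linear between -0.5 and 0.5, saturating at 0 and 1 outside that range."""
    return max(0.0, min(1.0, act + 0.5))


def logistic(act, slope=1.0):
    """Sigmoid with values in (0, 1); written to avoid overflow for large |act|."""
    z = slope * act
    if z >= 0.0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def hyperbolic_tangent(act):
    """Sigmoid-shaped function with values in (-1, 1)."""
    return math.tanh(act)
```

The smooth forms (logistic, tanh) are differentiable, which matters for gradient-based training; the threshold form is not.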

  17. General Characteristics IV • Network Specification: • Number of neurons • Topology of connections (Recurrent, Feedforward, etc) • Transfer function(s) • Input types (representation: symbols, etc) • Output types (representation: as above) • Weight parameters, W • Other (weights initialisation, Cost function, training criteria, etc) Basic Ideas

  18. General Characteristics V • Processing Modes: • Recall • “Learning” Basic Ideas

  19. General Characteristics VI • Common properties of all Neural Networks: • Distributed representations • Graceful degradation under damage • Noise robustness • Non-linear mappings • Generalisation and prototype extraction • Allow access to memory by content (content-addressable memory) • Can work with incomplete input Basic Ideas

  20. Historical Development of Neural Networks • History in brief: • McCulloch-Pitts, 1943: Digital Neurons • Hebb, 1949: Synaptic plasticity • Rosenblatt, 1958: Perceptron • Minsky & Papert, 1969: Perceptron Critique • Kohonen, 1978: Self-Organising Maps • Hopfield, 1982: Associative Memory • Rumelhart & McClelland, 1986: Back-Prop algorithm • Many people, 1985-today: EXPLOSION! History

  21. What is Learning in NN? Def: “Learning is a process by which the free parameters of a neural network are adapted through a process of stimulation by the environment in which the network is embedded. The type of learning is determined by the manner in which the parameter changes take place” [Mendel & McClaren (1970)] Learning

  22. Learning Sequence • The network is stimulated by the environment; • The network undergoes changes in its free parameters as a result of this stimulation; • The network responds in a new way to the environment because of the changes that have occurred in its internal structure. Learning
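The three steps above can be sketched with a single linear neuron trained by an LMS-style update (the LMS method is named in the course contents). This is a minimal sketch under stated assumptions: the names `respond` and `learn`, the learning rate, the epoch count, and the toy data are all hypothetical.

```python
def respond(weights, x):
    """Response of a linear neuron to the stimulus x."""
    return sum(w * xi for w, xi in zip(weights, x))


def learn(samples, eta=0.1, epochs=200):
    """Repeat: stimulate the neuron, adapt its free parameters, respond anew."""
    weights = [0.0] * len(samples[0][0])
    for _ in range(epochs):
        for x, desired in samples:
            error = desired - respond(weights, x)  # stimulation by the environment
            weights = [w + eta * error * xi        # change in the free parameters
                       for w, xi in zip(weights, x)]
    return weights


# Toy environment generated by the hidden rule d = 2*x1 - x2 (an assumption).
samples = [([1.0, 0.0], 2.0), ([0.0, 1.0], -1.0), ([1.0, 1.0], 1.0)]
before = respond([0.0, 0.0], [1.0, 0.0])
trained = learn(samples)
after = respond(trained, [1.0, 0.0])  # the neuron now responds in a new way
```

Before training the neuron answers 0 to every stimulus; after training its response to the same stimulus is close to the desired value.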

  23. Learning Criteria • Sum squared error • Mean square error • χ² statistic • Mutual information • Entropy • Other (e.g. Dot product – ‘similarity’) Learning
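Three of the criteria listed above are simple enough to write out directly. The sketch below is illustrative only; the function names are hypothetical, and targets and outputs are assumed to be equal-length lists of numbers.

```python
def sum_squared_error(targets, outputs):
    """Sum over all examples of (target - output)^2."""
    return sum((d - y) ** 2 for d, y in zip(targets, outputs))


def mean_squared_error(targets, outputs):
    """Sum squared error divided by the number of examples."""
    return sum_squared_error(targets, outputs) / len(targets)


def dot_product_similarity(a, b):
    """Dot product of two vectors; larger values mean more similar directions."""
    return sum(ai * bi for ai, bi in zip(a, b))


targets = [1.0, 0.0, 1.0]
outputs = [0.5, 0.5, 1.0]
sse = sum_squared_error(targets, outputs)   # 0.25 + 0.25 + 0.0
mse = mean_squared_error(targets, outputs)  # sse / 3
```

The error-based criteria are minimised during training, whereas a similarity measure such as the dot product is maximised.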

  24. Learning Paradigms • Learning with a teacher (supervised learning) • Learning without a teacher • Reinforcement learning • Unsupervised learning (self-organisation) Learning

  25. Families of Learning Algorithms • Error-based learning • $\Delta w_{kj}(n) = \eta\, e_k(n)\, x_j(n)$ (Delta rule) • Memory-based learning (??) • 1-Nearest Neighbour • K-Nearest Neighbours • Hebbian learning • $\Delta w_{kj}(n) = \eta\, y_k(n)\, x_j(n)$ • $\Delta w_{kj}(n) = F(y_k(n), x_j(n))$ (more general case) • Competitive learning • $\Delta w_{ij}(n+1) = \eta\,(x_j(n) - w_{ij}(n))$ Learning
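The delta, Hebbian, and competitive rules listed above differ only in what drives the weight change. The sketch below writes each as one update step; `eta` stands for the learning-rate symbol in the formulas. The function names are hypothetical, the delta rule is shown for a single linear neuron, and picking the competitive winner by smallest Euclidean distance is an assumption.

```python
def delta_rule(weights, x, desired, eta):
    """Error-based step: the change is eta * error * input."""
    y = sum(w * xj for w, xj in zip(weights, x))
    error = desired - y
    return [w + eta * error * xj for w, xj in zip(weights, x)]


def hebbian_rule(weights, x, y, eta):
    """Hebbian step: the change is eta * output * input (no error signal)."""
    return [w + eta * y * xj for w, xj in zip(weights, x)]


def competitive_rule(weight_rows, x, eta):
    """Competitive step: only the winning neuron moves its weights toward x."""
    def squared_distance(row):
        return sum((xj - w) ** 2 for xj, w in zip(x, row))

    winner = min(range(len(weight_rows)),
                 key=lambda i: squared_distance(weight_rows[i]))
    updated = [row[:] for row in weight_rows]  # copy so the input is not mutated
    updated[winner] = [w + eta * (xj - w) for w, xj in zip(updated[winner], x)]
    return winner, updated
```

For example, `competitive_rule([[0.0, 0.0], [1.0, 1.0]], [0.9, 0.9], 0.5)` selects the second neuron and moves its weights halfway toward the input, leaving the first neuron unchanged.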

  26. Families of Learning Algorithms II • Stochastic Networks • Boltzmann learning • $\Delta w_{kj}(n) = \eta\,(\rho_{kj}^{+}(n) - \rho_{kj}^{-}(n))$ • ($\rho_{kj}^{\pm}$ = average correlation of the states of neurons k, j) Learning
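The Boltzmann update above needs two correlation matrices, conventionally estimated with the visible units clamped (+) and with the network running freely (-). The sketch below covers only the averaging and the weight update; how the state samples are drawn is left out, and the function names and ±1 state encoding are assumptions.

```python
def average_correlation(states):
    """Average of s_k * s_j over a list of sampled state vectors."""
    n = len(states[0])
    count = len(states)
    return [[sum(s[k] * s[j] for s in states) / count for j in range(n)]
            for k in range(n)]


def boltzmann_update(weights, rho_plus, rho_minus, eta):
    """Apply the change eta * (rho_plus - rho_minus) to every weight."""
    return [[weights[k][j] + eta * (rho_plus[k][j] - rho_minus[k][j])
             for j in range(len(weights[k]))]
            for k in range(len(weights))]


# Hypothetical samples: two neurons, states encoded as +1 / -1.
clamped_samples = [[1, 1], [1, 1]]
free_samples = [[1, 1], [1, -1]]
rho_plus = average_correlation(clamped_samples)
rho_minus = average_correlation(free_samples)
new_weights = boltzmann_update([[0.0, 0.0], [0.0, 0.0]], rho_plus, rho_minus, 0.1)
```

Learning stops when the two correlation matrices agree, that is, when the free-running network reproduces the statistics seen in the clamped condition.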

  27. Learning Tasks • Function approximation • Association • Auto-association • Hetero-association • Pattern recognition • Control • Filtering Learning

  28. Credit Assignment Problem • Def: It is the problem of assigning credit or blame to states that lead to useful / harmful outcomes • Temporal Credit Assignment Problem: Find which actions in a period $q = [t-T, t]$ led to a useful outcome at time t and credit these actions, i.e. • $\mathrm{Outcome}(t) = f(\mathrm{Actions}(q))$ • Structural Credit Assignment Problem: Find which states at time t lead to useful actions at time t, i.e. • $\mathrm{Actions}(t) = g(\mathrm{State}(t))$ Learning

  29. Statistical Nature of the Learning Process • Assume that a set of examples is given: $T = \{(x_i, d_i)\}_{i=1}^{N}$ • Assume that a statistical model of the generating process is given (regression equation): $D = f(X) + \varepsilon$ • Where X is a vector random variable (independent variable), D is a scalar random variable (dependent) and $\varepsilon$ is a random variable with the following properties: $E[\varepsilon \mid x] = 0$ and $E[\varepsilon\, f(X)] = 0$ Bias / Var

  30. Statistical Nature of the Learning Process II • The first property says that $\varepsilon$ has zero mean given any realisation of X • The second property says that $\varepsilon$ is uncorrelated with the regression function f(X) (principle of orthogonality) • $\varepsilon$ is called the intrinsic error • Assume that the neural network describes an “approximation” to the regression function, which is: $Y = F(X, w)$ Bias / Var

  31. Statistical Nature of the Learning Process III • The weight vector w is obtained by minimising the cost function: • We can re-write this, using expectation operators, as: • … (after some algebra we get) …. Bias / Var

  32. Statistical Nature of the Learning Process IV • Thus to obtain w we need to optimise the function: • … (after some more algebra!) …. Bias / Var

  33. Statistical Nature of the Learning Process V • B(w) is called bias (or approximation error) • V(w) is called variance (or estimation error) • The last relation shows the bias-variance dilemma: • “We cannot minimise both bias and variance at the same time for a finite set, T. Only when N → ∞ do both become zero” • Bias measures the “goodness” of our functional form in approximating the true regression function f(x) • Variance measures the amount of information present in the data set T which is used for estimating F(x,w) Bias / Var
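For reference, the relation between B(w) and V(w) can be written out as the standard bias-variance decomposition. This is a sketch in assumed notation rather than the slide's own: $E_T$ denotes an average over training sets T of size N, $F(x, T)$ is the network trained on T, and $L_{av}$ is a hypothetical name for the averaged squared distance to the regression function $E[D \mid X = x]$.

```latex
\begin{align}
L_{av} &= E_T\!\left[\big(E[D \mid X = x] - F(x, T)\big)^2\right] \\
       &= \underbrace{\big(E_T[F(x, T)] - E[D \mid X = x]\big)^2}_{B^2(w)}
        + \underbrace{E_T\!\left[\big(F(x, T) - E_T[F(x, T)]\big)^2\right]}_{V(w)}
\end{align}
```

The second line follows by adding and subtracting $E_T[F(x, T)]$ inside the square; the cross term vanishes because $E_T\big[F(x, T) - E_T[F(x, T)]\big] = 0$.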

  34. Comments I • We should distinguish Artificial NNs from bio-physical neural models (e.g. Blue Brain Project); • Some NNs are Universal Approximators, e.g. feed-forward models are based on the Kolmogorov Theorem • Can be combined with other methods, e.g. Neuro-Fuzzy Systems • Flexible modeling tools for: • Function approximation • Pattern Classification • Association • Other Conclusions

  35. Comments II • Advantages: • Distributed representation allows co-activation of categories • Graceful degradation • Robustness to noise • Automatic generalisation (of categories, etc) Conclusions

  36. Comments III • Disadvantages: • They cannot explain their function due to distributed representations • We cannot add existing knowledge to neural networks as rules • We cannot extract rules • Network parameters found by trial and error (in general case) Conclusions
