4. Function Approximation Theory (Ref. Haykin)

(1) Universal Approximation Theorem (Existence)

Cybenko '88, Funahashi '89, Hornik et al. '89: one hidden layer is sufficient.
Let φ(·) be a nonconstant, bounded, monotone-increasing, continuous function (e.g. a sigmoid; nondecreasing, continuous or discrete/threshold variants appear in the different proofs). Let I_p be the p-dimensional unit hypercube [0,1]^p and C(I_p) the space of continuous functions on I_p. Given any f ∈ C(I_p) and ε > 0, there exist an integer M and constants α_i, b_i, w_ij such that the one-hidden-layer MLP

    F(x_1, …, x_p) = Σ_{i=1}^{M} α_i φ( Σ_{j=1}^{p} w_ij x_j + b_i )

satisfies |F(x_1, …, x_p) − f(x_1, …, x_p)| < ε for all (x_1, …, x_p) ∈ I_p.
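As a concrete illustration (a minimal sketch, not the theorem's construction): below, a one-hidden-layer sigmoid network is fitted to f(x) = sin(2πx) on [0,1] with random hidden units and least-squares output weights α_i, and the sup-norm error shrinks as M grows. The target function, the use of numpy, and all names are illustrative assumptions, not from the slide.

```python
# Minimal sketch of the universal approximation theorem in action:
# F(x) = sum_i alpha_i * sigmoid(w_i x + b_i) fitted to f(x) = sin(2*pi*x).
# Hidden weights are random; only the output weights are solved for.
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fit_one_hidden_layer(f, M, n_samples=200):
    x = np.linspace(0.0, 1.0, n_samples)
    w = rng.normal(scale=20.0, size=M)            # random hidden-layer weights
    b = -w * rng.uniform(0.0, 1.0, size=M)        # biases put each sigmoid's transition in [0,1]
    H = sigmoid(np.outer(x, w) + b)               # (n_samples, M) hidden activations
    a, *_ = np.linalg.lstsq(H, f(x), rcond=None)  # output weights alpha_i by least squares
    return lambda t: sigmoid(np.outer(np.atleast_1d(t), w) + b) @ a

f = lambda x: np.sin(2 * np.pi * x)
for M in (5, 20, 80):
    F = fit_one_hidden_layer(f, M)
    xs = np.linspace(0.0, 1.0, 1000)
    print(M, np.max(np.abs(F(xs) - f(xs))))       # sup-norm error shrinks as M grows
```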
(2) Mean Squared Error (Barron)

    R ≤ O( C_f² / M ) + O( (M·p / N) · log N )

where R = generalization (mean squared) error, N = # of training examples, M = # of hidden neurons, p = input dimension, and C_f = first absolute moment of the Fourier magnitude distribution of f (a measure of the regularity or smoothness of f).

• Large M → smaller approximation error (first term): best approximation.
• Small M/N → smaller estimation error (second term): better empirical fit.
• The two requirements conflict; M may be kept small if C_f is small.

Scalability: the approximation error is independent of the size of the input space and scales as the inverse of the # of hidden nodes for large training sets, so MLPs are well suited to large input-dimension problems.
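A back-of-the-envelope sketch of the conflict, with the O(·) constants dropped and arbitrary illustrative values for C_f, p, N: the first term wants M large, the second wants M/N small, and the bound is minimized at an intermediate M ≈ C_f·sqrt(N / (p·log N)).

```python
# Sketch of Barron's trade-off (constants dropped): the approximation term
# falls as Cf^2 / M while the estimation term grows as (M*p/N) * log N,
# so the bound is minimized at an intermediate M. Cf, p, N are arbitrary.
import numpy as np

Cf, p, N = 10.0, 50, 100_000           # smoothness measure, input dim, training set size
M = np.arange(1, 2001)

approx = Cf**2 / M                     # approximation (bias) term: wants M large
estim = (M * p / N) * np.log(N)        # estimation (variance) term: wants M/N small
bound = approx + estim

M_star = M[np.argmin(bound)]
print("best M:", M_star)               # close to the analytic minimizer below
print("analytic:", Cf * np.sqrt(N / (p * np.log(N))))
```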
(3) Practical Considerations

Funahashi: two hidden layers are more manageable; with a single hidden layer, the interaction between neurons makes learning difficult. (Hidden neurons act as salient feature detectors.)

Sontag: two hidden layers are sufficient to realize f⁻¹ from any f (the inverse problem):

    x(n) = f( x(n-1), u(n) )
    u(n) = f⁻¹( x(n-1), x(n) )
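To make the inverse problem concrete, a toy sketch (the forward dynamics f below are an arbitrary assumed map, not from the slide): simulate f forward, then collect (x(n-1), x(n)) → u(n) pairs as training data for an inverse model such as Sontag's two-hidden-layer MLP.

```python
# Toy illustration of the inverse-problem setup. The forward dynamics f
# are an assumed smooth map (monotone in u, so the inverse exists);
# the inverse model would be trained on (x_prev, x_next) -> u pairs.
import numpy as np

rng = np.random.default_rng(1)

def f(x_prev, u):
    # assumed toy dynamics: next state from previous state and control
    return 0.8 * x_prev + np.tanh(u)

x = 0.0
inputs, targets = [], []
for n in range(10_000):
    u = rng.uniform(-2.0, 2.0)        # random control input
    x_next = f(x, u)
    inputs.append((x, x_next))        # (x(n-1), x(n)) is the inverse model's input
    targets.append(u)                 # u(n) is the inverse model's target
    x = x_next

X, y = np.asarray(inputs), np.asarray(targets)
print(X.shape, y.shape)               # dataset for a two-hidden-layer inverse MLP
```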
[Figure: NN with inputs x(n-1), u(n) and output x(n); Jacobian entries relate outputs y_i to inputs x_i.]

Example: Robot Dynamics and Kinematics
• Dynamics: x = state, u = joint torque
• Kinematics: x = Cartesian position, u = joint position

(4) Network Jacobian: if the NN has learned the function, then it has learned its Jacobian ∂y_i/∂x_j, too (see the sketch below).
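A minimal sketch of the Network Jacobian for an assumed one-hidden-layer sigmoid MLP y = V·σ(Wx + b): the chain-rule Jacobian ∂y/∂x, checked against a finite-difference estimate. Shapes and names are illustrative.

```python
# Jacobian dy/dx of a one-hidden-layer sigmoid MLP, y = V @ sigmoid(W @ x + b),
# via the chain rule, verified against central finite differences.
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def mlp(x, W, b, V):
    return V @ sigmoid(W @ x + b)

def mlp_jacobian(x, W, b, V):
    h = sigmoid(W @ x + b)
    # d sigmoid/dz = sigmoid * (1 - sigmoid); chain rule through W
    return V @ (np.diag(h * (1.0 - h)) @ W)   # shape (out_dim, in_dim)

rng = np.random.default_rng(2)
W, b, V = rng.normal(size=(8, 3)), rng.normal(size=8), rng.normal(size=(2, 8))
x = rng.normal(size=3)

J = mlp_jacobian(x, W, b, V)
eps = 1e-6
J_fd = np.column_stack([(mlp(x + eps * e, W, b, V) - mlp(x - eps * e, W, b, V)) / (2 * eps)
                        for e in np.eye(3)])
print(np.max(np.abs(J - J_fd)))               # ~1e-9: analytic and numeric agree
```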
Students’ Questions

1. Can the FF net learning take place concurrently (or synchronously)?
2. Is M = C_f?
3. What is the mathematical meaning of the step size η?
4. Why do we use the mean squared error instead of the first, third order, etc.?
5. What if more than 2 hidden layers are used? Any problem then?
6. BP becomes simple through the Jacobian, right?
7. What do you mean by dividing the input space into local regions?
8. Where else is the Jacobian used, other than in kinematics?
9. The error can be defined for the global feature by comparing against the desired output. Does each local feature also need a desired output?
BackProp vs. the Network Jacobian:
• In BP, we compute ∂E_p/∂w for every weight w.
• For the Network Jacobian, change the error function from E_p to E_j = y_j and apply BP to x_i as if it were a weight on a constant input of 1; the resulting "weight gradient" is the Jacobian entry ∂y_j/∂x_i.

[Figure: constant input 1 → weight x_i → NN → output y_j]
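A sketch of this trick for the same assumed one-hidden-layer MLP as above: run a BP-style backward pass with the "error" set to E_j = y_j; the quantity BP would report as the gradient of the weight x_i (on a constant input of 1) is exactly ∂y_j/∂x_i.

```python
# BP-style backward pass with "error" E_j = y_j: treating each input x_i
# as a weight on a constant input of 1, its "weight gradient" is dy_j/dx_i,
# i.e. one row of the network Jacobian.
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def jacobian_row_by_bp(x, W, b, V, j):
    # Forward pass, caching activations as ordinary BP would
    h = sigmoid(W @ x + b)
    # Backward pass for E_j = y_j: dE_j/dy is a one-hot vector at output j
    delta_out = np.zeros(V.shape[0]); delta_out[j] = 1.0
    delta_hidden = (V.T @ delta_out) * h * (1.0 - h)
    # "Weight gradient" for x_i (weight on constant input 1) = dy_j/dx_i
    return W.T @ delta_hidden                  # shape (in_dim,)

rng = np.random.default_rng(3)
W, b, V = rng.normal(size=(8, 3)), rng.normal(size=8), rng.normal(size=(2, 8))
x = rng.normal(size=3)
print(jacobian_row_by_bp(x, W, b, V, j=0))     # row 0 of the network Jacobian
```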