This chapter outlines the method of least squares for linear and nonpolynomial regression, including the use of orthogonal systems and Chebyshev polynomials. It explains how to fit data points to a line or a general function using the principle of least squares. Examples and algorithms are provided to illustrate the process.
CSE 551 Computational Methods 2018/2019 Fall Chapter 8
Outline • Method of Least Squares • Orthogonal Systems and Chebyshev Polynomials
References • W. Cheney, D. Kincaid, Numerical Mathematics and Computing, 6th ed., Chapter 12
Method of Least Squares • Linear Least Squares • Linear Example • Nonpolynomial Example • Basis Functions {g0, g1, . . . , gn}
Linear Least Squares • In the experimental, social, and behavioral sciences, an experiment or survey produces a mass of data. • To interpret the data, plot the m + 1 points on a graph.
• Suppose the underlying function is linear; the failure of the points to fall precisely on a straight line is attributed to experimental error. • The task is to determine the correct function y = ax + b: what are the coefficients a and b? • Geometrically: what line most nearly passes through the plotted points?
• Rather than guess at values of a and b, we decide on a specific line to represent the data. • In general, the data points will not fall on the line y = ax + b. • If by chance the kth datum falls on the line, then yk − axk − b = 0; if not, there is a discrepancy or error of magnitude |yk − axk − b|.
• The total absolute error for all m + 1 points is
ϕ(a, b) = Σk=0..m |yk − axk − b|
a function of a and b. • We could choose a and b so that this function assumes its minimum value. • This is an example of ℓ1 approximation and can be solved by linear programming. • Calculus does not work on this function because it is not generally differentiable.
• In practice, it is common to minimize a different error function of a and b:
ϕ(a, b) = Σk=0..m (yk − axk − b)²   (2)
• This choice is suitable on statistical grounds: if the errors follow a normal probability distribution, minimization of ϕ produces a best estimate of a and b. • This is ℓ2 approximation, and calculus can be used on Equation (2).
• The ℓ1 and ℓ2 approximations are specific cases of the ℓp norm of a vector x = [x1, x2, . . . , xn]ᵀ:
||x||p = (Σi=1..n |xi|^p)^(1/p)
• To make ϕ(a, b) a minimum, the partial derivatives of ϕ with respect to a and b must vanish; this is a necessary condition at the minimum.
• Taking derivatives in (2) and setting them equal to zero gives a pair of simultaneous linear equations in the unknowns a and b, the normal equations:
(Σk=0..m xk²) a + (Σk=0..m xk) b = Σk=0..m yk xk
(Σk=0..m xk) a + (m + 1) b = Σk=0..m yk   (3)
• Here Σk=0..m 1 = m + 1, the number of data points. • Set p = Σk=0..m xk, q = Σk=0..m yk, r = Σk=0..m xk yk, and s = Σk=0..m xk². • The system of Equations (3) then becomes:
s a + p b = r
p a + (m + 1) b = q
• Solve by Gaussian elimination, or use Cramer's Rule with the determinant of the coefficient matrix d = (m + 1) s − p², giving
a = [(m + 1) r − p q] / d and b = (s q − p r) / d
• This yields the following algorithm.
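A minimal sketch of this algorithm in Python (the function name least_squares_line is mine; p, q, r, s, d follow the notation above):

    # Least-squares line y = a*x + b for the points (x[k], y[k]), k = 0..m,
    # via the closed-form (Cramer's rule) solution of the normal equations (3).
    def least_squares_line(x, y):
        m1 = len(x)                                # m + 1, number of data points
        p = sum(x)                                 # sum of xk
        q = sum(y)                                 # sum of yk
        r = sum(xk * yk for xk, yk in zip(x, y))   # sum of xk*yk
        s = sum(xk * xk for xk in x)               # sum of xk^2
        d = m1 * s - p * p                         # determinant of the coefficient matrix
        a = (m1 * r - p * q) / d
        b = (s * q - p * r) / d
        return a, b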
Linear Example • EXAMPLE 1 Find the linear least-squares solution for the following table of values: • Plot the original data points and the fitted line using a finer set of grid points.
Solution • The equations in Algorithm 1 lead to a system of two equations whose solution is a = 0.4864 and b = −1.6589. • Substituting into Equation (2) gives the value ϕ(a, b) = 10.7810.
• We determined the equation of a line of the form y = ax + b that fits the data best in the least-squares sense. • With four data points (xi, yi), there are four equations yi = axi + b for i = 1, 2, 3, 4.
• In general, we solve an overdetermined linear system Ax = b, where A is an m × n matrix with m > n. • Its least-squares solution coincides with the solution of the normal equations AᵀAx = Aᵀb, which corresponds to minimizing ||Ax − b||₂².
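A short NumPy sketch of this equivalence (the data here are illustrative assumptions, not the example's table):

    import numpy as np

    # Overdetermined system: four equations yi = a*xi + b, two unknowns.
    x = np.array([0.0, 1.0, 2.0, 3.0])            # illustrative data
    y = np.array([1.1, 1.9, 3.2, 3.8])
    A = np.column_stack([x, np.ones_like(x)])     # m x n design matrix, m > n

    # Solving the normal equations A^T A z = A^T y ...
    z_normal = np.linalg.solve(A.T @ A, A.T @ y)

    # ... coincides with the minimizer of ||A z - y||_2^2:
    z_lstsq, *_ = np.linalg.lstsq(A, y, rcond=None)
    print(z_normal, z_lstsq)                      # same (a, b) up to rounding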
Nonpolynomial Example • The method of least squares is not restricted to linear (first-degree) polynomials or to any specific functional form. • Suppose we want to fit a table of values (xk, yk), k = 0, 1, . . . , m, by a function of the form y = a ln x + b cos x + c eˣ, with unknowns a, b, c. • The least-squares error is ϕ(a, b, c) = Σk=0..m (yk − a ln xk − b cos xk − c e^xk)².
• Set ∂ϕ/∂a = 0, ∂ϕ/∂b = 0, ∂ϕ/∂c = 0, obtaining three normal equations:
(Σ (ln xk)²) a + (Σ ln xk cos xk) b + (Σ ln xk e^xk) c = Σ yk ln xk
(Σ ln xk cos xk) a + (Σ (cos xk)²) b + (Σ cos xk e^xk) c = Σ yk cos xk
(Σ ln xk e^xk) a + (Σ cos xk e^xk) b + (Σ e^(2xk)) c = Σ yk e^xk
with all sums taken over k = 0, 1, . . . , m.
Example 2 • Fit a function of the form y = a ln x + b cos x + c eˣ to the following table of values:
• Using the table and the equations above, we obtain a 3 × 3 system whose solution is a = −1.04103, b = −1.26132, and c = 0.03073. • The resulting curve has the required form and fits the table in the least-squares sense; the value of ϕ(a, b, c) is 0.92557.
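A sketch of this fit in Python; the x, y arrays below are placeholders, since the example's table is not reproduced here:

    import numpy as np

    # Fit y = a*ln(x) + b*cos(x) + c*exp(x) in the least-squares sense.
    x = np.array([0.25, 0.50, 0.75, 1.00, 1.25])  # placeholder data
    y = np.array([-1.5, -0.6, 0.1, 0.7, 1.4])

    # Columns of the design matrix are the three basis functions at each xk.
    A = np.column_stack([np.log(x), np.cos(x), np.exp(x)])

    # Solve the 3 x 3 normal equations A^T A [a, b, c]^T = A^T y.
    a, b, c = np.linalg.solve(A.T @ A, A.T @ y)
    phi = np.sum((y - A @ np.array([a, b, c]))**2)  # value of the error function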
Basis Functions {g0, g1, . . . , gn} • The principle of least squares extends to general linear families of functions. • Suppose the data are thought to conform to a relationship of the form y = c0 g0(x) + c1 g1(x) + · · · + cn gn(x). • The basis functions g0, g1, . . . , gn are known and held fixed; the coefficients c0, c1, . . . , cn are to be determined according to the principle of least squares.
• Define the expression
ϕ(c0, c1, . . . , cn) = Σk=0..m [yk − Σj=0..n cj gj(xk)]²
the sum of the squares of the errors associated with each entry (xk, yk). • Necessary conditions for the minimum: ∂ϕ/∂ci = 0 for 0 ≤ i ≤ n, a system of n + 1 equations.
• Taking the partial derivatives in Equation (7), setting them equal to zero, and rearranging the resulting equations yields the normal equations:
Σj=0..n [Σk=0..m gi(xk) gj(xk)] cj = Σk=0..m yk gi(xk)   (0 ≤ i ≤ n)
• These serve to determine the best values of the parameters c0, c1, . . . , cn.
• The normal equations are linear in the ci, so they can be solved by the method of Gaussian elimination. • In practice, however, the normal equations may be difficult to solve accurately, so care is needed in choosing the basis functions g0, g1, . . . , gn.
• First, the set {g0, g1, . . . , gn} should be linearly independent: no linear combination Σi=0..n ci gi can be the zero function except in the trivial case c0 = c1 = · · · = cn = 0. • Second, the functions g0, g1, . . . , gn should be appropriate to the problem at hand. • Finally, the set of basis functions should be well conditioned for numerical work. • A sketch of the general fitting procedure follows.
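A sketch of the general procedure, assuming the basis is given as a list of Python callables (fit_basis is a name of my choosing):

    import numpy as np

    def fit_basis(x, y, basis):
        # Least-squares coefficients c0..cn for y ~ sum_j cj * gj(x),
        # where basis = [g0, g1, ..., gn] is a list of callables.
        G = np.column_stack([g(x) for g in basis])   # G[k, j] = gj(xk)
        # Normal equations: (G^T G) c = G^T y.
        return np.linalg.solve(G.T @ G, G.T @ y)

    # Example: fit a quadratic with the monomial basis {1, x, x^2}.
    x = np.linspace(0.0, 1.0, 9)
    y = 1.0 + 2.0 * x - x**2
    c = fit_basis(x, y, [lambda t: np.ones_like(t), lambda t: t, lambda t: t**2])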
Orthogonal Systems and Chebyshev Polynomials • Orthonormal Basis Functions {g0, g1, . . . , gn} • Smoothing Data: Polynomial Regression
Orthonormal Basis Functions {g0, g1, . . . , gn} • Once the functions g0, g1, . . . , gn have been chosen, the least-squares problem can be viewed in a vector-space setting. • The set of all linear combinations of g0, g1, . . . , gn forms a vector space G.
• The function sought in the least-squares problem is an element of the vector space G. • The functions g0, g1, . . . , gn form a basis for G provided the set is linearly independent. • A vector space has many different bases, and they can differ drastically in their numerical properties.
• The vector space G is generated by the basis {g0, g1, . . . , gn}. • Without changing G, we ask: what basis for G should be chosen for numerical work? • In the present problem, the numerical task is to solve the normal equations:
Σj=0..n [Σk=0..m gj(xk) gi(xk)] cj = Σk=0..m yk gi(xk)   (0 ≤ i ≤ n)   (1)
• The nature of this system depends on the basis {g0, g1, . . . , gn}; we want the equations to be easy to solve and capable of being solved accurately. • The ideal situation: the coefficient matrix in (1) is the identity matrix, which occurs when the basis {g0, g1, . . . , gn} has the property of orthonormality:
Σk=0..m gi(xk) gj(xk) = 1 if i = j, and 0 otherwise
• Then (1) simplifies to
cj = Σk=0..m yk gj(xk)
no longer a system of equations to be solved, but an explicit formula for the coefficients cj. • Under rather general conditions, the space G has a basis that is orthonormal, and the Gram-Schmidt process can be used to obtain one. • In some situations the effort of obtaining an orthonormal basis is justified; in others, simpler procedures suffice.
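When the basis is orthonormal over the data points, the coefficients follow from the explicit formula above; a minimal sketch (function name mine):

    import numpy as np

    # If sum_k gi(xk)*gj(xk) = 1 when i = j and 0 otherwise (orthonormality),
    # no linear system needs solving: cj = sum_k yk * gj(xk).
    def coefficients_orthonormal(x, y, basis):
        return np.array([np.sum(y * g(x)) for g in basis])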
• The goal is to make Equation (1) well disposed for numerical solution: avoid any coefficient matrix that involves the difficulties encountered in connection with the Hilbert matrix. • This requires that the basis for the space G be well chosen.
• Consider the space G consisting of all polynomials of degree at most n. • The natural choice of n + 1 basis functions for G is the monomials 1, x, x², . . . , xⁿ. • Using this basis, a typical element of the space G is written g(x) = Σj=0..n cj x^j. • This basis is almost always a poor choice for numerical work.
• Chebyshev polynomials, suitably defined for the interval involved, form a good basis. • Why the monomials x^j do not form a good basis for numerical work: these functions are too much alike! • If a given function g is expressed as a linear combination of monomials, g(x) = Σj=0..n cj x^j, it is difficult to determine the coefficients cj precisely. • Contrast this with the Chebyshev polynomials, which are quite different from one another.
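A quick numerical illustration of this point, comparing condition numbers of the design matrices for the two bases on [−1, 1] (the grid and degree are arbitrary choices):

    import numpy as np

    x = np.linspace(-1.0, 1.0, 21)
    n = 10
    V_mono = np.vander(x, n + 1, increasing=True)      # columns 1, x, ..., x^n
    V_cheb = np.polynomial.chebyshev.chebvander(x, n)  # columns T0, ..., Tn
    print(np.linalg.cond(V_mono))  # large: the monomials are nearly dependent
    print(np.linalg.cond(V_cheb))  # far smaller: Chebyshev columns differ more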
• For simplicity, assume that the points in our least-squares problem have the property −1 ≤ xk ≤ 1; the Chebyshev polynomials are defined on the interval [−1, 1]. • A recursive formula:
Tn+1(x) = 2x Tn(x) − Tn−1(x)   (n ≥ 1)   (2)
• Together with T0(x) = 1 and T1(x) = x, this recursion gives a formal definition of the Chebyshev polynomials. • Linear combinations of Chebyshev polynomials are easy to evaluate because a special nested multiplication algorithm applies. • To describe the procedure, consider an arbitrary linear combination of T0, T1, T2, . . . , Tn:
g = Σj=0..n cj Tj
• An algorithm to compute g(x) for any given x:
w[n+2] = w[n+1] = 0
w[j] = 2x w[j+1] − w[j+2] + cj   (j = n, n − 1, . . . , 0)   (3)
g(x) = w[0] − x w[1]
• To see that this algorithm actually produces g(x), write down the series for g, shift some indices, and use Formulas (2) and (3).
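A sketch of this nested multiplication in Python, following the w-recurrence above:

    def cheb_eval(c, x):
        # Evaluate g(x) = sum_{j=0}^{n} c[j] * Tj(x) by nested multiplication.
        w1 = w2 = 0.0                     # w[n+2] = w[n+1] = 0
        for cj in reversed(c):            # j = n, n-1, ..., 0
            w1, w2 = 2.0 * x * w1 - w2 + cj, w1
        # After the loop, w1 holds w[0] and w2 holds w[1], so:
        return w1 - x * w2                # g(x) = w[0] - x*w[1]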
• Arrange the data so that all the abscissas {xi} lie in the interval [−1, 1]. • If the first few Chebyshev polynomials are used as a basis for the polynomials, the normal equations are reasonably well conditioned. • Interpreted informally, this means that Gaussian elimination with pivoting produces an accurate solution to the normal equations.
• If the original data do not satisfy min{xk} = −1 and max{xk} = 1 but lie instead in another interval [a, b], use the change of variable
z = (2x − a − b) / (b − a)
so that the variable z traverses [−1, 1] as x traverses [a, b].
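A one-line helper for this change of variable (the function name is mine):

    def to_unit_interval(x, a, b):
        # Map x in [a, b] to z in [-1, 1]: z = (2x - a - b) / (b - a).
        # Check: x = a gives z = -1, and x = b gives z = +1.
        return (2.0 * x - a - b) / (b - a)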
Outline of Algorithm • The procedure produces a polynomial of degree n + 1 that best fits a given table of values (xk, yk) (0 ≤ k ≤ m). • Usually m is much greater than n.
• The details of step 4 are as follows. • Begin by introducing a double-subscripted variable tjk = Tj(xk). • The matrix T = (tjk) can be computed efficiently using the recursive definition of the Chebyshev polynomials, Formula (2).
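A sketch of this computation (chebyshev_table is a hypothetical name):

    import numpy as np

    def chebyshev_table(x, n):
        # t[j, k] = Tj(xk), built row by row with the recursion (2):
        # T0 = 1, T1 = x, Tj = 2x*T(j-1) - T(j-2).
        x = np.asarray(x, dtype=float)
        t = np.empty((n + 1, len(x)))
        t[0] = 1.0
        if n >= 1:
            t[1] = x
        for j in range(2, n + 1):
            t[j] = 2.0 * x * t[j - 1] - t[j - 2]
        return t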