
Scientific Computing Chapter 3 - Linear Least Squares


Presentation Transcript


  1. Scientific Computing Chapter 3 - Linear Least Squares Guanghua Tan guanghuatan@gmail.com School of Computer and Communication, HNU

  2. Outline • Least Squares Data Fitting • Existence, Uniqueness, and Conditioning • Solving Linear Least Squares Problems

  3. Method of Least Squares • Measurement errors are inevitable in observational and experimental sciences • Errors can be smoothed out by averaging over many cases, i.e., taking more measurements than are strictly necessary to determine parameters of system • Resulting system is over-determined, so usually there is no exact solution

  4. Example: Fitting a Plane • A plane can be determined by 3 points • A scanned point set may contain many more than 3 points • Plane equation: ax + by + cz + d = 0

  5. Linear Least Squares • For linear problems, we obtain over-determined linear system Ax = b, with m×n matrix A, m > n • System is better written Ax ≅ b, since equality is usually not exactly satisfiable when m > n • Least squares solution x minimizes squared Euclidean norm of residual vector r = b - Ax, i.e., minimizes ||r||_2^2 = ||b - Ax||_2^2
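A minimal sketch (not from the slides) of solving such an over-determined system in the least squares sense with NumPy; the matrix and right-hand side are made-up illustrative values.

```python
import numpy as np

A = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0]])          # 3x2 matrix, m > n
b = np.array([1.0, 2.0, 2.0])

# Least squares solution minimizing ||b - Ax||_2
x, _, rank, sv = np.linalg.lstsq(A, b, rcond=None)
print("least squares solution:", x)
print("squared residual norm :", np.sum((b - A @ x) ** 2))
```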

  6. Data Fitting • Given m data points (ti, yi), find n-vector x of parameters that gives "best fit" to model function f(t, x), i.e., minimizes the sum of squares of the deviations yi - f(ti, x) • Problem is linear if function f is linear in components of x: f(t, x) = x1 φ1(t) + x2 φ2(t) + ... + xn φn(t), where the basis functions φj depend only on t

  7. Data Fitting • Polynomial fitting, f(t, x) = x1 + x2 t + ... + xn t^(n-1), is linear, since a polynomial is linear in its coefficients, though nonlinear in the independent variable t • Fitting a sum of exponentials, such as f(t, x) = x1 e^(x2 t), is an example of a nonlinear problem • For now, we will consider only linear least squares problems

  8. Example: Data Fitting • Fitting a quadratic polynomial f(t, x) = x1 + x2 t + x3 t^2 to five data points (ti, yi) gives a linear least squares problem Ax ≅ b, where each row of A is (1, ti, ti^2) and b contains the yi

  9. Example: Data Fitting • For the given data, the over-determined 5×3 linear system is Ax ≅ b • The solution, which we will see how to compute later, gives the coefficients of the approximating quadratic polynomial
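A sketch of the quadratic-fit setup from slides 8-9. The slides' actual data values are not reproduced in this transcript, so the (ti, yi) below are hypothetical.

```python
import numpy as np

t = np.array([-1.0, -0.5, 0.0, 0.5, 1.0])   # hypothetical abscissas
y = np.array([ 1.0,  0.5, 0.0, 0.5, 2.0])   # hypothetical ordinates

# 5x3 Vandermonde-type matrix with rows (1, t_i, t_i^2)
A = np.column_stack([np.ones_like(t), t, t**2])
x, *_ = np.linalg.lstsq(A, y, rcond=None)
print("coefficients of x1 + x2*t + x3*t^2:", x)
```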

  10. Outline • Least Squares Data Fitting • Existence, Uniqueness, and Conditioning • Solving Linear Least Squares Problems

  11. Existence and Uniqueness • Linear least squares problem Ax ≅ b always has a solution • Solution is unique if and only if columns of A are linearly independent, i.e., rank(A) = n, where A is m×n • If rank(A) < n, then A is rank-deficient, and solution of linear least squares problem is not unique • For now we assume A has full column rank n

  12. Normal Equations • To minimize squared Euclidean norm of residual vector, φ(x) = ||r||_2^2 = (b - Ax)^T (b - Ax) = b^T b - 2 x^T A^T b + x^T A^T A x, a necessary condition for a minimum is that the derivative with respect to x is zero: 0 = ∇φ(x) = 2 A^T A x - 2 A^T b, which reduces to the n×n linear system of normal equations A^T A x = A^T b

  13. Geometric Interpretation • Vectors v1 and v2 are orthogonal if their inner product is zero, v1^T v2 = 0 • Space spanned by columns of m×n matrix A, span(A) = {Ax : x in R^n}, is of dimension at most n • If m > n, b generally does not lie in span(A), so there is no exact solution to Ax = b

  14. Geometric Interpretation • Vector y = Ax in span(A) closest to b in 2-norm occurs when residual r = b - Ax is orthogonal to span(A), i.e., 0 = A^T r = A^T (b - Ax), again giving system of normal equations A^T A x = A^T b • The solution to the least squares problem is x = (A^T A)^(-1) A^T b
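A small numerical check of this characterization (a sketch with randomly generated data): the residual of the least squares solution is orthogonal to every column of A.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((6, 3))
b = rng.standard_normal(6)

x, *_ = np.linalg.lstsq(A, b, rcond=None)
r = b - A @ x
print("A^T r =", A.T @ r)    # ~ zero vector, up to rounding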

  15. Pseudo-inverse and Condition Number • Non-square m×n matrix A has no inverse in usual sense • If rank(A) = n, pseudo-inverse is defined by A^+ = (A^T A)^(-1) A^T and condition number by cond(A) = ||A||_2 · ||A^+||_2 • By convention, cond(A) = ∞ if rank(A) < n
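A sketch comparing these definitions against NumPy's built-in routines, using a random full-rank matrix.

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((5, 3))

A_pinv = np.linalg.inv(A.T @ A) @ A.T           # valid when rank(A) = n
print(np.allclose(A_pinv, np.linalg.pinv(A)))   # True

cond = np.linalg.norm(A, 2) * np.linalg.norm(A_pinv, 2)
print(cond, np.linalg.cond(A, 2))               # same value, up to rounding
```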

  16. Condition Number • Just as condition number of square matrix measures closeness to singularity, condition number of rectangular matrix measures closeness to rank deficiency • Least squares solution of Ax ≅ b is given by x = A^+ b

  17. Sensitivity and Conditioning • Sensitivity of least squares solution to Ax ≅ b depends on b as well as A • Define angle θ between b and y = Ax by cos(θ) = ||y||_2 / ||b||_2 • Bound on perturbation Δx in solution x due to perturbation Δb in b is given by ||Δx||_2 / ||x||_2 ≤ cond(A) · (1 / cos(θ)) · ||Δb||_2 / ||b||_2

  18. Sensitivity and Conditioning • Similarly, for perturbation E in matrix A, ||Δx||_2 / ||x||_2 ≲ ([cond(A)]^2 tan(θ) + cond(A)) · ||E||_2 / ||A||_2 • Condition number of least squares solution is about cond(A) if residual is small, but can be squared or arbitrarily worse for large residual

  19. Outline • Least Squares Data Fitting • Existence, Uniqueness, and Conditioning • Solving Linear Least Squares Problems

  20. Normal Equations Method • If m×n matrix A has rank n, then symmetric n×n matrix A^T A is positive definite, so its Cholesky factorization A^T A = L L^T can be used to obtain solution x to system of normal equations A^T A x = A^T b, which has same solution as least squares problem Ax ≅ b • Normal equations method involves transformations rectangular → square → triangular
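A sketch of the normal equations method, assuming NumPy and SciPy are available: form A^T A and A^T b, Cholesky-factor, then forward- and back-substitute. The function name is mine.

```python
import numpy as np
from scipy.linalg import solve_triangular

def normal_equations_solve(A, b):
    AtA = A.T @ A
    Atb = A.T @ b
    L = np.linalg.cholesky(AtA)               # A^T A = L L^T
    z = solve_triangular(L, Atb, lower=True)  # forward-substitution: L z = A^T b
    x = solve_triangular(L.T, z, lower=False) # back-substitution: L^T x = z
    return x
```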

  21. Example: Normal Equations Method • For polynomial data-fitting example given previously, normal equations method gives the 3×3 symmetric system A^T A x = A^T b

  22. Example, continued • Cholesky factorization of symmetric positive definite matrix A^T A gives A^T A = L L^T • Solving lower triangular system Lz = A^T b by forward-substitution gives z • Solving upper triangular system L^T x = z by back-substitution gives the least squares solution x

  23. Shortcomings of Normal Equations • Information can be lost in forming A^T A and A^T b • For example, take A = [1 1; ε 0; 0 ε], where ε is a positive number smaller than sqrt(ε_mach) • Then in floating-point arithmetic A^T A = [1+ε^2 1; 1 1+ε^2] is computed as [1 1; 1 1], which is singular
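A sketch reproducing this loss of information in double precision: with ε below sqrt(ε_mach) ≈ 1.5e-8, the computed A^T A is exactly singular even though A has full rank.

```python
import numpy as np

eps = 1e-8                       # smaller than sqrt(eps_mach) in float64
A = np.array([[1.0, 1.0],
              [eps, 0.0],
              [0.0, eps]])

AtA = A.T @ A
print(AtA)                            # [[1., 1.], [1., 1.]]
print(np.linalg.matrix_rank(AtA))     # 1: computed A^T A is singular
print(np.linalg.matrix_rank(A))       # 2: A itself has full column rank
```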

  24. Shortcomings of Normal Equations • Sensitivity of solution is also worsened, since cond(A^T A) = [cond(A)]^2

  25. Augmented System Method • Definition of residual, r = b - Ax, together with orthogonality requirement, A^T r = 0, give (m+n)×(m+n) augmented system [I A; A^T O] [r; x] = [b; 0] • Augmented system is symmetric but not positive definite, is larger than original system, and requires storing two copies of A • But it allows greater freedom in choosing pivots in computing LU factorization

  26. Augmented System • Introducing scaling parameter α gives system [αI A; A^T O] [r/α; x] = [b; 0], which allows control over relative weights of two subsystems in choosing pivots • Reasonable rule of thumb is to take α = max|a_ij| / 1000 • Augmented system is sometimes useful, but is far from ideal in work and storage required
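A sketch of the scaled augmented system approach, assuming SciPy is available; the function name and the default α are mine. The block system is assembled explicitly and solved with a pivoted LU factorization.

```python
import numpy as np
from scipy.linalg import lu_factor, lu_solve

def augmented_solve(A, b, alpha=1.0):
    m, n = A.shape
    K = np.block([[alpha * np.eye(m), A],
                  [A.T,               np.zeros((n, n))]])
    rhs = np.concatenate([b, np.zeros(n)])
    sol = lu_solve(lu_factor(K), rhs)   # pivoted LU of the (m+n)x(m+n) system
    r_scaled, x = sol[:m], sol[m:]      # first block holds r / alpha
    return x, alpha * r_scaled          # least squares solution and residual
```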

  27. Orthogonal Transformation • We seek alternative method that avoids numerical difficulties of normal equations • We need numerically robust transformation that produces easier problem without changing solution • What kind of transformation leaves least squares solution unchanged?

  28. Orthogonal Transformations • Square matrix Q is orthogonal if Q^T Q = I • Multiplication of vector by orthogonal matrix preserves Euclidean norm: ||Qv||_2^2 = (Qv)^T (Qv) = v^T Q^T Q v = v^T v = ||v||_2^2 • Thus, multiplying both sides of least squares problem by orthogonal matrix does not change its solution

  29. Triangular Least Squares • As with square linear systems, suitable target in simplifying least squares problems is triangular form • Upper triangular over-determined (m > n) least squares problem has form [R; O] x ≅ [b1; b2], where R is n×n upper triangular and b is partitioned similarly • Residual is ||r||_2^2 = ||b1 - Rx||_2^2 + ||b2||_2^2

  30. Triangular Least Squares • We have no control over second term, ||b2||_2^2, but first term becomes zero if x satisfies n×n triangular system Rx = b1, which can be solved by back-substitution • Resulting x is least squares solution, and minimum sum of squares is ||r||_2^2 = ||b2||_2^2
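A sketch of back-substitution for the n×n triangular system Rx = b1 (the function name is mine; in practice a library routine such as scipy.linalg.solve_triangular would be used).

```python
import numpy as np

def back_substitute(R, b1):
    n = len(b1)
    x = np.zeros(n)
    for i in range(n - 1, -1, -1):
        # subtract contributions of already-computed unknowns, then divide
        x[i] = (b1[i] - R[i, i+1:] @ x[i+1:]) / R[i, i]
    return x
```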

  31. Triangular Least Squares • So our strategy is to transform general least squares problem to triangular form using orthogonal transformation so that least squares solution is preserved

  32. QR Factorization • Given m×n matrix A, with m > n, we seek m×m orthogonal matrix Q such that A = Q [R; O], where R is n×n and upper triangular • Linear least squares problem Ax ≅ b is then transformed into triangular least squares problem [R; O] x ≅ Q^T b, which has same solution, since ||r||_2^2 = ||b - Ax||_2^2 = ||b - Q [R; O] x||_2^2 = ||Q^T b - [R; O] x||_2^2

  33. Orthogonal Bases • If we partition m×m orthogonal matrix Q = [Q1 Q2], where Q1 is m×n, then A = Q [R; O] = Q1 R is called reduced QR factorization of A • Columns of Q1 are orthonormal basis for span(A) • Solution to least squares problem Ax ≅ b is given by solution to square system R x = Q1^T b
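A sketch of solving the least squares problem via the reduced QR factorization using NumPy's QR routine (the function name is mine).

```python
import numpy as np

def qr_solve(A, b):
    Q1, R = np.linalg.qr(A, mode='reduced')   # Q1 is m x n, R is n x n
    c1 = Q1.T @ b
    return np.linalg.solve(R, c1)             # solve R x = Q1^T b
```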

  34. Computing QR Factorization • To compute QR factorization of m×n matrix A, with m > n, we annihilate subdiagonal entries of successive columns of A, eventually reaching upper triangular form • Similar to LU factorization by Gaussian elimination, but use orthogonal transformations instead of elementary elimination matrices

  35. Computing QR Factorization • Possible methods include • Householder transformations • Givens rotations • Gram-Schmidt orthogonalization

  36. Householder Transformations • Householder transformation has form H = I - 2 (v v^T) / (v^T v) for nonzero vector v • H is orthogonal and symmetric: H = H^T = H^(-1) • Given vector a, we want to choose v so that Ha = [α, 0, ..., 0]^T = α e1

  37. Householder transformation

  38. Householder Transformation • Substituting into formula for H, we can take v = a - α e1 and α = ±||a||_2, with sign chosen to avoid cancellation, i.e., α = -sign(a1) ||a||_2
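A sketch constructing a Householder vector for a given vector a and checking that Ha has zeros below its first entry; the test vector and function name are mine.

```python
import numpy as np

def householder_vector(a):
    # sign chosen opposite to a[0] to avoid cancellation
    alpha = -np.copysign(np.linalg.norm(a), a[0])
    v = a.copy()
    v[0] -= alpha            # v = a - alpha * e1
    return v, alpha

a = np.array([2.0, 1.0, 2.0])          # made-up test vector
v, alpha = householder_vector(a)
H = np.eye(len(a)) - 2.0 * np.outer(v, v) / (v @ v)
print(H @ a)                           # ~ [alpha, 0, 0]
```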

  39. Example: Householder Transformation • For a given vector a, we take v = a - α e1, where α = ±||a||_2 • Since a1 is positive, we choose negative sign for α to avoid cancellation, so v = a + ||a||_2 e1

  40. Example: Householder Transformation • To confirm that transformation works, verify that Ha = a - (2 v^T a / v^T v) v = α e1, i.e., all entries below the first are annihilated

  41. Householder QR Factorization • To compute QR factorization of A, use Householder transformations to annihilate subdiagonal entries of each successive column • Each Householder transformation is applied to entire matrix, but does not affect prior columns, so zeros are preserved • Process just described produces factorization H_n ··· H_1 A = [R; O], where R is n×n and upper triangular

  42. Householder QR Factorization • If Q = H_1 ··· H_n, then A = Q [R; O] • In applying Householder transformation H to arbitrary vector u, Hu = (I - 2 (v v^T) / (v^T v)) u = u - (2 v^T u / v^T v) v, which is much cheaper than general matrix-vector multiplication and requires only vector v, not full matrix H

  43. Implementation of Householder QR Factorization • Pseudocode (a_j denotes column j of the current matrix):
  for k = 1 to n                                 { loop over columns }
    α_k = -sign(a_kk) · sqrt(a_kk^2 + ··· + a_mk^2)
    v_k = [0, ..., 0, a_kk, ..., a_mk]^T - α_k e_k   { compute Householder vector for current column }
    β_k = v_k^T v_k
    if β_k = 0 then continue with next k         { skip current column if it is already zero }
    for j = k to n                               { apply transformation to remaining sub-matrix }
      γ_j = v_k^T a_j
      a_j = a_j - (2 γ_j / β_k) v_k
    end
  end
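A Python sketch of the pseudocode above (function name mine): the upper triangle of the working array ends up holding R, and the Householder vectors are returned separately so the transformations can later be applied to a right-hand side.

```python
import numpy as np

def householder_qr(A):
    A = A.astype(float).copy()
    m, n = A.shape
    vs = []                                   # Householder vectors v_k
    for k in range(n):                        # loop over columns
        a = A[k:, k]
        alpha = -np.copysign(np.linalg.norm(a), a[0])
        v = a.copy()
        v[0] -= alpha                         # v_k = a - alpha * e_1
        beta = v @ v
        vs.append(v)
        if beta == 0.0:                       # column already zero: skip
            continue
        for j in range(k, n):                 # apply H_k to remaining columns
            gamma = v @ A[k:, j]
            A[k:, j] -= (2.0 * gamma / beta) * v
    return np.triu(A[:n, :]), vs              # R (n x n) and the v_k's
```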

  44. Householder QR Factorization • To preserve solution of linear least squares problem, right-hand side b is transformed by same sequence of Householder transformations • Then solve triangular least squares problem [R; O] x ≅ Q^T b • For solving linear least squares problem, product Q of Householder transformations need not be formed explicitly

  45. Householder QR Factorization • R can be stored in upper triangle of array initially containing A • Householder vectors v can be stored in (now zero) lower triangular portion of A (almost) • Householder transformations most easily applied in this form anyway

  46. Example: Householder QR Factorization • For polynomial data fitting example given previously, with its 5×3 matrix A and right-hand side b, Householder vector v1 for annihilating subdiagonal entries of first column of A is v1 = a1 - α1 e1

  47. Example, continued • Applying resulting Householder transformation H1 yields transformed matrix and right-hand side • Householder vector v2 for annihilating subdiagonal entries of second column of H1A is computed in the same way

  48. Example, continued • Applying resulting Householder transformation H2 yields further transformed matrix and right-hand side • Householder vector v3 for annihilating subdiagonal entries of third column of H2H1A is computed analogously

  49. Example, continued • Applying resulting Householder transformation H3 yields the upper triangular factor R and transformed right-hand side • Now solve upper triangular system Rx = c1 by back-substitution to obtain the least squares solution x

  50. Givens Rotations • Givens rotations introduce zeros one at a time • Given vector [a1, a2]^T, choose scalars c and s, with c^2 + s^2 = 1, so that [c s; -s c] [a1; a2] = [α; 0], where α = sqrt(a1^2 + a2^2) • Previous equation can be rewritten [a1 a2; a2 -a1] [c; s] = [α; 0]
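A sketch computing a Givens rotation that zeroes the second component of a 2-vector and checking it; the function name and test values are mine.

```python
import numpy as np

def givens(a1, a2):
    r = np.hypot(a1, a2)          # sqrt(a1^2 + a2^2), overflow-safe
    if r == 0.0:
        return 1.0, 0.0           # identity rotation for the zero vector
    return a1 / r, a2 / r         # c, s

a1, a2 = 3.0, 4.0                 # made-up values
c, s = givens(a1, a2)
G = np.array([[c, s], [-s, c]])
print(G @ np.array([a1, a2]))     # ~ [5, 0]
```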
