360 likes | 473 Views
ECE 530 – Analysis Techniques for Large-Scale Electrical Systems. Lecture 20: Least Squares, State Estimation, QR Factorization, SVD. Prof. Tom Overbye Dept. of Electrical and Computer Engineering University of Illinois at Urbana-Champaign overbye@illinois.edu. Announcements.
E N D
ECE 530 – Analysis Techniques for Large-Scale Electrical Systems Lecture 20: Least Squares, State Estimation, QR Factorization, SVD Prof. Tom Overbye Dept. of Electrical and Computer Engineering University of Illinois at Urbana-Champaign overbye@illinois.edu
Announcements • HW 6 is due Today • HW 7 is due Thursday, November 14
The Least Squares Problem • With this background we proceed to the typical schemes in use for solving least squares problems, all along paying adequate attention to the numerical aspects of the solution approach • If the matrix is full, then often the best solution approach is to use a singular value decomposition (SVD), to form a matrix known as the pseudo-inverse of the matrix • We'll cover this later after first considering the sparse problem • We first review some fundamental building blocks and then present the key results useful for the sparse matrices common in state estimation
Problems and Solutions • Researchers tend to be grouped into two broad categories • Those with problems needing solutions • Those with solutions needing problems • Power systems tends to fall in the first category, though there are certainly some in the second • A danger of having a solution searching for problems, is when you have a hammer everything looks like a nail • In class material can either be presented as solution then problem or problem then solution • In real life it is often problem with ? for the solution
Power System State Estimation • Power system state estimation (SE) is covered in ECE 573, so we'll just touch on it here; it is a key least squares application so it helps to understand a problem as we've been presenting a solution • Overall goal is to come up with a power flow model for the present "state" of the power system based on the actual system measurements • SE assumes the topology and parameters of the transmission network are mostly known • Measurements come from SCADA, and increasingly, from PMUs
Power System State Estimation • Good introductory reference is Power Generation, Operation and Control by Wood, Wollenberg and Sheble, 3rd Edition out on Nov 16, 2013 • Problem can be formulated in a nonlinear, weighted least squares form asWhere J(x) is the cost function, x are the state variables (primarily bus voltage magnitudes and angles), zi are the m measurements, f(x) relates the states to the measurements and i is the assumed standard deviation
Measurement Example • Assume we measure the real and reactive power flowing into one end of a transmission line; then the zi-fi(x) functions for these two are • Two measurements for four unknowns • Other measurements, such as the flow at the other end, and voltage magnitudes, add redundancy
Assumed Error • Hence the goal is to decrease the error between the measurements and the assumed model states x • The iterm weighs the various measurements, recognizing that they can have vastly different assumed errors • Measurement error is assumed Gaussian; whether it is or not is another question; outlier (bad measurements) are often removed
SE Iterative Solution Algorithm • Solved by sequential linearization • Calculate the gradient of J(x)
SE Iterative Solution Algorithm • Use Newton's method to solve for x for which the gradient is zero This is exactly theleast squaresform developedearlier with HTR-1H an n by nmatrix. This couldbe solved withGaussianelimination, but this isn't preferredbecause it is often ill-conditioned
Example: Two Bus Case • Assume a two bus case with a generator supplying a load through a single line with x=0.1 pu. Assume measurements of the p/q flow on both ends of the line (into line positive), and the voltage magnitude at both the generator and the load end. So B12 = B21=10.0 We need to assume a referenceangle unless directly measuring q
Example: Two Bus Case • Let We assume an angle reference of q1=0
Example: Two Bus Case • With a flat start guess we get
QR Factorization • Preferred, since it handles ill-conditioned matrices • Can be used with sparse matrices • We will first split the R-1 matrix • QR factorization represents the m by n H' matrix aswith Q an m by m orthonormal matrix and U an upper triangular matrix (most books use Q R but we use U to avoid confusion with the previous R)
QR Factorization • We then have • But since Q is an orthonormal matrix, • Hence we have • And
QR Factorization • Next issue we discuss the QR factorization algorithm • When factored the R matrix (we're using the U notation) will be an m by n upper triangular matrix • Several methods are available including the Householder method and the Givens Method • Givens is preferred when dealing with sparse matrices • All good reference is Gene H. Golub and Charles F. Van Loan, “Matrix Computations,” second edition, Johns Hopkins University Press, 1989.
Givens Algorithm • The Givens algorithm works by pre-multiplying the desired matrix (Ahere) by a series of matrices and their transposes, starting with G1G1T • Gk is the identity matrix modified to have two non-ones on the diagonals cos(q), and two non-zero off diagonal elements, one set to -s=-sin(q), and one to s=sin(q) • Each Gk is an orthonormal matrix, so pre-multiplying by GkGkT= I ensures the final product is equal to A • Gkvalues are chosen to zero out particular elements in the lower triangle of A
Givens Algorithm • Algorithm proceeds column by column, sequentially zeroing out elements in the lower triangle of A, starting at the bottom of each column
Givens Algorithm • To zero out element A[i,j], with i > j we first solve with a=A[i-1,j], b= A[i,j] • A numerically safe algorithm isIf b=0 then c=1, s=0 // i.e, no rotation is neededElse If then Else
Givens G Matrix • The orthogonal G(i,j,q) matrix is • Premultiplicationby G(i,j,q)T is a rotation by q radians in the (i,j) coordinate plane i j
Small Givens Example • Let • First we zero out A[2,1], a=1, b=2 giving s= 0.894, c=-0.4472
Small Givens Example • Next zero out A[3,2] with a=1.7889, b=1, giving c= -0.8729, s=0.4880 • Final solution is
Start of Givens for SE Example • Starting with the H matrix we get • To zero out H'[5,1]=1 we haveb=100, a=-1000, givingc=0.995, s=0.0995
Start of Givens for SE Example • Which gives • The next rotation would be to zero out element [4,1], continuing until all the elements in the lower triangle have been reduced
Givens Comments • For a full matrix, Givens is O(mn2) since each element in the lower triangle needs to be zeroed O(nm), and each operation is O(n) • Computation can be drastically reduced for a sparse matrix since we only need to zero out the elements that are initially non-zero, and any that become non-zero (i.e., the fills) • Also, for each multiply we only need to deal with the nonzeros in the impacted row • Givens rotation is commonly used to solve the SE
Singular Value Decomposition • An extremely useful matrix analysis technique is the singular value decomposition (SVD), which takes an m by n real matrix A (m >= n) and represents it as where U is an m by n orthogonal matrix (such that UTU = I, S is an n by n diagonal matrix whose elements are the singular values of A, and V is an n by n orthogonal matrix • Note, there is an other formulation with U as m by m, and S as m by n • Computational order is O(n2m); ok if n is small
Matrix Singular Values • The singular values of a matrix A are the square roots of the eignenvalues of ATA • The singular values are real, nonnegative numbers, usually listed in decreasing order • Each singular values s satisfieswhere u (dimension m) and v (dimension n) are both unit length and called respectively the left-singular and right-singular vectors for s
SVD Applications • SVD applications come from the property that A can be written aswhere each one of the matrices is known as a mode • More of the essence of the matrix is contained in the modes associated with the larger singular values • An immediate application is data compression in which A represents an image; often a quite good representation of the image is available from just a small percentage of the modes
SVD Image Compression Example Image source http://fourier.eng.hmc.edu/e161/lectures/svdcompression.html
SVD Applications • Another application is removing noise. If the columns of A are signals, since noise often corresponds to the smaller singular values, then noise can be removed by taking the SVD, and reconstructing A without these modes • This is also a data compression method • Another application is principal component analysis (PCA) in which the idea is to take a data set with a number of variables, and reduce the data and determine the data associations • The principal components correspond to the largest singular values when data is appropriately normalized
Pseudoinverse of a Matrix • The pseudoinverse of a matrix generalizes concept of a matrix inverse to an m by n matrix, in which m >= n • Specifically talking about a Moore-Penrose Matrix Inverse • Notation for the pseudoinverse of A is A+ • Satisfies AA+A = A • If A is a square matrix, then A+= A-1 • Quite useful for solving the least squares problem since the least squares solution of Ax = b is x =A+ b
Pseudoinverse and SVD • Pseudoinverse can be directly determined from the SVDin which S+ is formed by replacing the non-zero diagonal elements by its inverse, and leaving the zeros • Numerically small values in Sare assumed zero • V is n by n • S+ is n by n • UT is n by m • A+ is therefore n by m Computationally doingthe SVD dominates
Simple Least Squares Example • Assume we which to fix a line (mx + b = y) to three data points: (1,1), (2,4), (6,4) • Two unknowns, m and b; hence x = [m b]T • Setup in form of Ax = b
Simple Least Squares Example • Doing the SVD • Computing the pseudoinverse
Simple Least Squares Example • Computing x = [m b]T gives • With the pseudoinverse approach we immediately see the sensitivity of the elements of x to the elements of b