Tópicos Especiais em Aprendizagem Reinaldo Bianchi Centro Universitário da FEI 2010
Lecture 3, Part A
Goals of this lecture • Present two more Statistical Machine Learning techniques: • PCA. • LDA and MLDA. • Today's lecture: • Chapters 3 and 4 of Hastie. • A Tutorial on Principal Components Analysis, Lindsay I. Smith. • Wikipedia.
Introduction • PCA was introduced by Karl Pearson in 1901.
What is PCA? • Principal Component Analysis, or simply PCA, is a multivariate statistical procedure for feature extraction, concerned with explaining the covariance structure of a set of variables through a small number of linear combinations of these variables.
PCA Principal component analysis (PCA) involves a mathematical procedure that transforms a number of possibly correlated variables into a smaller number of uncorrelated variables called principal components.
PCA The first principal component accounts for as much of the variability in the data as possible. Each succeeding component accounts for as much of the remaining variability as possible.
PCA's general objectives are: • Data reduction • Feature selection
Introduction • In algebraic terms, principal components are particular linear combinations of the original variables that seek a projection that best represents the data in a least-squares sense.
Introduction • Geometrically, these linear combinations represent the selection of a new coordinate system obtained by rotating the original one. • The new axes represent the directions of maximum variability of the sample data.
Geometric Idea [Figure] The length of each principal axis φi is proportional to √λi.
An Estimate of the Variance • The mean square error (MSE) provides the estimate of σ², and the notation s² is also used.
Standard Deviation • An Estimate of the Standard Deviation • To estimate σ we take the square root of s². • The resulting s is called the standard error of the estimate.
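These two estimates are easy to reproduce; below is a minimal numpy sketch (the data values are invented for illustration):

```python
import numpy as np

x = np.array([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])

# Sample variance s^2, using the unbiased (n - 1) denominator.
s2 = np.var(x, ddof=1)

# The estimate s of the standard deviation is the square root of s^2.
s = np.sqrt(s2)

print(s2, s)  # 4.571... 2.138...
```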
Covariance • The last two measures we have looked at are purely 1-dimensional. • Data sets like this could be: heights of all the people in the room, marks for the last exam, etc. • However, many data sets have more than one dimension, and the aim of the statistical analysis of these data sets is usually to see if there is any relationship between the dimensions.
Covariance For example, we might have as our data set both the height of all the students in a class, and the mark they received for that paper. We could then perform statistical analysis to see if the height of a student has any effect on their mark.
Covariance • Standard deviation and variance only operate on 1 dimension: • You could only calculate the standard deviation for each dimension of the data set independently of the other dimensions. • However, it is useful to have a similar measure to find out how much the dimensions vary from the mean with respect to each other.
Covariance • Covariance is such a measure. • Covariance is always measured between 2 dimensions. • If you calculate the covariance between one dimension and itself, you get the variance. • So, if you had a 3-dimensional data set then you could measure the covariance between all the dimensions.
The formula for covariance The formula for covariance is:

$$\mathrm{cov}(X, Y) = \frac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}{n - 1}$$

Note that it is similar to the variance:

$$\mathrm{var}(X) = \frac{\sum_{i=1}^{n} (X_i - \bar{X})(X_i - \bar{X})}{n - 1}$$
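The formula translates directly to code; here is a hedged numpy sketch, with invented height/mark numbers echoing the student example above:

```python
import numpy as np

# Invented data: height (cm) and exam mark for 5 students.
height = np.array([160.0, 165.0, 170.0, 175.0, 180.0])
mark   = np.array([ 60.0,  62.0,  68.0,  65.0,  75.0])

# Covariance straight from the formula: sum of products of deviations, over (n - 1).
n = len(height)
cov_hm = np.sum((height - height.mean()) * (mark - mark.mean())) / (n - 1)

# np.cov computes the same value (the off-diagonal entry of the 2x2 matrix).
print(cov_hm, np.cov(height, mark)[0, 1])  # both 41.25
```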
The covariance matrix A useful way to get all the possible covariance values between all the different dimensions is to calculate them all and put them in a matrix.
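A small numpy sketch of a covariance matrix for a 3-dimensional data set (the random data is purely illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))  # 100 observations of a 3-dimensional variable

# Variables are in columns here, so pass rowvar=False.
C = np.cov(X, rowvar=False)

print(C.shape)               # (3, 3): one entry per pair of dimensions
print(np.allclose(C, C.T))   # True: the covariance matrix is symmetric
```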
Eigenvalues and Eigenvectors (Autovalores e Autovetores) A brief review of Linear Algebra
Definition: eigenvectors/eigenvalues • Let λ be an eigenvalue of A. Then there exists a vector x such that Ax = λx. • The vector x is called an eigenvector of A associated with the eigenvalue λ. Ordinarily we normalise x so that it has length one, that is, xᵀx = 1.
Eigenvalues and eigenvectors In general, a matrix acts on a vector by changing both its magnitude and its direction. However, a matrix may act on certain vectors by changing only their magnitude, and leaving their direction unchanged (or possibly reversing it).
Eigenvalues and eigenvectors • These vectors are the eigenvectors of the matrix. • A matrix acts on an eigenvector by multiplying its magnitude by a factor, which is positive if its direction is unchanged and negative if its direction is reversed. • This factor is the eigenvalue associated with that eigenvector.
Eigenvalues and eigenvectors [Figure: eigenvectors of a linear transformation.]
Eigenvalues and eigenvectors [Figure] The matrix A acting on an eigenvector x: Ax = λx.
Eigenvalues and eigenvectors [Figure] Q: Which one is the eigenvector, the blue or the red arrow?
Eigenvalues and eigenvectors A: The red one, with eigenvalue = 1 (there is no scaling).
Computing the eigenvalues and eigenvectors • Let A be an n×n matrix. • The eigenvalues of A are defined as the roots of $\det(A - \lambda I) = 0$, where I is the n×n identity matrix. • This equation is called the characteristic equation and has n roots.
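As a numerical aside (a sketch, using the 2×2 matrix of the worked example that follows), numpy can build the characteristic polynomial of a matrix and solve for its roots:

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])

# Coefficients of det(A - lambda*I): here lambda^2 - 4*lambda + 3.
coeffs = np.poly(A)

# The roots of the characteristic polynomial are the eigenvalues.
print(np.roots(coeffs))  # [3. 1.]
```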
Example of Computation For example, compute the eigenvectors and eigenvalues for the matrix:

$$A = \begin{pmatrix} 2 & 1 \\ 1 & 2 \end{pmatrix}$$

We have to use: $\det(A - \lambda I) = 0$.
Example of Computation The characteristic equation is:

$$\det(A - \lambda I) = \begin{vmatrix} 2 - \lambda & 1 \\ 1 & 2 - \lambda \end{vmatrix} = 0$$

which gives the following:

$$(2 - \lambda)^2 - 1 = \lambda^2 - 4\lambda + 3 = (\lambda - 1)(\lambda - 3) = 0$$
Example of Computation Eigenvalues! • The roots of this equation (the values of λ for which the equation holds) are: • λ = 1 and λ = 3 • Having found the eigenvalues, it is possible to find the eigenvectors: • There is one for every eigenvalue.
Example of Computation • Considering first the eigenvalue λ = 3, we have:

$$(A - 3I)\begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} -1 & 1 \\ 1 & -1 \end{pmatrix}\begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix}$$

• After matrix multiplication, this gives the following equations: −x + y = 0 and x − y = 0.
Example of Computation • Both equations reduce to the single linear equation x = y. • To find an eigenvector, we are free to choose any value for x (except 0). • By picking x = 1 and setting y = x, we find an eigenvector with eigenvalue 3 to be $\begin{pmatrix} 1 \\ 1 \end{pmatrix}$.
Testing the result We can confirm this is an eigenvector with eigenvalue 3 by checking the action of the matrix on this vector:

$$\begin{pmatrix} 2 & 1 \\ 1 & 2 \end{pmatrix}\begin{pmatrix} 1 \\ 1 \end{pmatrix} = \begin{pmatrix} 3 \\ 3 \end{pmatrix} = 3\begin{pmatrix} 1 \\ 1 \end{pmatrix}$$
Example of Computation • Considering the eigenvalue λ = 1, we have:

$$(A - I)\begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix}\begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix}$$

• This gives the equations x + y = 0 and x + y = 0. • Both equations reduce to x = −y, yielding the eigenvector $\begin{pmatrix} 1 \\ -1 \end{pmatrix}$.
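The whole worked example can be checked with numpy's eigensolver (a sketch; numpy normalises eigenvectors to unit length, so the columns below are proportional to the (1, 1) and (1, −1) found above):

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])

# Eigenvalues and eigenvectors (one eigenvector per column).
eigenvalues, eigenvectors = np.linalg.eig(A)
print(eigenvalues)   # [3. 1.]
print(eigenvectors)  # columns proportional to (1, 1) and (1, -1)

# Verify the defining property A x = lambda x for each pair.
for lam, x in zip(eigenvalues, eigenvectors.T):
    assert np.allclose(A @ x, lam * x)
```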
Properties of Eigenvalues and Eigenvectors • Eigenvectors can only be found for square matrices. • Not every square matrix has eigenvectors. • Given an n×n matrix that does have eigenvectors, there are n of them (counting repeated eigenvalues). • Given a 3×3 matrix, there are 3 eigenvectors.
Properties: Linear independence • All the eigenvectors of a symmetric matrix (such as a covariance matrix) are orthogonal, i.e., linearly independent: • They are at right angles to each other, no matter how many dimensions you have. • You can express the data in terms of these perpendicular eigenvectors, instead of expressing it in terms of the x and y axes.
Computing PCA using the covariance method (End of the review: "to remember is to live.")
The PCA Method
1. Get some data.
2. Subtract the mean.
3. Calculate the covariance matrix.
4. Calculate the eigenvectors and eigenvalues of the covariance matrix.
5. Choose components and form a feature vector.
6. Derive the new data set.
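A compact numpy sketch of these six steps; the function name pca, the toy data set, and the n_components parameter are illustrative choices, not part of the original slides:

```python
import numpy as np

def pca(data, n_components):
    """PCA by the covariance method; returns the projected data
    and the chosen eigenvectors (the 'feature vector')."""
    # Steps 1-2: get the data and subtract the mean of each dimension.
    centered = data - data.mean(axis=0)

    # Step 3: covariance matrix (observations in rows, dimensions in columns).
    cov = np.cov(centered, rowvar=False)

    # Step 4: eigenvectors/eigenvalues; eigh suits symmetric matrices.
    eigenvalues, eigenvectors = np.linalg.eigh(cov)

    # Step 5: keep the eigenvectors with the largest eigenvalues.
    order = np.argsort(eigenvalues)[::-1][:n_components]
    feature_vector = eigenvectors[:, order]

    # Step 6: derive the new data set by projecting onto the feature vector.
    return centered @ feature_vector, feature_vector

# Toy 2-dimensional data with strongly correlated dimensions.
rng = np.random.default_rng(1)
x = rng.normal(size=200)
data = np.column_stack([x, 0.5 * x + 0.1 * rng.normal(size=200)])

projected, components = pca(data, n_components=1)
print(projected.shape)  # (200, 1): the data expressed on its main direction
```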
Eigenvectors/values of covariance matrices • Let S be an n×n covariance matrix. • There is an orthogonal n×n matrix Φ whose columns are eigenvectors of S, and a diagonal matrix Λ whose diagonal elements are the eigenvalues of S, such that $S = \Phi \Lambda \Phi^{\mathsf{T}}$.
In other words… • The linear transformation given by Φ diagonalises S, creating in the new coordinate system a set of new variables that are uncorrelated! • This linear transformation essentially finds the principal components of the covariance structure.
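A short numerical illustration of this decorrelation (again with invented data): the covariance matrix of the transformed variables comes out diagonal, with the eigenvalues on its diagonal.

```python
import numpy as np

rng = np.random.default_rng(2)
x = rng.normal(size=500)
data = np.column_stack([x, 0.8 * x + 0.3 * rng.normal(size=500)])
centered = data - data.mean(axis=0)

S = np.cov(centered, rowvar=False)
eigenvalues, Phi = np.linalg.eigh(S)  # S = Phi Lambda Phi^T

# In the rotated coordinate system the variables are uncorrelated:
transformed = centered @ Phi
print(np.cov(transformed, rowvar=False).round(6))  # (numerically) diagonal
print(eigenvalues)                                 # the diagonal entries above
```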
Geometric Idea [Figure] The length of each principal axis φi is proportional to √λi.
In short… • Calculate the covariance matrix S:

$$S = \frac{1}{N - 1} \sum_{j=1}^{N} (x_j - \bar{x})(x_j - \bar{x})^{\mathsf{T}}$$

where xj is observation j and N is the number of observations. • Find the eigenvectors/eigenvalues (Φ, Λ) of $S = \Phi \Lambda \Phi^{\mathsf{T}}$. • Eigenvectors: main directions. • Eigenvalues: variances along the respective eigenvectors.