Special Topics in Learning

Special Topics in Learning. Reinaldo Bianchi, Centro Universitário da FEI, 2012. Lecture 2, Part A. Objectives of this lecture: present the concepts of Statistical Machine Learning; continue with regression; validation and selection methods.

  1. Special Topics in Learning. Reinaldo Bianchi, Centro Universitário da FEI, 2012

  2. Lecture 2, Part A

  3. Objectives of this lecture • Present the concepts of Statistical Machine Learning. • Continuation of regression. • Validation and selection methods. • Today's lecture: • Chapters 3 and 7 of Hastie. • Wikipedia and Matlab Help

  4. Last lecture • We saw: • Machine learning concepts. • Statistical Machine Learning: • Prediction, regression and classification • The Least Mean Squares and Nearest Neighbour methods

  5. Variable Types and Terminology • In the statistical literature the inputs are often called the predictors and, more classically, the independent variables. • In the pattern recognition literature the term features is preferred, which we use as well. • The outputs are called the responses, or classically the dependent variables.

  6. Naming convention for the prediction task • The distinction in output type has led to a naming convention for the prediction tasks: • Regression when we predict quantitative outputs. • Classification when we predict qualitative outputs. • Both can be viewed as a task in function approximation.

  7. Examples of SML problems • Prostate cancer: study by Stamey et al. (1989) that examined the correlation between the level of prostate specific antigen (PSA) and a number of clinical measures. The goal is to predict the log of PSA (lpsa) from a number of measurements. • Regression problem

  8. Examples of supervised learning problems • Classification problem

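  9. Linear Models and Least Squares The linear model has been a mainstay of statistics for the past 30 years and remains one of its most important tools. Given a vector of inputs $X^T = (X_1, X_2, \ldots, X_p)$, we predict the output Y via the model: $\hat{Y} = \hat{\beta}_0 + \sum_{j=1}^{p} X_j \hat{\beta}_j$

  10. Linear Models The term $\hat{\beta}_0$ is the intercept, also known as the bias in machine learning. Often it is convenient to include the constant variable 1 in $X$, include $\hat{\beta}_0$ in the vector of coefficients $\hat{\beta}$, and then write the linear model in vector form as an inner product: $\hat{Y} = X^T\hat{\beta}$

  11. Fitting the data: Least Squares • How do we fit the linear model to a set of training data? • By far the most popular method is least squares. • Pick the coefficients β to minimize the Residual Sum of Squares: $\mathrm{RSS}(\beta) = \sum_{i=1}^{N} (y_i - x_i^T\beta)^2$

  12. Fitting the data: Least Squares • Assuming that X has full column rank, we set the first derivative to zero: $\mathbf{X}^T(\mathbf{y} - \mathbf{X}\beta) = 0$ • If $\mathbf{X}^T\mathbf{X}$ is nonsingular (invertible), then the unique solution is given by: $\hat{\beta} = (\mathbf{X}^T\mathbf{X})^{-1}\mathbf{X}^T\mathbf{y}$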
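The step from the residual sum of squares to the closed-form solution can be written out in matrix notation. This is a sketch that assumes X is the N × (p+1) matrix of training inputs (one row per example, including the constant 1) and y is the vector of training outputs:

```latex
\begin{aligned}
\mathrm{RSS}(\beta) &= (\mathbf{y} - \mathbf{X}\beta)^T(\mathbf{y} - \mathbf{X}\beta) \\
\frac{\partial\,\mathrm{RSS}}{\partial \beta} &= -2\,\mathbf{X}^T(\mathbf{y} - \mathbf{X}\beta) = 0 \\
\mathbf{X}^T\mathbf{X}\,\beta &= \mathbf{X}^T\mathbf{y} \\
\hat{\beta} &= (\mathbf{X}^T\mathbf{X})^{-1}\mathbf{X}^T\mathbf{y}
\end{aligned}
```

The third line is the system usually called the normal equations; the last line requires the inverse of X^T X to exist, which is where the nonsingularity assumption enters.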

  13. Example: height x shoe size • We wanted to explore the relationship between a person's height and their shoe size. • We asked individuals for their height and corresponding shoe size. • We believe that a person's shoe size depends upon their height. • The height is the independent variable, x. • Shoe size is the dependent variable, y.

  14. Scatter Plot with Trend Line

  15. Linear Models and Least Squares: Regression Using the learned parameters β one can compute new outputs via regression. At an arbitrary input $x_0$ the prediction is: $\hat{Y}(x_0) = x_0^T\hat{\beta}$ Intuitively, it seems that we do not need a very large data set to fit such a model.
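A minimal Python sketch of the fit-then-predict cycle may help here. It solves the normal equations with a small hand-written Gaussian elimination so that it needs no external library. The function names (`solve_linear_system`, `fit_least_squares`, `predict`) and the noise-free data generated from y = 1 + 2x are assumptions made for illustration; they do not come from the slides.

```python
def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix [a | b], copied so the inputs are left untouched.
    m = [list(map(float, row)) + [float(rhs)] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_least_squares(rows, y):
    """Return beta solving the normal equations (X^T X) beta = X^T y."""
    p = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(p)]
    return solve_linear_system(xtx, xty)


def predict(beta, x0):
    """Prediction at a new input x0 (which already includes the constant 1)."""
    return sum(b * v for b, v in zip(beta, x0))


# Synthetic, noise-free data generated from y = 1 + 2x (not from the slides).
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
X = [[1.0, x] for x in xs]          # constant column plus the input
y = [1.0 + 2.0 * x for x in xs]

beta = fit_least_squares(X, y)      # [intercept, slope]
y_new = predict(beta, [1.0, 5.0])   # prediction at x0 = 5
print(beta, y_new)
```

Because the data lie exactly on a line, the fitted coefficients recover the intercept and slope used to generate them, and the prediction at x0 = 5 is the inner product of `[1, 5]` with those coefficients.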

  16. Example Height x Shoe Size • Thus if a person is 5 feet tall (i.e. x=60 inches), then I would estimate their shoe size to be:

  17. Regression using LMS

  18. Other Linear methods From Matlab Help: http://www.mathworks.com/help/toolbox/curvefit/bq_5ka6-1.html

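  19. Linear methods can approximate polynomials • Linear methods can also fit polynomial curves, because the model remains linear in its coefficients: $y = \beta_0 + \beta_1 x + \beta_2 x^2$, where $\beta_0$ = intercept, $\beta_1$ = linear coefficient, $\beta_2$ = quadratic coefficient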

  20. Dataset: US Census x = 1900 1910 1920 1930 1940 1950 1960 1970 1980 1990 2000 y = 75.9950 91.9720 105.7110 123.2030 131.6690 150.6970 179.3230 203.2120 226.5050 249.6330 281.4220

  21. Dataset: US Census Try to predict the US population in the year 2010

  22. Linear x = (1900:10:2000)'; y = [75.995 91.972 105.711 123.203 131.669 150.697 179.323 203.212 226.505 249.633 281.422]'; one = ones(11,1); X = [one, x]; v = (X'*X)\(X'*y)

  23. Linear • Result: v = [-3783.9; 2.0253] • plot(x, y, 'x', x, v(1)+x*v(2))
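For readers without Matlab, the same straight-line fit can be sketched in plain Python. With a single predictor plus an intercept, the normal equations reduce to the familiar closed form slope = S_xy / S_xx. The variable names below are illustrative choices, not part of the slides; the data are the census figures from slide 20 (population in millions).

```python
years = list(range(1900, 2001, 10))
population = [75.995, 91.972, 105.711, 123.203, 131.669, 150.697,
              179.323, 203.212, 226.505, 249.633, 281.422]  # millions

n = len(years)
mean_x = sum(years) / n
mean_y = sum(population) / n

# Closed-form least squares for one predictor plus an intercept.
s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, population))
s_xx = sum((x - mean_x) ** 2 for x in years)
slope = s_xy / s_xx
intercept = mean_y - slope * mean_x

# Extrapolate the fitted line to the year 2010.
prediction_2010 = intercept + slope * 2010
print(f"intercept={intercept:.1f} slope={slope:.4f} "
      f"2010 estimate={prediction_2010:.1f} million")
```

Working with deviations from the means avoids forming the large sums of squared years that appear in X'X, which keeps the arithmetic well behaved.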

  24. Quadratic • x.*x multiplies the elements one by one (element-wise). x = (1900:10:2000)'; y = [75.995 91.972 105.711 123.203 131.669 150.697 179.323 203.212 226.505 249.633 281.422]'; one = ones(11,1); X = [one, x, x.*x]; w = (X'*X)\(X'*y)

  25. Dataset: US Census x = 1900 1910 1920 1930 1940 1950 1960 1970 1980 1990 2000 y = 75.9950 91.9720 105.7110 123.2030 131.6690 150.6970 179.3230 203.2120 226.5050 249.6330 281.4220 x.*x = 3610000 3648100 3686400 3724900 3763600 3802500 3841600 3880900 3920400 3960100 4000000

  26. Quadratic • Result: w = [3.2294E4; -34.985; 0.0095] • plot(x, y, 'x', x, v(1)+x*v(2), x, w(1)+x*w(2)+x.*x*w(3))

  27. Quadratic • Estimates for 2010: linear 286.9 million, quadratic 311.6 million; actual census result 308,745,538. • w = [3.2294E4; -34.985; 0.0095] • plot(x, y, 'x', x, v(1)+x*v(2), x, w(1)+x*w(2)+x.*x*w(3))
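The quadratic fit can be sketched in plain Python as well. One deliberate departure from the Matlab code: the years are centred and rescaled to t = (year - 1950) / 10 before the normal equations are built, because squaring raw years such as 2000 makes X'X badly conditioned for a hand-written solver. That rescaling, and all function and variable names, are assumptions of this sketch rather than something the slides do.

```python
def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [list(map(float, row)) + [float(rhs)] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


years = list(range(1900, 2001, 10))
population = [75.995, 91.972, 105.711, 123.203, 131.669, 150.697,
              179.323, 203.212, 226.505, 249.633, 281.422]  # millions

# Centre and rescale the predictor: t runs from -5 to 5.
t = [(year - 1950) / 10 for year in years]
rows = [[1.0, ti, ti * ti] for ti in t]   # columns: 1, t, t^2

# Normal equations (X^T X) coef = X^T y for the quadratic model.
xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
xty = [sum(r[i] * yi for r, yi in zip(rows, population)) for i in range(3)]
a, b, c = solve_linear_system(xtx, xty)

t_2010 = (2010 - 1950) / 10
quadratic_2010 = a + b * t_2010 + c * t_2010 ** 2
quadratic_coefficient_per_year2 = c / 100   # back on the original year scale
print(f"2010 estimate (quadratic): {quadratic_2010:.1f} million")
```

The model is still linear in the coefficients, so the same least-squares machinery applies; only the design matrix gains a squared column. Dividing `c` by 100 converts the curvature back to the original year scale for comparison with the third entry of w on the slide.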

  28. Other Linear methods • There are other linear methods based on LMS: • Weighted linear least squares • Robust least squares: • Least absolute residuals (LAR). • Bisquare weights. • And there are nonlinear methods…

  29. Weighted linear least squares • It is usually assumed that the response data is of equal quality and, therefore, has constant variance. • If this assumption is violated, your fit might be unduly influenced by data of poor quality. • To improve the fit, you can use weighted least-squares regression, in which an additional scale factor (the weight) is included in the fitting process.

  30. Weighted linear least squares Weighted least-squares regression minimizes the error estimate: $S = \sum_{i=1}^{n} w_i (y_i - \hat{y}_i)^2$, where $w_i$ are the weights. The weights determine how much each response value influences the final parameter estimates.

  31. Weighted linear least squares Weighting your data is recommended if the weights are known, or if there is justification that they follow a particular form. The weights modify the expression for the parameter estimates b: $b = (X^T W X)^{-1} X^T W y$

  32. Weighted linear least squares The weights you supply should transform the response variances to a constant value. If you know the variances of the measurement errors in your data, then the weights are given by: $w_i = 1/\sigma_i^2$

  33. Example: height x shoe size The equation becomes: v = (X'*([W.*X(:,1), W.*X(:,2)]))\(X'*(W.*y)) • W is the weight vector: W = whatever you choose…

  34. If wi = 1, the result is the same as LMS
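The effect of the weights can be illustrated with a small Python sketch for a straight-line model. It solves the 2 × 2 weighted normal equations directly. The function name `weighted_line_fit` and the synthetic data (a line y = 1 + 2x with one deliberately corrupted value) are assumptions for illustration only; they are not the height and shoe-size measurements from the lecture.

```python
def weighted_line_fit(xs, ys, ws):
    """Minimise sum_i w_i * (y_i - (b0 + b1 * x_i))**2 for a straight line."""
    sw = sum(ws)
    swx = sum(w * x for w, x in zip(ws, xs))
    swy = sum(w * y for w, y in zip(ws, ys))
    swxx = sum(w * x * x for w, x in zip(ws, xs))
    swxy = sum(w * x * y for w, x, y in zip(ws, xs, ys))
    # Solve the 2x2 weighted normal equations (X^T W X) b = X^T W y.
    det = sw * swxx - swx * swx
    b1 = (sw * swxy - swx * swy) / det
    b0 = (swy - b1 * swx) / sw
    return b0, b1


# Synthetic data (not from the slides): y = 1 + 2x with one corrupted point.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 20.0]   # the last value would be 9.0 on the line

# Unit weights reproduce ordinary least squares.
unweighted = weighted_line_fit(xs, ys, [1.0] * 5)
# A zero weight removes the suspect point from the fit entirely.
downweighted = weighted_line_fit(xs, ys, [1.0, 1.0, 1.0, 1.0, 0.0])
print(unweighted, downweighted)
```

With all weights equal to 1 the corrupted point drags the line away from the other four observations; giving it weight 0 recovers the line through the remaining points. Intermediate weights, such as the inverse variances mentioned above, fall between these two extremes.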

  35. If wi = selective…

  36. If W = abs(1./(y-X*v))

  37. All together now wi=ones(10,1) wi=selective

  38. Robust Least Squares • It is usually assumed that the response errors follow a normal distribution, and that extreme values are rare. Still, extreme values called outliers do occur. • The main disadvantage of least-squares fitting is its sensitivity to outliers. • Outliers have a large influence on the fit because squaring the residuals magnifies the effects of these extreme data points.

  39. Outliers (wikipedia) • In statistics, an outlier is an observation that is numerically distant from the rest of the data. • Grubbs defined an outlier as: • An outlying observation, or outlier, is one that appears to deviate markedly from other members of the sample in which it occurs.

  40. Outliers – Causes (wikipedia) • Outliers arise due to changes in system behaviour, fraudulent behaviour, human error, instrument error or simply through natural deviations in populations. • A physical apparatus for taking measurements may suffer a transient malfunction. • Error in data transmission or transcription.

  41. Outliers – Causes (wikipedia) • Outliers arise due to changes in system behaviour, fraudulent behaviour, human error, instrument error or simply through natural deviations in populations. • A sample may have been contaminated with elements from outside the population being examined. • Alternatively, an outlier could be the result of a flaw in the assumed theory, calling for further investigation by the researcher.

  42. Outliers

  43. Outliers - CAUTION Unless it can be ascertained that the deviation is not significant, it is ill-advised to ignore the presence of outliers. Outliers that cannot be readily explained demand special attention. NEVER DISREGARD A DATA POINT!

  44. Outliers

  45. Robust Least Squares • To minimize the influence of outliers, you can fit your data using two robust regression methods: • Least absolute residuals (LAR): finds a curve that minimizes the absolute difference of the residuals. • Bisquare weights: minimizes a weighted sum of squares, where the weight given to each data point depends on how far the point is from the fitted line.
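The bisquare idea can be sketched as iteratively reweighted least squares: fit, measure how far each point lies from the line, shrink the weights of distant points, and refit. The Python below is a simplified illustration, not the exact Matlab procedure: it does not adjust residuals for leverage, the tuning constant 4.685 and the scale estimate (median absolute residual divided by 0.6745) are conventional choices assumed here, and the function names and synthetic data are hypothetical.

```python
from statistics import median


def weighted_line_fit(xs, ys, ws):
    """Minimise sum_i w_i * (y_i - (b0 + b1 * x_i))**2 for a straight line."""
    sw = sum(ws)
    swx = sum(w * x for w, x in zip(ws, xs))
    swy = sum(w * y for w, y in zip(ws, ys))
    swxx = sum(w * x * x for w, x in zip(ws, xs))
    swxy = sum(w * x * y for w, x, y in zip(ws, xs, ys))
    det = sw * swxx - swx * swx
    b1 = (sw * swxy - swx * swy) / det
    b0 = (swy - b1 * swx) / sw
    return b0, b1


def bisquare_line_fit(xs, ys, tuning=4.685, max_iter=50, tol=1e-10):
    """Iteratively reweighted least squares with bisquare weights.

    Simplification: residuals are not adjusted for leverage.
    """
    b0, b1 = weighted_line_fit(xs, ys, [1.0] * len(xs))  # ordinary LS start
    for _ in range(max_iter):
        residuals = [y - (b0 + b1 * x) for x, y in zip(xs, ys)]
        # Robust scale estimate from the median absolute residual.
        scale = median(abs(r) for r in residuals) / 0.6745
        if scale < 1e-12:   # most points already fit (almost) exactly
            break
        weights = []
        for r in residuals:
            u = r / (tuning * scale)
            # Points beyond the cutoff get weight 0, nearby points close to 1.
            weights.append((1 - u * u) ** 2 if abs(u) < 1 else 0.0)
        new_b0, new_b1 = weighted_line_fit(xs, ys, weights)
        converged = abs(new_b0 - b0) < tol and abs(new_b1 - b1) < tol
        b0, b1 = new_b0, new_b1
        if converged:
            break
    return b0, b1


# Synthetic data (not from the slides): y = 1 + 2x with small alternating
# noise and one gross outlier at x = 5.
xs = [float(i) for i in range(11)]
ys = [1.0 + 2.0 * x + (0.1 if i % 2 == 0 else -0.1) for i, x in enumerate(xs)]
ys[5] += 30.0

ols = weighted_line_fit(xs, ys, [1.0] * len(xs))
robust = bisquare_line_fit(xs, ys)
print("ordinary:", ols, "bisquare:", robust)
```

In this example the outlier shifts the ordinary least-squares intercept by several units, while the reweighted fit stays close to the line that generated the remaining points. As the caution slide above says, a point that receives zero weight still deserves an explanation; the sketch only limits its influence on the fit.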

  46. Robust Least Squares

  47. Nonlinear Least Squares • The nonlinear least-squares formulation can be used to fit a nonlinear model to data. • A nonlinear model is defined as an equation that is nonlinear in the coefficients, or a combination of linear and nonlinear in the coefficients. • For example, Gaussians, ratios of polynomials, and power functions are all nonlinear.

  48. Nonlinear Least Squares The basis of the method is to approximate the model by a linear one and to refine the parameters by successive iterations. The derivatives are functions of both the independent variable and the parameters, so these gradient equations do not have a closed-form solution.
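One common way to realise "approximate by a linear model and refine by successive iterations" is the Gauss-Newton method. The Python sketch below fits the exponential model y = a·exp(b·x), chosen here only as an example of a model that is nonlinear in its coefficients. The step-halving safeguard, the starting guess, the function name and the noise-free synthetic data are all assumptions of this illustration; the slides do not prescribe a particular algorithm.

```python
import math


def gauss_newton_exp(xs, ys, a, b, max_iter=100, tol=1e-12):
    """Fit y = a * exp(b * x) by damped Gauss-Newton iterations."""
    def rss(a_, b_):
        return sum((y - a_ * math.exp(b_ * x)) ** 2 for x, y in zip(xs, ys))

    for _ in range(max_iter):
        # Linearise the model around the current (a, b).
        e = [math.exp(b * x) for x in xs]
        r = [y - a * ei for y, ei in zip(ys, e)]      # current residuals
        ja = e                                        # d f / d a
        jb = [a * x * ei for x, ei in zip(xs, e)]     # d f / d b
        # 2x2 normal equations (J^T J) delta = J^T r of the linearised problem.
        saa = sum(v * v for v in ja)
        sab = sum(u * v for u, v in zip(ja, jb))
        sbb = sum(v * v for v in jb)
        ga = sum(u * v for u, v in zip(ja, r))
        gb = sum(u * v for u, v in zip(jb, r))
        det = saa * sbb - sab * sab
        da = (ga * sbb - sab * gb) / det
        db = (saa * gb - sab * ga) / det
        # Halve the step until the residual sum of squares does not increase.
        step, current = 1.0, rss(a, b)
        while rss(a + step * da, b + step * db) > current and step > 1e-8:
            step /= 2
        a, b = a + step * da, b + step * db
        if abs(step * da) < tol and abs(step * db) < tol:
            break
    return a, b


# Synthetic, noise-free data from y = 2 * exp(0.5 * x) (not from the slides).
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [2.0 * math.exp(0.5 * x) for x in xs]

a_hat, b_hat = gauss_newton_exp(xs, ys, a=1.5, b=0.4)   # rough starting guess
print(a_hat, b_hat)
```

Each iteration solves an ordinary linear least-squares problem for the correction (da, db), which is why the derivatives with respect to the parameters must be recomputed at every step. Unlike the linear case, the result can depend on the starting guess.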
