Dive into advanced statistical techniques such as Support Vector Machines, Multi-Dimensional Scaling, and Factor Analysis. Learn to apply Random Forest for data analytics in ITWS and CSCI.
Labs: (SVM, Multi-Dimensional Scaling, Dimension Reduction), Factor Analysis, Random Forest
Peter Fox
Data Analytics ITWS-4600/ITWS-6600/MATP-4450/CSCI-4960
Group 3 Lab 2, November 2, 2018
If you did not complete SVM
• group3/
• lab1_svm{1,11}.R
• lab1_svm{12,13}.R
• lab1_svm_rpart1.R
And MDS, DR
• lab1_mds{1,3}.R
• lab1_dr{1,4}.R
• http://www.statmethods.net/advstats/mds.html
• http://gastonsanchez.com/blog/how-to/2013/01/23/MDS-in-R.html
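For anyone catching up on MDS, the short sketch below shows the basic pattern: classical (metric) scaling of a distance matrix into two dimensions. It uses the built-in eurodist distances purely as an illustrative assumption; the lab1_mds scripts may use different data and different plotting choices.

```r
# Classical (metric) MDS on the built-in eurodist road distances.
# eurodist is chosen here only for illustration (an assumption, not the lab data).
fit <- cmdscale(eurodist, k = 2, eig = TRUE)  # k = number of dimensions to keep
coords <- fit$points                          # one row per city, two columns

# Plot the two-dimensional configuration, labelling each point with its city.
plot(coords[, 1], coords[, 2], type = "n",
     xlab = "Coordinate 1", ylab = "Coordinate 2",
     main = "Classical MDS of eurodist")
text(coords[, 1], coords[, 2], labels = rownames(coords), cex = 0.7)
```

Cities that are close by road should land close together in the plot; the orientation of the axes is arbitrary.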
randomForest
> library(e1071)
> library(rpart)
> library(mlbench) # etc.
> data(kyphosis)
> require(randomForest) # or library(randomForest)
> fitKF <- randomForest(Kyphosis ~ Age + Number + Start, data=kyphosis)
> print(fitKF) # view results
> importance(fitKF) # importance of each predictor
# what else can you do?
data(swiss) # fertility? lab2_rf1.R
data(Glass,package="mlbench") # Type ~ <what>?
data(Titanic) # Survived ~ .
Find - Mileage~Price + Country + Reliability + Type
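One possible answer to "what else can you do?" is sketched below: inspect the out-of-bag (OOB) confusion matrix, plot variable importance, and tabulate OOB predictions against the observed classes. The seed and ntree value are arbitrary assumptions made so the run is repeatable; they are not taken from the lab scripts.

```r
library(rpart)          # provides the kyphosis data
library(randomForest)

data(kyphosis)
set.seed(42)            # arbitrary seed (assumption) so results are repeatable

# Same model as on the slide, with permutation importance switched on.
fitKF <- randomForest(Kyphosis ~ Age + Number + Start, data = kyphosis,
                      ntree = 500, importance = TRUE)

print(fitKF$confusion)  # out-of-bag confusion matrix with per-class error
varImpPlot(fitKF)       # dot chart of predictor importance
plot(fitKF)             # error rates as a function of the number of trees

# predict() without newdata returns the out-of-bag class predictions.
oobPred <- predict(fitKF)
table(observed = kyphosis$Kyphosis, predicted = oobPred)
```

The same pattern should carry over to the other data sets listed on the slide once a response variable has been chosen.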
Try these
• example_exploratoryFactorAnalysis.R on dataset_exploratoryFactorAnalysis.csv (on website)
• http://rtutorialseries.blogspot.com/2011/10/r-tutorial-series-exploratory-factor.html (this was the example on courses in the lecture)
• http://www.statmethods.net/advstats/factor.html
• http://stats.stackexchange.com/questions/1576/what-are-the-differences-between-factor-analysis-and-principal-component-analysi
• Do these: lab2_fa{1,2,4,5}.R
Factor Analysis
library(psych) # irt.fa() comes from the psych package
data(iqitems) # data(ability)
ability.irt <- irt.fa(ability)
ability.scores <- score.irt(ability.irt,ability)
data(attitude)
cor(attitude)
# Compute eigenvalues and eigenvectors of the correlation matrix.
pfa.eigen <- eigen(cor(attitude))
pfa.eigen$values
# set a value for the number of factors (for clarity)
factors <- 2
# Extract and transform two components (eigenvectors scaled by sqrt of eigenvalues).
pfa.eigen$vectors[, 1:factors] %*%
  diag(sqrt(pfa.eigen$values[1:factors]), factors, factors)
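The eigen-decomposition above gives unrotated principal-component-style loadings (eigenvectors scaled by the square roots of their eigenvalues). As a hedged comparison, the sketch below fits a maximum-likelihood factor model to the same attitude data with factanal() from base R. The choice of two factors and varimax rotation is an assumption for illustration, and the object names fa2 and pcLoadings are made up here.

```r
data(attitude)   # built-in data: 30 observations of 7 survey variables

# Maximum-likelihood factor analysis, two factors, varimax rotation (assumed settings).
fa2 <- factanal(attitude, factors = 2, rotation = "varimax")
print(fa2$loadings, cutoff = 0.3)  # hide small loadings for readability
fa2$uniquenesses                   # variance of each variable not explained by the factors

# Unrotated loadings from the eigen-decomposition, for side-by-side comparison.
e <- eigen(cor(attitude))
pcLoadings <- e$vectors[, 1:2] %*% diag(sqrt(e$values[1:2]), 2, 2)
rownames(pcLoadings) <- colnames(attitude)
round(pcLoadings, 2)
```

The two sets of loadings will not match exactly: the factor model estimates unique variances separately and the rotation changes the axes, and signs of eigenvector-based loadings are arbitrary.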
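Glass
data(Glass, package="mlbench") # load the data if not already loaded
index <- 1:nrow(Glass)
testindex <- sample(index, trunc(length(index)/3))
testset <- Glass[testindex,]
trainset <- Glass[-testindex,]
cor(testset[, sapply(testset, is.numeric)]) # cor() needs numeric columns; Type is a factor
Factor Analysis?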
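A possible first step toward the factor analysis question is sketched below: look at the eigenvalues of the correlation matrix of the numeric Glass variables to get a rough idea of how many factors might be worth extracting. The seed, the use of the training set, and the "eigenvalue greater than 1" rule of thumb are all assumptions for illustration, not part of the lab instructions.

```r
data(Glass, package = "mlbench")
set.seed(1)                      # arbitrary seed (assumption) for a repeatable split

index <- 1:nrow(Glass)
testindex <- sample(index, trunc(length(index) / 3))
trainset <- Glass[-testindex, ]

# Keep only numeric predictors; Type is a factor and cannot go into cor().
numericCols <- sapply(trainset, is.numeric)
ev <- eigen(cor(trainset[, numericCols]))$values
ev

sum(ev > 1)                      # rule-of-thumb count of factors to consider

# Scree plot: look for the point where the curve flattens out.
plot(ev, type = "b", xlab = "Component", ylab = "Eigenvalue",
     main = "Scree plot, Glass training set")
```

The count from the rule of thumb is only a starting point; the scree plot and the interpretability of the resulting loadings should guide the final choice.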