1 / 15

MetaNeural Data Mining with Neural Networks

Explore the Delta Rule, Backpropagation, and MetaNeural format for predictive data mining with neural networks. Analyze Iris and Magnetocardiogram data using weights mapping inputs to outputs. Learn about neural networks as a collection of McCulloch-Pitts neurons, with terminology and data preparation techniques. Utilize the built-in neural network modules in Analyze Code. Generate and scale data for effective training and validation. Follow the guidelines for MetaNeural Input File to enhance your data mining process.

Download Presentation

MetaNeural Data Mining with Neural Networks

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Neural networks – Hands on • Delta rule and Backpropagation algorithm • MetaNeural format for predictive data mining • Iris Data • Magnetocardiogram data

  2. Neural net yields weights to map inputs to outputs Neural Network  Molecular weight w11 h w11   Boiling Point H-bonding   Biological response Hydrofobicity  h Electrostatic interactions w23  w34 Observable Projection Molecular Descriptor There are many algorithms that can determine the weights for ANNs RENSSELAER

  3. x 1 w 1 w 2 S f() y w 3 x 3 w N x N McCulloch-Pitts neuron RENSSELAER

  4. 1 w 2 S w x Output f() 11 11 1 neuron 1 w 3 S w f() 12 11 y 1 w 13 S S f() x f() 2 1 w 22 S 3 w f() 1 21 w 23 S 2 w f() 32 Second hidden layer First hidden layer Neural network as collection of M-P neurons RENSSELAER

  5. Standard Data Mining Terminology • Basic Terminology • - MetaNeural Format • - Descriptors, features, response (or activity) and ID • - Classification versus regression • - Modeling/Feature detection • - Training/Validation/Calibration • - Vertical and horizontal view of data • Outliers, rare events and minority classes • Data Preparation • - Data cleansing • - Scaling • Leave-one-out and leave-several-out validation • Confusion matrix and ROC curves

  6. Standard Data Mining Terminology • Basic Terminology • - MetaNeural Format • - Descriptors, features, response (or activity) and ID • - Classification versus regression • - Modeling/Feature detection • - Training/Validation/Calibration • - Vertical and horizontal view of data • Outliers, rare events and minority classes • Data Preparation • - Data cleansing • - Scaling • Leave-one-out and leave-several-out validation • Confusion matrix and ROC curves

  7. TERMINOLOGY • Standard Data Mining Problem • Header and Data • MetaNeural Format • - descriptors and/or features • - response (or activity to predict) • - pattern ID • - data matrix • Validation/Calibration • Training/Validation/Test Set Demo: iris_view.bat

  8. UC URVINE DATA REPOSITORY Datafile Name: Fisher's Iris Datafile Subjects: Agriculture , Famous datasets Description: This is a dataset made famous by Fisher, who used it to illustrate principles of discriminant analysis. It contains 6 variables with 150 observations. Reference: Fisher, R. A. (1936). The Use of Multiple Measurements in Axonomic Problems. Annals of Eugenics 7, 179-188. Story Names: Fisher's Irises Authorization: free use Number of cases: 150 Variable Names: 1.Species_No: Flower species as a code 2.Species_Name: Species name 3.Petal_Width: Petal Width 4.Petal_Length: Petal Length 5.Sepal_Width: Sepal Width 6.Sepal_Length: Sepal Length

  9. S S S S S • ANALYZE code has neural networks modules built-in • Either run: analyze root.pat 4331 (single training and testing) analyze root.pat 4332 (LOO) analyze root.txt 4333 (bootstrap mode) • Results for analyze are in resultss.xxx and resultss.ttt • Note that patterns have to be properly scaled first • The file name meta overrides the default input file for analyze

  10. Neural Network Module in Analyze Code ROOT ROOT.PAT ROOT.TES (ROOT.WGT) (ROOT.FWT) (ROOT.DBD) • Use Analyze root 4331 for easy way • (the file meta let you override defaults) Analyze resultss.XXX resultss.TTT ROOT.TRN (ROOT.DBD) ROOT.WGT ROOT.FWT

  11. MetaNeural Input File for the ROOT Generating and Scaling Data 4 => 4 layers 2 => 2 inputs 16 => # hidden neurons in layer #1 4 => # hidden neurons in layer# 2 1 => # outputs 300 => epoch length (hint:always use 1, for the entire batch) 0.01 => learning parameters by weight layer (hint: 1/# patterns or 1/# epochs) 0.01 0.01 0.5 => momentum parameters by weight layer (hint use 0.5) 0.5 0.5 10000000 => some very large number of training epochs 200 => error display refresh rate 1 =>sigmoid transfer function 1 => Temperature of sigmoid check.pat => name of file with training patterns (test patterns in root.tes) 0 => not used (legacy entry) 100 => not used (legacy entry) 0.02000 => exit training if error < 0.02 0 => initial weights from a flat random distribution 0.2 => initial random weights all fall between –2 and +2

  12. Generating and Scaling Iris Data

  13. Run Neural Net for Iris Data

More Related