COMP4332/RMBI4310
Neural Network (Keras)
Prepared by Raymond Wong
Presented by Raymond Wong
raywong@cse
• We have just finished describing the concepts related to classification
• We can now return to "Neural Network"
Neural Network • Neural Network Concept • Keras Tool
Keras Tool • Keras is a high-level Python library supporting data analytics tools (e.g., neural networks) • It is built on top of a well-established low-level library (e.g., TensorFlow or Theano)
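As a quick illustration (our addition, not from the original slides), you can ask Keras which low-level backend it is running on; this assumes the standalone "keras" package used in the rest of this deck:

Python
# a minimal check of the active low-level backend (our addition)
from keras import backend as K
print(K.backend())   # prints, e.g., "tensorflow" or "theano"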
Keras Tool • Keras was the first high-level library added to the core TensorFlow at Google. • Its incorporation into TensorFlow started in 2017. • Many practitioners in industry use Keras (on top of TensorFlow).
Keras Tool • Some people wrote the following. • “Using TensorFlow makes me feel like I'm not smart enough to use TensorFlow; • whereas using Keras makes me feel like neural networks are easier than I realized.”
Keras Tool • There are many other existing Python libraries for data analytics:
• TensorFlow (developed for "Google Brain")
• Theano (developed by the University of Montreal)
• SciKit-Learn (initiated by a student in "Google Summer of Code")
• Caffe (developed by UC Berkeley)
• PyTorch (developed by Facebook)
• Microsoft Cognitive Toolkit (developed by Microsoft)
• Apache MXNet (developed by the Apache Software Foundation)
Outline • Summary about Parameter Setting • First Keras Program • Enhanced Keras Program • Efficiency • How to Set Parameter Values
Preliminary Summary about Parameter Setting
• Neural Network Model Parameters
• No. of layers
• No. of neurons in each layer
• Connections between neurons from different layers
• Activation function (linear, rectifier, sigmoid, tanh)
• Optimization method (adam, SGD, rmsprop)
• Error function (binary cross entropy, mse, mae)
• Training (Time) Parameters
• No. of epochs (e.g., we could set "no. of epochs = 150" as a stopping condition)
• Batch size (e.g., we could set "batch size = 10")
Final Summary about Parameter Setting
• Neural Network Model Parameters
• No. of layers
• No. of neurons in each layer
• Connections between neurons from different layers
• Activation function (linear, rectifier, sigmoid, tanh)
• Optimization method (adam, SGD, rmsprop)
• Error function (binary cross entropy, mse, mae)
• Training (Time) Parameters
• No. of epochs (e.g., we could set "no. of epochs = 150" as a stopping condition)
• Batch size (e.g., we could set "batch size = 10")
• Evaluation
• Measurement (e.g., accuracy, or in short, "acc")
• Training/Validation/Test (e.g., the percentage of the data for the validation/test set)
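To preview how these parameters map onto Keras calls, here is a minimal sketch (our addition, not a slide from the deck; the full program appears in the next section). X and Y stand for the training inputs/targets loaded elsewhere:

Python
from keras.models import Sequential
from keras.layers import Dense

# model parameters: no. of layers, neurons per layer, connections, activation
model = Sequential()
model.add(Dense(12, input_dim=8, activation='relu'))   # hidden layer (rectifier)
model.add(Dense(1, activation='sigmoid'))              # output layer (sigmoid)

# optimization method, error function, and evaluation measurement
model.compile(loss="binary_crossentropy", optimizer="adam", metrics=["accuracy"])

# training parameters (no. of epochs, batch size) and validation split
model.fit(X, Y, validation_split=0.2, epochs=150, batch_size=10)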
Outline • Summary about Parameter Setting • First Keras Program • Enhanced Keras Program • Efficiency • How to Set Parameter Values
2. First Keras Program • Consider the following prediction task. • We are given a dataset containing 768 records with 8 input attributes and 1 target attribute • We want to build a neural network • We are also given another dataset containing 768 records with 8 input attributes (but without the target attribute). • Based on the neural network, we want to predict the target attribute of each record in the second dataset.
2. First Keras Program • Before we write our first Keras program, • we should have a design of our model (e.g., a neural network), and • we should know what parameters we need to set for the model
2. First Keras Program (Summary about Parameters)
(The parameter-setting summary slide is repeated here for reference; see the Final Summary above.)
2. First Keras Program • We define the model as follows with the activation functions specified.
2. First Keras Program
[Network diagram] A fully connected feed-forward network:
• Input layer: 8 inputs (x1, x2, x3, …, x8)
• Hidden layer 1: 12 neurons (N1,1, N1,2, N1,3, …, N1,12) with the rectifier function
• Hidden layer 2: 8 neurons (N2,1, N2,2, …, N2,8) with the rectifier function
• Output layer: 1 neuron (N3,1) with the sigmoid function, producing the output y1
Each layer is fully connected to the next.
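As a quick sanity check (our own arithmetic, not from the original slides), this architecture has (8 × 12 + 12) + (12 × 8 + 8) + (8 × 1 + 1) = 108 + 104 + 9 = 221 trainable parameters (weights plus a bias per neuron); calling model.summary() on the model defined in Step 2 below prints the same per-layer counts.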
2. First Keras Program • We set the optimization method to "adam". • We set the error function to "binary cross entropy".
2. First Keras Program (Summary about Parameters)
• Neural Network Model Parameters: Done!
• No. of layers, no. of neurons in each layer, and connections between layers: as in the diagram above
• Activation function: rectifier (hidden layers), sigmoid (output layer)
• Optimization method: adam
• Error function: binary cross entropy
• Training (Time) Parameters: Done!
• No. of epochs = 150 (as a stopping condition)
• Batch size = 10
• Evaluation: Done!
• Measurement: accuracy
• Training/Validation/Test: in this first program, we have the training dataset only.
2. First Keras Program • We are ready to write our first Keras program.
Raw Data → (Data Collection) → Collected Data → (Data Processing) → Processed Data → (Data Mining) → Data Mining Results → (Result Presenting) → Presentable Forms of Data Mining Results
Processed Data → (Data Mining) → Data Mining Results
• We have to define some "data mining" models to perform some "data mining" tasks
• We could call many existing libraries to complete these "data mining" tasks
Phase 1 (Model Training): Training/Validation/Test Data → Model (in memory)
Phase 2 (Model Storing): Model (in memory) → Model (on disk)
Phase 3 (Model Reading): Model (on disk) → Model (in memory)
Phase 4 (New Data Prediction): New Data + Model (in memory) → Predicted Result
Python
# required imports (not shown on the original slide; these assume the
# standalone Keras package used throughout this deck)
import numpy
from keras.models import Sequential, model_from_json
from keras.layers import Dense

trainingDataFilename = "Training-Dataset1-NoOfDim-8-Target-Binary.csv"
newInputAttributeDataFilename = "New-Dataset1-NoOfDim-8-Target-None.csv"
newTargetAttributeDataFilename = "New-Dataset1-Target-Output.csv"
modelFilenamePrefix = "neuralNetworkModel"

# Phase 1: to train the model
print("Phase 1: to train the model...")
model = trainModel(trainingDataFilename)

# Phase 2: to save the model to a file
print("Phase 2: to save the model to a file...")
saveModel(model, modelFilenamePrefix)

# Phase 3: to read the model from a file
print("Phase 3: to read the model from a file...")
model = readModel(modelFilenamePrefix)

# Phase 4: to predict the target attribute of a new dataset based on a model
print("Phase 4: to predict the target attribute of a new dataset based on a model...")
predictNewDatasetFromModel(newInputAttributeDataFilename, newTargetAttributeDataFilename, model)
(Pipeline overview repeated; next: Phase 1, Model Training.)
Phase 1 (Model Training): Training/Validation/Test Data → Model (in memory)
There are the following 5 steps:
• Step 1: to load the data (read the data; split it into the input attributes and the target attribute)
• Step 2: to define the model (define the "structure" of the model)
• Step 3: to compile the model (define how to update the parameters used in the "structure" of the model)
• Step 4: to fit the model (train the model with the given data)
• Step 5: to evaluate the model (evaluate the model on the data)
Python
# to train a model
def trainModel(trainingDataFilename):
    # to set the "fixed" seed of the random number generator used by the
    # "optimization" tool in the neural network model; fixing it lets us
    # reproduce the same output each time we execute this program.
    # In practice, you could set it to any number, or to the current time
    # (e.g., "numpy.random.seed(int(time.time()))")
    numpy.random.seed(11)

    # Step 1: to load the data
    print("  Step 1: to load the data...")
    # Step 1a: to read the dataset with a "numpy" function
    dataset = numpy.loadtxt(trainingDataFilename, delimiter=",")
    # Step 1b: to split the dataset into the input attributes and the target attribute
    X = dataset[:, 0:8]
    Y = dataset[:, 8]
Python
    # Step 2: to define the model
    print("  Step 2: to define the model...")
    model = Sequential()
    model.add(Dense(12, input_dim=8, activation='relu'))   # rectifier function
    model.add(Dense(8, activation='relu'))                 # rectifier function
    model.add(Dense(1, activation='sigmoid'))              # sigmoid function
2. First Keras Program
[Network diagram repeated: the code in Step 2 matches the fully connected 8 → 12 → 8 → 1 network shown earlier (rectifier, rectifier, sigmoid).]
Python
    # Step 3: to compile the model
    print("  Step 3: to compile the model...")
    model.compile(loss="binary_crossentropy", optimizer="adam", metrics=["accuracy"])

    # Step 4: to fit the model
    print("  Step 4: to fit the model...")
    model.fit(X, Y, epochs=150, batch_size=10)

    # Step 5: to evaluate the model
    print("  Step 5: to evaluate the model...")
    scores = model.evaluate(X, Y)
    print("")
    print("{}: {}".format(model.metrics_names[1], scores[1]*100))

    return model
Output
Using TensorFlow backend.
Phase 1: to train the model...
  Step 1: to load the data...
  Step 2: to define the model...
  Step 3: to compile the model...
  Step 4: to fit the model...
Epoch 1/150
768/768 [==============================] - 0s 318us/step - loss: 3.5684 - acc: 0.5313
Epoch 2/150
768/768 [==============================] - 0s 83us/step - loss: 1.1656 - acc: 0.6549
Epoch 3/150
768/768 [==============================] - 0s 71us/step - loss: 0.9109 - acc: 0.6458
Epoch 4/150
768/768 [==============================] - 0s 81us/step - loss: 0.7756 - acc: 0.6471
Epoch 5/150
768/768 [==============================] - 0s 70us/step - loss: 0.6911 - acc: 0.6432
…
Epoch 148/150
768/768 [==============================] - 0s 79us/step - loss: 0.4676 - acc: 0.7760
Epoch 149/150
768/768 [==============================] - 0s 62us/step - loss: 0.4663 - acc: 0.7799
Epoch 150/150
768/768 [==============================] - 0s 82us/step - loss: 0.4590 - acc: 0.7721
  Step 5: to evaluate the model...
768/768 [==============================] - 0s 45us/step
acc: 76.953125
(Pipeline overview repeated; next: Phase 2, Model Storing.)
Phase 2 (Model Storing): Model (in memory) → Model (on disk)
In Keras, we store the neural network model as two components:
• The model structure (stored in JSON format)
• The model weight information (stored in HDF5 format)
Python
# to save a model
def saveModel(model, modelFilenamePrefix):
    # Step 1: to save the model structure to a file in the JSON format
    structureFilename = modelFilenamePrefix + ".json"
    model_json = model.to_json()
    with open(structureFilename, "w") as f:
        f.write(model_json)

    # Step 2: to save the model weight information to a file in the HDF5 format
    weightFilename = modelFilenamePrefix + ".h5"
    model.save_weights(weightFilename)
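As an aside (our addition, not covered on the original slides), Keras can also persist the structure and the weights together in a single HDF5 file; the ".full.h5" filename below is hypothetical:

Python
# alternative: save structure + weights (and compile settings) in ONE file
model.save(modelFilenamePrefix + ".full.h5")

# restore it later without a separate JSON file or a re-compile
from keras.models import load_model
model = load_model(modelFilenamePrefix + ".full.h5")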
neuralNetworkModel.json {"class_name": "Sequential", "config": [{"class_name": "Dense", "config": {"name": "dense_1", "trainable": true, "batch_input_shape": [null, 8], "dtype": "float32", "units": 12, "activation": "relu", "use_bias": true, "kernel_initializer": {"class_name": "VarianceScaling", "config": {"scale": 1.0, "mode": "fan_avg", "distribution": "uniform", "seed": null}}, "bias_initializer": {"class_name": "Zeros", "config": {}}, "kernel_regularizer": null, "bias_regularizer": null, "activity_regularizer": null, "kernel_constraint": null, "bias_constraint": null}}, {"class_name": "Dense", "config": {"name": "dense_2", "trainable": true, "units": 8, "activation": "relu", "use_bias": true, "kernel_initializer": {"class_name": "VarianceScaling", "config": {"scale": 1.0, "mode": "fan_avg", "distribution": "uniform", "seed": null}}, "bias_initializer": {"class_name": "Zeros", "config": {}}, "kernel_regularizer": null, "bias_regularizer": null, "activity_regularizer": null, "kernel_constraint": null, "bias_constraint": null}}, {"class_name": "Dense", "config": {"name": "dense_3", "trainable": true, "units": 1, "activation": "sigmoid", "use_bias": true, "kernel_initializer": {"class_name": "VarianceScaling", "config": {"scale": 1.0, "mode": "fan_avg", "distribution": "uniform", "seed": null}}, "bias_initializer": {"class_name": "Zeros", "config": {}}, "kernel_regularizer": null, "bias_regularizer": null, "activity_regularizer": null, "kernel_constraint": null, "bias_constraint": null}}], "keras_version": "2.1.1", "backend": "tensorflow"}
neuralNetworkModel.h5
<binary content in the HDF5 format, which is not readable in a text editor>
(Pipeline overview repeated; next: Phase 3, Model Reading.)
Phase 3 (Model Reading): Model (on disk) → Model (in memory)
In Keras, we read the neural network model back from the two components:
• The model structure (stored in JSON format)
• The model weight information (stored in HDF5 format)
Python
# to read a model
def readModel(modelFilenamePrefix):
    # Step 1: to load the model structure from a file in the JSON format
    structureFilename = modelFilenamePrefix + ".json"
    with open(structureFilename, "r") as f:
        model_json = f.read()
    model = model_from_json(model_json)

    # Step 2: to load the model weight information from a file in the HDF5 format
    weightFilename = modelFilenamePrefix + ".h5"
    model.load_weights(weightFilename)

    return model
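One caveat worth adding (our addition): a model restored via model_from_json plus load_weights is not compiled yet, so predict() works, but evaluate() or further fit() first needs a compile call matching the one used in Phase 1:

Python
# required before model.evaluate(...) or further model.fit(...) on a restored model
model.compile(loss="binary_crossentropy", optimizer="adam", metrics=["accuracy"])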
(Pipeline overview repeated; next: Phase 4, New Data Prediction.)
Phase 4 (New Data Prediction): New Data + Model (in memory) → Predicted Result
There are the following steps:
• Step 1: to load the new data (input attributes)
• Step 2: to predict the target attribute of the new data based on a model
• Step 3: to save the predicted target attribute of the new data into a file
Python
# to predict the target attribute of a new dataset based on a model
def predictNewDatasetFromModel(newInputAttributeDataFilename, newTargetAttributeDataFilename, model):
    # Step 1: to load the new data (input attributes)
    newX = numpy.loadtxt(newInputAttributeDataFilename, delimiter=",")

    # Step 2: to predict the target attribute of the new data based on the model
    newY = model.predict(newX, batch_size=10)

    # Step 3: to save the predicted target attribute of the new data into a file
    numpy.savetxt(newTargetAttributeDataFilename, newY, delimiter=",", fmt="%.10f")
New-Dataset1-Target-Output.csv
0.7116840482
0.0631212220
0.5842617750
0.1779389381
0.8065432310
0.2070244253
0.3177646995
0.3444415629
0.9888448119
0.0222690590
0.1498738825
…
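Note that each sigmoid output above is a probability in [0, 1]. If hard 0/1 class labels are needed instead, a common post-processing step (our addition, not from the slides; the output filename below is hypothetical) is to threshold at 0.5:

Python
import numpy

# convert the predicted probabilities (newY from Step 2) into hard 0/1 labels
newLabels = (newY > 0.5).astype(int)
numpy.savetxt("New-Dataset1-Label-Output.csv", newLabels, delimiter=",", fmt="%d")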
Outline • Summary about Parameter Setting • First Keras Program • Enhanced Keras Program • Efficiency • How to Set Parameter Values
3. Enhanced Keras Program • We can have Keras generate the validation/test set automatically. • In Keras, the term "validation" set refers to what we have been calling the "test" set. • That is, the "validation" set named by Keras is used to measure the performance of the model (not to fine-tune the model). • Fine-tuning could be done by us "manually" (a sketch follows the training output below).
3. Enhanced Keras Program • If we want to select 20% of the given data as the validation set, we need to include the following as an input argument of the "fit" function:
validation_split=0.2
3. Enhanced Keras Program • This 20% of the records will be used for the validation purpose • They will not be used for training • The remaining 80% of the records will be used as the training set for training (Keras takes the validation fraction from the end of the supplied data, before any shuffling)
Original Code
Python
# Step 4: to fit the model
print("  Step 4: to fit the model...")
model.fit(X, Y, epochs=150, batch_size=10)

Updated/Enhanced Code
Python
# Step 4: to fit the model
print("  Step 4: to fit the model...")
model.fit(X, Y, validation_split=0.2, epochs=150, batch_size=10)
Output
Using TensorFlow backend.
Phase 1: to train the model...
  Step 1: to load the data...
  Step 2: to define the model...
  Step 3: to compile the model...
  Step 4: to fit the model...
Train on 614 samples, validate on 154 samples
Epoch 1/150
614/614 [==============================] - 0s 401us/step - loss: 4.0245 - acc: 0.5114 - val_loss: 1.9185 - val_acc: 0.5455
Epoch 2/150
614/614 [==============================] - 0s 86us/step - loss: 1.3709 - acc: 0.6107 - val_loss: 1.0007 - val_acc: 0.6753
Epoch 3/150
614/614 [==============================] - 0s 90us/step - loss: 0.9978 - acc: 0.6515 - val_loss: 0.8329 - val_acc: 0.6688
Epoch 4/150
614/614 [==============================] - 0s 86us/step - loss: 0.8546 - acc: 0.6547 - val_loss: 0.7530 - val_acc: 0.6364
Epoch 5/150
614/614 [==============================] - 0s 75us/step - loss: 0.7764 - acc: 0.6547 - val_loss: 0.7389 - val_acc: 0.6234
…
Epoch 148/150
614/614 [==============================] - 0s 76us/step - loss: 0.4814 - acc: 0.7736 - val_loss: 0.5738 - val_acc: 0.7403
Epoch 149/150
614/614 [==============================] - 0s 104us/step - loss: 0.4780 - acc: 0.7687 - val_loss: 0.5596 - val_acc: 0.7532
Epoch 150/150
614/614 [==============================] - 0s 63us/step - loss: 0.4833 - acc: 0.7720 - val_loss: 0.5690 - val_acc: 0.7143
  Step 5: to evaluate the model...
768/768 [==============================] - 0s 24us/step
acc: 76.04166666666666
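For the "manual" fine-tuning mentioned earlier, one option (a sketch of ours, not from the original slides) is to capture the History object that fit() returns and inspect the per-epoch validation accuracy; the key 'val_acc' matches the Keras version shown in this output (newer versions use 'val_accuracy'):

Python
# Step 4 variant: keep the per-epoch training history
history = model.fit(X, Y, validation_split=0.2, epochs=150, batch_size=10)

# find the epoch with the best validation accuracy, e.g., to pick "no. of epochs"
valAcc = history.history['val_acc']
bestEpoch = max(range(len(valAcc)), key=lambda i: valAcc[i])
print("best epoch: {} (val_acc = {})".format(bestEpoch + 1, valAcc[bestEpoch]))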
Outline • Summary about Parameter Setting • First Keras Program • Enhanced Keras Program • Efficiency • How to Set Parameter Values
4. Efficiency • A typical machine/PC has a CPU (i.e., Central Processing Unit) with one or multiple processors • Usually, each processor is very fast (e.g., 3.8 GHz) • However, we have only a limited number of these "fast" processors (e.g., 8 processors)
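To see which processors/devices the TensorFlow backend can actually use, here is a small sketch (our addition; the call below assumes TensorFlow 2.x, while the TF 1.x era used tensorflow.python.client.device_lib.list_local_devices() instead):

Python
import tensorflow as tf

# list the devices TensorFlow can see; an empty GPU list means CPU-only execution
print(tf.config.list_physical_devices('CPU'))
print(tf.config.list_physical_devices('GPU'))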