Deep Learning in Python with TensorFlow/Keras in practical AI. Training the Sequential Feedforward architecture.

In the previous article, artificial neural networks were created in Python [1] using the Sequential model from the TensorFlow/Keras API [2,3]. Dense layers, in which every node of one layer is connected to every node of the next layer, were stacked into a sequence with a large number of hidden layers. It was noted that this type of neural network is called a Deep Neural Network, and using this method a Deep Neural Network architecture was created. But a Deep Neural Network model includes not just the algorithm structure; it also includes hyperparameters and the learned structure defined by the specific weights of all nodes and edges. Understanding this complexity is critical in both practical and theoretical terms, but I will focus on the practical aspects in this tutorial.

To be able to train the model, the neural network must first be compiled, a process that finalizes the structure of the network and adds the optimizer and validation parameters. The compiled model is then fit to the data, so it can 'learn' how to predict that type of data. Here, it is important to note that the data defines what the trained model will learn, so the data itself is extremely important.

To perform these steps, the model.compile() and model.fit() statements can be used, with additional parameter settings such as the optimizer, loss function and metrics. The model will be fitted to the data this way, so before moving to those steps, I will introduce an artificial practice dataset created for this tutorial. It may look a bit complicated, but in reality it is just a couple of functions repeated to create realistic dataset complexity. It is important to have practice datasets that are as realistic as possible for now (later I will move to real-world datasets).


This is a typical setup with two parts: the predictor data, train_S, and the categories or labels, train_L, which the model will learn to predict from that data during training.
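As a minimal sketch, assuming a two-category dataset in which each value range maps to one label, such a practice dataset could be built like this (the loop, functions and ranges are illustrative assumptions; only the names train_S and train_L come from the article):

import numpy as np

# Hypothetical sketch: build a synthetic practice dataset by repeating a
# couple of simple rules, so each value range corresponds to one category.
train_S = []  # predictor samples
train_L = []  # labels (categories to predict)

for _ in range(1000):
    # samples from the lower range, labelled 0
    train_S.append(np.random.randint(13, 64))
    train_L.append(0)
    # samples from the upper range, labelled 1
    train_S.append(np.random.randint(64, 100))
    train_L.append(1)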

Now, a bit of pre-processing... The Sequential model expects array-type data, so the numpy.array() function is recommended to convert any data frame or list to this format before processing.
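A minimal sketch of that conversion, assuming the lists created above:

import numpy as np

# Convert the Python lists to NumPy arrays, the format the Sequential
# model expects for training.
train_S = np.array(train_S)
train_L = np.array(train_L)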


The data should also be reshuffled, to prevent the trained model from learning pre-set patterns in the ordering of the data. Once randomized, any segment of the data has an equal chance of being 'learned' in each iteration, which is one of the most important aspects of avoiding over-fitting. A shuffle() function can be used to perform this step.
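The article does not show which shuffle() implementation was used; one common choice, assuming scikit-learn is available, is sklearn.utils.shuffle, which keeps samples and labels paired:

from sklearn.utils import shuffle

# Shuffle samples and labels together so each sample keeps its label.
train_S, train_L = shuffle(train_S, train_L)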


To finalize this sequence of preprocessing steps, the data is best transformed to a min-max scale from 0 to 1, which improves the model training process.
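A sketch of this step using scikit-learn's MinMaxScaler (an assumption; the article does not name the tool used):

from sklearn.preprocessing import MinMaxScaler

# Rescale the predictor values to the 0-1 range. MinMaxScaler expects a
# 2D array, so the 1D sample vector is reshaped into a single column.
scaler = MinMaxScaler(feature_range=(0, 1))
scaled_train_S = scaler.fit_transform(train_S.reshape(-1, 1))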

Before compiling and fitting the model, I will recall the simple principle of creating an artificial neural network described in Part 1. A Sequential model is created using the same methodology: Dense layers with 'relu' and 'softmax' activation functions in a deep neural network architecture.
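A sketch of such an architecture; the layer sizes and the single input feature are illustrative assumptions, while the Dense layers and the 'relu'/'softmax' activations follow the article:

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense

# Illustrative deep architecture: hidden Dense layers with 'relu' and a
# 'softmax' output layer with one unit per category.
model = Sequential([
    Dense(units=16, input_shape=(1,), activation='relu'),
    Dense(units=32, activation='relu'),
    Dense(units=2, activation='softmax')
])
model.summary()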


But this architecture is just one segment of the model. The optimizer, loss function, metrics and other parameters are vital parts of most AI algorithms, including Deep Neural Networks. This structure should therefore be compiled into a final form together with the previously mentioned aspects.

The model.compile() function is used to perform this step. Let's analyze the settings.


'Adam' is one of the appropriate optimizers for this type of data and handles learning rate adaptation well; I set the learning rate to 0.01. The loss is set to sparse categorical cross-entropy, and the metrics of interest are 'binary_crossentropy' and 'accuracy'. All of these matter for model fitting, which is the next step: the simple model.fit() function can be applied with the data and labels specified. This time the number of epochs (iterations) should also be specified and the data reshuffled again. The neural network then starts the training process; 170 epochs were specified, during which the model is gradually trained.
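A sketch of how those calls could look with the settings described above; the learning rate, loss, metrics and epoch count follow the article, while everything else (variable names, verbosity, default batch size) is an assumption:

from tensorflow.keras.optimizers import Adam

# Compile with Adam (learning rate 0.01), sparse categorical cross-entropy
# loss and the two metrics of interest mentioned in the article.
model.compile(optimizer=Adam(learning_rate=0.01),
              loss='sparse_categorical_crossentropy',
              metrics=['binary_crossentropy', 'accuracy'])

# Fit for 170 epochs, reshuffling the data at every epoch.
model.fit(x=scaled_train_S, y=train_L,
          epochs=170, shuffle=True, verbose=2)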


For each epoch, the time consumption, loss, binary cross-entropy and, most importantly, accuracy can be viewed. After 170 epochs of training, the neural network model managed to predict the dataset with 97.02% accurate predictions, or 0.9702 accuracy, with low loss and cross-entropy. But these metrics are based on how good the algorithm is at predicting the very same data used to train the model, train_S (to make sure the algorithm is good at predicting future data, it should be validated and tested on new data). For now it is important to note that the algorithm 'learned' how to predict the data, but this is just the first step of optimizing the learning process; other 'learned' models may be created and compared against each other. This process of finding the optimal model is very important and I will go through it in the next articles.

So for now, the architecture of the neural network has been created, compiled and optimized, and the model has been fit to the data, i.e. trained. During this process, each neuron is assigned specific weights used to communicate with neurons in other layers and to structure the data in order to predict it. In the end, the neural network architecture, the optimizer, the other hyperparameters and the model weights are all essential parts of the model. The weights can be retrieved using the model.get_weights() statement. Understanding this complexity in practical terms is one of the most important things in deep learning: knowing which parts of a neural network can be separated, what they serve for, and how they can be saved and loaded for future deployment.
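A minimal sketch of retrieving the trained weights:

# model.get_weights() returns a list of NumPy arrays: the weight matrix
# and bias vector of each layer, in order.
weights = model.get_weights()
print(len(weights))       # number of weight/bias arrays
print(weights[0].shape)   # weight matrix of the first Dense layer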

More on saving and loading different portions of the model in the next article! Thanks for reading.


The article above was created by:

Darko Medin, Data Science/AI Expert at Edanz Group


Spyder IDE was used to implement the code [4].


References :


1. Python Software Foundation. https://www.python.org


2. Abadi, Martín, et al. TensorFlow: Large-scale machine learning on heterogeneous systems. 2015. https://www.tensorflow.org


3. Chollet, François, et al. Keras. 2015. https://keras.io


4. Raybaut, P. (2009). Spyder documentation. Available online at pythonhosted.org.

