Model Performance Analysis
[Image from the Convolutional Neural Networks course at deeplearning.ai]

1. Introduction

Successfully training a Machine Learning model through the lifecycle of a data science project is a great feeling – but, unless you are doing a research project or an academic exercise, you are not actually done. In production ML systems, you enter a new phase of development in which you must analyse your model's performance more deeply and from different directions. It should be underscored that a deeper analysis means evaluating your model not only on the entire dataset but also on various slices of the data. I gave an example in this post from the CAE world, which I personally encountered whilst working on a regression problem.

As another example, suppose you build an ML model to predict the demand for different models of automobiles. It then becomes important to look at the model's performance across different vehicle models, accessories, features offered, colours, and so on – i.e., you might want to evaluate how the model performs on each of these slices individually. At a higher level, there are two main ways to analyse model performance: Black Box evaluation and Model Introspection.
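To make the idea concrete, here is a minimal sketch of slice-based evaluation for such a demand-prediction model, using pandas and scikit-learn. The column names (colour, y_true, y_pred) and the toy data are purely illustrative assumptions, not from an actual project:

```python
import pandas as pd
from sklearn.metrics import mean_absolute_error

# Toy test-set predictions; "colour" is one possible slicing column.
df = pd.DataFrame({
    "colour": ["red", "red", "blue", "blue", "white"],
    "y_true": [10, 12, 30, 28, 15],
    "y_pred": [11, 12, 20, 22, 15],
})

def evaluate_by_slice(frame: pd.DataFrame, slice_col: str) -> pd.DataFrame:
    """Overall metric plus the same metric recomputed per slice."""
    rows = [{"slice": "OVERALL",
             "mae": mean_absolute_error(frame["y_true"], frame["y_pred"]),
             "n": len(frame)}]
    for value, group in frame.groupby(slice_col):
        rows.append({"slice": f"{slice_col}={value}",
                     "mae": mean_absolute_error(group["y_true"], group["y_pred"]),
                     "n": len(group)})
    return pd.DataFrame(rows)

print(evaluate_by_slice(df, "colour"))
# A slice whose MAE is far worse than the overall MAE (here, "blue")
# flags a subset of the data worth investigating.
```

The point of the sketch is that a single aggregate metric can look healthy while one slice performs badly; recomputing the same metric per slice surfaces exactly that.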

2. Black Box Evaluation vs Model Introspection

In Black Box Evaluation [i.e., input–output evaluation], you quantify the model's performance through metrics and losses without going into the details of its internal workings. Model Introspection techniques, on the other hand, are useful when you want to understand how a model works internally – e.g., you might experiment with different architectures to understand how data flows through each layer of your model.

In contrast to Black Box evaluation, in Model Introspection you are interested not just in the model's final results but also in the details of each layer.

Tools for evaluating Model Performance:

There are different tools available for evaluating model performance, as described above, such as:

• TensorBoard

• TensorFlow Model Analysis [TFMA]

Using TensorBoard, you can monitor the loss and accuracy at every iteration and thereby closely follow the training process itself. I have found the What-If Tool, which integrates with TensorBoard, very powerful; it can be run on various platforms including Jupyter Notebooks, Colab and Cloud AI Platform Notebooks. The What-If Tool can be helpful during data collection, model creation and post-training evaluation, as discussed above. It supports TensorFlow models out of the box and can also support models built with other frameworks. I will talk more about the What-If Tool in subsequent posts of this series.
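As an illustration, here is a minimal sketch of wiring the TensorBoard callback into a tf.keras training run; the toy data and tiny architecture are placeholder assumptions, not part of the original discussion:

```python
import numpy as np
import tensorflow as tf

# Placeholder data and model, purely for illustration.
x_train = np.random.rand(256, 20).astype("float32")
y_train = np.random.randint(0, 2, size=(256,))

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(20,)),
    tf.keras.layers.Dense(16, activation="relu"),
    tf.keras.layers.Dense(2, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

# The TensorBoard callback logs loss and accuracy for every epoch of
# training and validation to `log_dir` for the dashboard to display.
tb_callback = tf.keras.callbacks.TensorBoard(log_dir="logs/fit",
                                             histogram_freq=1)

model.fit(x_train, y_train,
          validation_split=0.2,
          epochs=5,
          callbacks=[tb_callback])

# View the dashboard with:  tensorboard --logdir logs/fit
```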

Example of Model Introspection:

As one may recall, the operations within a CNN can be bundled into two main blocks: a) a Feature Learning Block and b) a Task Learning Block. In the Feature Learning Block, the inputs to the CNN (say, a series of images) are processed through a series of convolutional layers, during which the network learns the features corresponding to those images. This post shows how one can visualise the features a convnet learns for an image at each layer using the Keras APIs.
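The gist of that visualisation, as I understand it, is to build a second Keras model that exposes the outputs of the convolutional layers. A minimal sketch, using a stand-in convnet and a random image in place of a trained model and real data:

```python
import numpy as np
import tensorflow as tf

# A stand-in convnet; in practice you would load your trained model.
convnet = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(64, 64, 3)),
    tf.keras.layers.Conv2D(8, 3, activation="relu", name="conv1"),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Conv2D(16, 3, activation="relu", name="conv2"),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(10, activation="softmax"),  # task-learning block
])

# Build an "activation model" whose outputs are the feature maps of
# every convolutional layer (the feature-learning block).
conv_layers = [l for l in convnet.layers
               if isinstance(l, tf.keras.layers.Conv2D)]
activation_model = tf.keras.Model(inputs=convnet.input,
                                  outputs=[l.output for l in conv_layers])

# Push one (here random) image through and inspect each layer's maps.
image = np.random.rand(1, 64, 64, 3).astype("float32")
activations = activation_model.predict(image)
for layer, maps in zip(conv_layers, activations):
    print(layer.name, maps.shape)   # e.g. conv1 (1, 62, 62, 8)
```

Each returned array can then be plotted channel by channel to see what the layer responds to, which is exactly the kind of internal detail Model Introspection is after.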

3. Model Performance Analysis: Performance Metrics vs Optimization Objectives

I have discussed Performance Metrics in this article, highlighting the evaluation metrics for regression and classification problems.

As for optimization algorithms, I have very briefly touched upon the optimization landscape.

Frameworks such as TensorFlow (and most others) have options for tracking performance metrics, like accuracy, and optimization objectives, such as loss, after each epoch of training and validation.
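In tf.keras, for instance, the distinction shows up directly in compile(): the loss is the optimization objective, while metrics are the performance metrics, and fit() records both per epoch. A minimal sketch with toy data (the model and data are placeholder assumptions):

```python
import numpy as np
import tensorflow as tf

# Toy binary-classification data, purely for illustration.
x = np.random.rand(200, 10).astype("float32")
y = np.random.randint(0, 2, size=(200,))

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(10,)),
    tf.keras.layers.Dense(8, activation="relu"),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam",
              loss="binary_crossentropy",  # optimization objective
              metrics=["accuracy"])        # performance metric

history = model.fit(x, y, validation_split=0.25, epochs=3, verbose=0)

# history.history maps each tracked quantity to a per-epoch list:
# "loss", "accuracy", "val_loss", "val_accuracy".
for epoch, (loss, acc) in enumerate(
        zip(history.history["loss"], history.history["accuracy"]), start=1):
    print(f"epoch {epoch}: loss={loss:.4f}, accuracy={acc:.4f}")
```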
