How to evaluate your Deep Learning models in Imagimob AI?

Hi,

One of the difficult parts of Machine Learning (ML) is evaluating the model’s performance. So how do you measure the success of a machine learning model? How do you know when to stop the training? What metrics should be chosen to evaluate the model? In this technical newsletter, we will answer these questions.

So, how do you download your model in Imagimob AI?

When you have finished a training job in Imagimob AI, you can download the trained models and evaluate them. This involves looking at different statistics and metrics to understand if the model is good enough for your purpose.

In Imagimob Studio, we empower the user with a very powerful tool: letting the model label the train/validation/test data for us, so that we can play the data back and look at it in detail.

In order to evaluate a model in Imagimob AI, double-click on the downloaded model file (*.h5) in Imagimob Studio and go to the Evaluation tab in the .h5 file. This gives a good overview of the performance of your model.

In this how-to, I will focus on Classification metrics.

The confusion matrix is the most basic tool to evaluate a model. It tabulates all of the correct and incorrect responses a model produces given a set of data.

It describes the performance of a classification model (or “classifier”) on a set of test data for which the true values are known.

[Image: confusion matrix in Imagimob Studio]

In the picture above, we show a confusion matrix. From the drop-down menu, we can choose if we want to look at the train, validation, or test dataset results.

The labels on the side correspond to the labels predicted by the model for each sample, and the labels on the top correspond to the actual labels from the data.

This is a quick way of getting a view of your model's performance.
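
To make this concrete, here is a minimal Python sketch of how such a matrix can be computed outside the Studio with scikit-learn. The class names, labels, and predictions below are invented for illustration; note that scikit-learn's convention (rows = actual, columns = predicted) is the transpose of the layout described above.

```python
from sklearn.metrics import confusion_matrix

# Hypothetical class names and results for a 3-class gesture model
classes = ["idle", "walking", "running"]
y_true = ["idle", "walking", "running", "walking", "idle", "running", "walking"]
y_pred = ["idle", "walking", "walking", "walking", "idle", "running", "idle"]

# scikit-learn convention: rows = actual class, columns = predicted class
cm = confusion_matrix(y_true, y_pred, labels=classes)
print(cm)
```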

Accuracy

First on the list of metrics is model accuracy, probably the most basic and most used metric for measuring the performance of predictions. Accuracy is the ratio between the correctly predicted values (True Positives and True Negatives) and all outcomes. In terms of the confusion matrix, that means we sum the diagonal of the matrix and divide it by the sum of all entries.
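
As a small worked example of that calculation (the matrix values below are invented):

```python
import numpy as np

# Invented 2x2 confusion matrix, laid out as in the Studio view above:
# rows = predicted class, columns = actual class
cm = np.array([[50,  5],    # predicted positive: 50 true positives, 5 false positives
               [10, 35]])   # predicted negative: 10 false negatives, 35 true negatives

# Accuracy = sum of the diagonal (correct predictions) / sum of all entries
accuracy = np.trace(cm) / cm.sum()
print(f"Accuracy: {accuracy:.2f}")  # (50 + 35) / 100 = 0.85
```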

Another metric that we evaluate our models with is the F1 score. It is built from two other metrics: precision, the fraction of positive predictions that are actually correct, and recall, the fraction of actual positives that the model detects.

Depending on the application, you may want to give higher priority to recall or precision. But there are many applications in which both recall and precision are important. Therefore, it is natural to think of a way to combine these two into a single metric. One popular metric which combines precision and recall is called the F1 score, which is the harmonic mean of precision and recall, defined as:

F1 = 2 × (Precision × Recall) / (Precision + Recall)
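
As a hedged sketch of how these numbers can be reproduced outside the Studio, scikit-learn computes precision, recall, and F1 directly from labels and predictions (the values below are invented):

```python
from sklearn.metrics import precision_recall_fscore_support

# Invented true and predicted labels for a binary classification task
y_true = [1, 0, 1, 1, 0, 0, 1, 0, 1, 1]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0, 1, 0]

precision, recall, f1, _ = precision_recall_fscore_support(
    y_true, y_pred, average="binary")

# F1 is the harmonic mean: 2 * precision * recall / (precision + recall)
print(f"Precision: {precision:.2f}  Recall: {recall:.2f}  F1: {f1:.2f}")
```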

In your model evaluation in Imagimob Studio, you will see all the metrics below:

[Image: model evaluation metrics in Imagimob Studio]

What makes Imagimob AI unique in model evaluation is the ability to visualize the predictions on the timeline.

In the picture below, the model predictions are shown as solid lines with different colors depending on the class.

[Image: model predictions visualized on the timeline, colored by class]

With the Add Track button, underneath the label tracks, we can generate a label track from this model to more easily compare the model output with the labels set by a human. To add the prediction track, click on the Add track button and select New label track from predictions.


This label track is generated by the model and visualizes each prediction together with the confidence level of that prediction. This essentially lets us run a field test on our computer, without deploying anything on a device, until we are pleased with the model's performance.
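
To give a feel for what such a prediction track contains, here is a minimal sketch of running a trained classifier over windowed time-series data and recording each window's predicted class and confidence. The file names, window length, and class names are placeholders, and the real preprocessing must match whatever was configured for the model in Imagimob Studio.

```python
import numpy as np
from tensorflow.keras.models import load_model

model = load_model("my_model.h5")          # hypothetical model exported from Imagimob AI
classes = ["idle", "walking", "running"]   # hypothetical class names
data = np.load("imu_session.npy")          # hypothetical recording, shape (num_samples, num_channels)

window_len = 50   # samples per window -- must match the training configuration
stride = 25       # hop between consecutive windows

for start in range(0, len(data) - window_len + 1, stride):
    window = data[start:start + window_len]
    probs = model.predict(window[np.newaxis, ...], verbose=0)[0]
    pred = int(np.argmax(probs))
    # One line per segment on the "timeline": predicted class and its confidence
    print(f"t={start:5d}  class={classes[pred]:8s}  confidence={probs[pred]:.2f}")
```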

Takeaways:

  1. A machine learning model’s score is measured on the unseen test dataset.
  2. For classification problems (discrete output) we use confusion matrices and accuracy.
  3. For unbalanced datasets, we need something more than accuracy: Precision and recall (see the small example after this list).
  4. F1 score balances Precision and Recall.
  5. Visualizing the model output on the timeline is a good complement for time-dependent data to understand how well the model performs.
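
To illustrate takeaway 3, here is a tiny invented example where a model that always predicts the majority class reaches high accuracy while completely missing the minority class:

```python
import numpy as np
from sklearn.metrics import accuracy_score, recall_score

# Imbalanced toy dataset: 95 negatives, 5 positives (values invented)
y_true = np.array([0] * 95 + [1] * 5)
y_pred = np.zeros(100, dtype=int)   # a "model" that always predicts class 0

print("Accuracy:", accuracy_score(y_true, y_pred))        # 0.95 -- looks great
print("Recall (class 1):", recall_score(y_true, y_pred))  # 0.0 -- misses every positive
```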

In this article, I’ve covered some of the evaluation metrics and methods for a Machine Learning algorithm. Also, we saw how the Accuracy metric can sometimes be misleading, e.g. when you have imbalanced datasets, and we saw how we can look at predictions on the timeline in a “simulated experiment” to see the model’s performance.

I hope this article was clear and helpful. For any questions or suggestions regarding this newsletter or our platform, feel free to contact me at [email protected].

Do you have a specific use case that you would like to test in Imagimob AI?

Feel free to schedule a free call with us and we’ll guide you through the whole process: A technical session with Imagimob team

Wish you a Happy ML!

Alina
