Stray Animals Rescue AI Project using customized Computer Vision models with Supervisely, Selenium scraping and an AWS Cloud Cluster (not AI shelters)
Pranav Shekhar
Deloitte US India | MLOps | DevOps - AL | Web Dev - MERN | Flutter & Firebase | Cloud - Hybrid & Multi | GAIT - AIR 50
This ambitious project is about an idea that came to my mind a few days back. While wandering the streets during this pandemic for regular groceries, I was deeply saddened and taken aback by the condition of stray animals in my locality, which led me to think about this project.
What is the idea?
This idea is honestly realistic and achievable, and I have made the initial efforts to take it forward. We will place small IP cameras on public vehicles and train a customized Computer Vision model to detect stray animals in different localities. Each detection will trigger a Geo-IP API call that submits the exact latitude and longitude coordinates of that location to a centralized database, which organisations and NGOs like PETA and the SPCA can use for rescue and treatment of the affected animals. This was the result I tested :-
I will guide you step by step, as in my previous articles, through how you can work on a similar idea in a completely different application and contribute to the open-source community.
What tools will we be using?
- Computer Vision Library - There are many Computer Vision libraries we could have used, but I prefer OpenCV while working with Python; you can go with your own preference.
- Scraping Tool - We will use Selenium for web scraping to gather image data of different stray animals to train our model. Collecting an image dataset by hand can be a tedious task, so we will use scraping to our advantage.
- Image Annotation & Augmentation Tool - We will use Supervisely for annotation and augmentation of our image dataset, which we will discuss from the basics later in this article.
- GPU and NVIDIA-Powered Cluster - Our local physical resources can never meet the requirements of high-performance, AI-enabled machines, so we will use an AWS Deep Learning AMI instance to meet our GPU, multi-CPU and high-graphics requirements.
1. Collecting our image dataset with Selenium :-
Collecting and downloading a huge number of filtered images for an appropriate dataset could have been a very hectic task, and pre-processed, clean images from Kaggle or other resources are rarely suitable for training models to real-world requirements. This is where web scraping comes to our rescue. See the magic yourself: I just coded the search term 'Stray Animals' and everything I needed was in my folder :-
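A minimal sketch of such a scraping script, assuming Chrome with a matching chromedriver is installed, might look like the following; Google's image-search markup changes often, so the selectors, scroll count and output folder name below are only illustrative.

import os
import time
import requests
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.common.keys import Keys

SEARCH_TERM = "stray animals"
OUT_DIR = "stray_animals_dataset"   # illustrative folder name
os.makedirs(OUT_DIR, exist_ok=True)

driver = webdriver.Chrome()                       # assumes chromedriver is on PATH
driver.get("https://www.google.com/imghp")
driver.find_element(By.NAME, "q").send_keys(SEARCH_TERM, Keys.ENTER)
time.sleep(2)

# Scroll a few times so more thumbnails load before we collect their URLs
for _ in range(5):
    driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
    time.sleep(1)

count = 0
for img in driver.find_elements(By.CSS_SELECTOR, "img"):
    src = img.get_attribute("src")
    if not src or not src.startswith("http"):     # skip base64 data URIs
        continue
    try:
        data = requests.get(src, timeout=5).content
        with open(os.path.join(OUT_DIR, f"stray_{count}.jpg"), "wb") as f:
            f.write(data)
        count += 1
    except requests.RequestException:
        continue

driver.quit()
print(f"Downloaded {count} images to {OUT_DIR}")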
But we will have to convert these images to 2D grayscale for faster processing and to reduce the complexity of RGB value processing by the kernel; this small piece of code, run after scraping, might help :-
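For reference, a small OpenCV sketch of this conversion might look like the following (the folder names are again only illustrative).

import os
import cv2

SRC_DIR = "stray_animals_dataset"   # illustrative folder names
DST_DIR = "stray_animals_gray"
os.makedirs(DST_DIR, exist_ok=True)

for name in os.listdir(SRC_DIR):
    img = cv2.imread(os.path.join(SRC_DIR, name))
    if img is None:                 # skip anything OpenCV cannot decode
        continue
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    cv2.imwrite(os.path.join(DST_DIR, name), gray)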
And this is the result we get, just as we needed:-
2. Uploading our Dataset to Supervisely :-
Supervisely is a powerful computer vision platform where developers and data scientists can work with their datasets and test them with different neural network architectures. With Supervisely we can label images, videos, 3D point clouds, volumetric slices and other data in a full-featured labeling and annotation tool.
As stated earlier, we will use Supervisely for image augmentation and annotation of our uploaded dataset and test it with the Mask R-CNN neural network, which performs region-based instance (object) segmentation. It builds on the Convolutional Neural Network which, in simpler words, is a technique that trains machines to learn and recognize patterns by loosely mimicking the learning pattern of the human brain, as depicted below :-
Convolutional Neural Network architecture of Mask R-CNN :- We shall discuss this complex architecture in a separate future article on Mask R-CNN.
We will upload our dataset to our Supervisely account and import it to a new project while working in the default team :- Click on IMPORT DATA
We will upload at least 50-100 images from our dataset folder so that we can perform annotation and augmentation on them, giving us more data to train on and improving our testing accuracy.
3. Annotating and Augmenting our Dataset on Supervisely :-
We will annotate our images one by one, using the smart annotation tool to save time and keeping in mind the class of the animal instance being segmented :-
Now we will use DTL (Data Transformation Language) for simplicity in augmentation; we could also go for a Python script with an imaging library such as Pillow to satisfy the same requirement (a rough sketch follows after the next paragraph) :-
What will image augmentation do to our dataset? It will horizontally and vertically flip our images and rotate them at different angles, increasing the size of our dataset for better training.
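For those who prefer the Python route mentioned above over DTL, a rough sketch with Pillow that performs the same flips and rotations might look like this (the folder names are illustrative; Supervisely's DTL does the equivalent on the platform itself).

import os
from PIL import Image

SRC_DIR = "stray_animals_gray"        # illustrative folder names
DST_DIR = "stray_animals_augmented"
os.makedirs(DST_DIR, exist_ok=True)

for name in os.listdir(SRC_DIR):
    try:
        img = Image.open(os.path.join(SRC_DIR, name))
    except OSError:                   # skip non-image files
        continue
    stem, ext = os.path.splitext(name)
    # Horizontal and vertical flips
    img.transpose(Image.FLIP_LEFT_RIGHT).save(os.path.join(DST_DIR, f"{stem}_hflip{ext}"))
    img.transpose(Image.FLIP_TOP_BOTTOM).save(os.path.join(DST_DIR, f"{stem}_vflip{ext}"))
    # Rotations at a few different angles
    for angle in (90, 180, 270):
        img.rotate(angle, expand=True).save(os.path.join(DST_DIR, f"{stem}_rot{angle}{ext}"))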
Now we will create training and validation sets of our image data to train our customized neural network model on a high-performance cluster on AWS Cloud :- Click on Run DTL > Train/Val tagging :-
4. Training our Neural Network Model on the Cloud over a GPU and NVIDIA powered cluster :-
We have already argued the need for a high-performance cloud cluster to train our model and tackle the problem of limited local hardware resources, so let us provision the required instance on the cloud. I will use AWS with a developer account; you can use other public cloud services like GCP or Azure on the free tier to suit your requirement :- Log in to AWS > EC2 Dashboard > Launch Instance > Select Deep Learning AMI as shown below :-
Go for the p2.xlarge instance type, as it is the cheapest and most affordable spot instance, then go through the next steps with the default configuration, save your key and launch the instance. Open Command Prompt on Windows (or any terminal) and type the following command to log in remotely to your instance for connecting with the Supervisely master :-
C:\Users\KIIT\Downloads> ssh -l ubuntu -i keyname.pem <public_IP_of_instance>
Once connected, we run the following command, copied from the Supervisely Cluster dashboard, on our cloud instance to connect to the Supervisely master and attach the instance to the cluster :-
bash <(curl -fsSLg "https://app.supervise.ly/api/agent/iDu0PEelm7hdXTOLXCbXoymPgHASnOHq?agentImage=supervisely/agent:latest")
Now we get back to Supervisely and wait for the instance to connect and come up. Once it is running, we go to Neural Networks > Add > Mask R-CNN (Keras + TF) (COCO) and train our training dataset on the customized model to perform instance segmentation for our required application with the default training configuration provided; we can play with the learning rate and the number of epochs :-
Finally, after the model is trained, we save it and use our customized weights in our application for instance segmentation. We can analyze the training charts and the workflow pipeline of our trained model :-
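How the weights are consumed depends on your application. As one hedged sketch, assuming the exported weights are compatible with the Matterport Keras/TensorFlow Mask R-CNN implementation (mrcnn) that this Supervisely model builds on, and with the class count and file names below being purely illustrative, inference might look roughly like this.

import cv2
import mrcnn.model as modellib
from mrcnn.config import Config

class StrayInferenceConfig(Config):
    NAME = "stray_animals"
    NUM_CLASSES = 1 + 1        # background + one "stray animal" class (adjust to your labels)
    GPU_COUNT = 1
    IMAGES_PER_GPU = 1

model = modellib.MaskRCNN(mode="inference", config=StrayInferenceConfig(), model_dir="./logs")
model.load_weights("stray_animals_weights.h5", by_name=True)   # hypothetical exported weight file

frame = cv2.cvtColor(cv2.imread("test_frame.jpg"), cv2.COLOR_BGR2RGB)
result = model.detect([frame], verbose=0)[0]
print(result["class_ids"], result["scores"])    # masks and boxes are in result["masks"] / result["rois"]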
Thanks for your patience and interest in this article. You can DM me if you are looking to build a similar application or have any doubts regarding the same. I will be implementing the Geo-IP part with centralized storage and monitoring using DevOps tools and adding it to this article in the near future.