How did we reduce our data, computing, and storage requirements by roughly 96.6% from a peak of 3 TB per day, without a dip in performance metrics?
Shailendra Singh Kathait
Co-Founder & Chief Data Scientist @ Valiance | Envisioning a Future Transformed by AI | Harnessing AI Responsibly | Prioritizing Global Impact |
We drastically reduced the number of images to be classified by an object identification algorithm, resulting in significantly lower use of data networks and compute requirements.
In our current deployment of an intrusion detection system, an intelligent camera unit transmits images for intrusion detection, followed by species classification to trigger an alert. At a frequency of 2 images per second, we receive 120 images per minute, which amounts to 7,200 photos per hour and 172,800 pictures per camera unit every 24 hours. And this is just a single camera; we run intrusion detection across tens of cameras.
However, transmitting such a massive amount of data to the cloud poses a significant challenge. Considering each image to be around 700 KB, a staggering 121 GB of data would be required for a single day per camera. This substantial volume leads to remarkably high network usage, especially for deployments in remote areas.
To reduce data consumption, we implemented edge analytics: the device identifies significant photo changes and sends only those frames to the cloud. This intelligent processing leverages machine learning capabilities at the edge. It begins by capturing a reference image and comparing it to each subsequent image. If an image appears similar to the reference, it is not sent to the cloud; if a change is detected, the corresponding image is transmitted for further analysis.
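The compare-to-reference loop can be sketched as follows. This is a minimal illustration, not the deployed implementation: the `has_changed` function, its mean-absolute-difference statistic, and the threshold of 10 are all assumptions for demonstration.

```python
import numpy as np

def has_changed(reference, frame, threshold=10.0):
    """Return True if the frame differs enough from the reference
    to be worth transmitting. Uses mean absolute pixel difference;
    the threshold here is an illustrative assumption."""
    diff = np.abs(frame.astype(np.float32) - reference.astype(np.float32))
    return float(diff.mean()) > threshold

# Hypothetical frames: a dark reference and a frame with a new bright region.
reference = np.zeros((120, 160), dtype=np.uint8)
similar = reference.copy()            # visually identical: skip upload
changed = reference.copy()
changed[40:80, 60:100] = 255          # simulated intruder: upload

print(has_changed(reference, similar))   # False -> not transmitted
print(has_changed(reference, changed))   # True  -> transmitted
```

Only frames for which `has_changed` returns True would leave the device, which is what cuts the network and cloud-compute load.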
The reference image is updated periodically to track environmental changes such as sunlight and seasonal patterns, enabling accurate comparisons over time. A reference image is set for every 15-minute block of the day, and every fourth week a new set of reference images is captured to account for the changed environment.
To begin, we observe in Figures 1, 2, and 3 that the areas on the left and right consist primarily of plantations, and it is improbable for any animals to traverse those regions. Fig. 2.a. also shows an insect appearing in the frame. Consequently, we crop the left and right parts of the images to avoid unnecessary transmissions caused by leaf movement and insects, which the model would otherwise interpret as changes. This approach is illustrated in Figures 1.a., 2.a., and 3.a., and effectively curbs the transmission of redundant images.
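Cropping out the plantation strips is a one-line slice. The 20% left/right fractions below are assumptions for illustration; the actual crop would be tuned per camera to the visible plantation area.

```python
import numpy as np

def crop_center(frame, left_frac=0.2, right_frac=0.2):
    """Drop the left/right plantation strips so leaf and insect motion
    there cannot register as a 'change'. Fractions are illustrative."""
    h, w = frame.shape[:2]
    return frame[:, int(w * left_frac): w - int(w * right_frac)]

frame = np.zeros((480, 640), dtype=np.uint8)
print(crop_center(frame).shape)  # (480, 384): 128 px trimmed from each side
```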
Furthermore, we can minimize the number of images transmitted by ensuring no photo is sent unless an object is detected.
This concept is demonstrated by examining Figures 1.a., 2.a., and 3.a. The initial image (Fig. 1.a.) serves as the reference. Subsequent images such as Fig. 2.a., although similar to the reference, contain no animals and thus do not require transmission. In Fig. 3.a., on the other hand, an animal is present, making it a candidate for transmission. This approach lets us selectively transmit only those images that capture relevant objects.
The following approaches are proposed for analysis, as shown in Fig. 4.
In the first approach, we calculate the signal-to-noise ratio (SNR) of each image with respect to the reference image. The distribution of SNR probability values is plotted in Fig. 5.
The orange curve shows that changed images are most likely to have an SNR between 20 and 40, so these images are transmitted. Images with SNR values of 40 and beyond, shown by the blue curve, are far less likely to contain changes, so they are not transferred for further classification by the deep network model in the cloud.
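A sketch of the SNR filter follows. The article does not spell out its exact SNR formula, so the definition below (reference power over difference power, in dB) is an assumption; the 20-40 dB band matches Fig. 5.

```python
import numpy as np

def snr_db(reference, frame):
    """SNR of a frame relative to the reference, in dB:
    10*log10(reference power / difference power).
    One common definition, assumed here."""
    ref = reference.astype(np.float64)
    noise = frame.astype(np.float64) - ref
    noise_power = np.mean(noise ** 2)
    if noise_power == 0:
        return float("inf")
    return 10.0 * np.log10(np.mean(ref ** 2) / noise_power)

def should_transmit(reference, frame, low=20.0, high=40.0):
    """Transmit only frames whose SNR falls in the 20-40 dB band
    where changed (classified) images concentrate."""
    return low <= snr_db(reference, frame) < high

ref = np.full((10, 10), 100.0)
print(should_transmit(ref, ref + 5))    # True:  ~26 dB, inside the band
print(should_transmit(ref, ref + 50))   # False: ~6 dB, outside the band
```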
In the second approach, we plot the density of the peak difference values, as shown in Fig. 6.
Again, this curve can be used to determine which images to transmit: classified images have average difference values greater than 4, whereas non-classified photos fall below 4.
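The difference statistic is not fully specified in the article, so the sketch below uses mean absolute pixel difference as one plausible reading of the "average difference greater than 4" rule; treat both the metric and the function name as assumptions.

```python
import numpy as np

def mean_abs_difference(reference, frame):
    """Mean absolute pixel difference to the reference image.
    Assumed stand-in for the article's difference statistic."""
    return float(np.abs(frame.astype(np.float64)
                        - reference.astype(np.float64)).mean())

ref = np.zeros((8, 8))
print(mean_abs_difference(ref, ref + 6) > 4)   # True  -> transmit
print(mean_abs_difference(ref, ref + 2) > 4)   # False -> drop
```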
The third approach uses a density plot of similarity scores, as shown in Fig. 7, where the scores are plotted against the probability density function. Images in the score range of 85 to 92, where the probability density for classified images is high (orange curve), can be transmitted.
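The article does not name its similarity metric, so the sketch below substitutes normalized cross-correlation rescaled to a 0-100 range as a hypothetical stand-in with the same band semantics; the 85-92 window comes from Fig. 7.

```python
import numpy as np

def similarity_score(reference, frame):
    """Similarity on a 0-100 scale via normalized cross-correlation.
    Assumed metric; the article's actual score is unspecified."""
    a = reference.astype(np.float64).ravel()
    b = frame.astype(np.float64).ravel()
    a = a - a.mean()
    b = b - b.mean()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    if denom == 0:
        return 100.0  # both frames flat: treat as identical
    return 100.0 * max(0.0, float(a @ b) / denom)

def transmit_by_similarity(reference, frame, low=85.0, high=92.0):
    """Send frames whose score falls in the 85-92 band of Fig. 7."""
    return low <= similarity_score(reference, frame) <= high

ref = np.arange(16.0).reshape(4, 4)
print(transmit_by_similarity(ref, ref))  # False: score ~100, above the band
```

Note the inversion relative to naive intuition: a near-perfect score means "nothing changed", so those frames stay on the device.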
The fourth approach uses a density plot of the mean squared error (MSE) of images, as shown in Fig. 8, which plots probability density against MSE. The chart shows that classified images with MSE values in the range 18 to 27 have high probability density and can be transmitted.
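The MSE filter is the simplest of the four; a minimal sketch, with the 18-27 band taken from Fig. 8 and everything else assumed:

```python
import numpy as np

def mse(reference, frame):
    """Mean squared pixel error against the reference image."""
    d = frame.astype(np.float64) - reference.astype(np.float64)
    return float(np.mean(d ** 2))

def transmit_by_mse(reference, frame, low=18.0, high=27.0):
    """Transmit frames whose MSE lies in the 18-27 band of Fig. 8."""
    return low <= mse(reference, frame) <= high

ref = np.zeros((8, 8))
print(transmit_by_mse(ref, ref + 5))  # True:  MSE = 25, in band
print(transmit_by_mse(ref, ref + 1))  # False: MSE = 1, below band
```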
Implementing these approaches reduces the amount of data transmitted and yields cost savings across networking, computing, and storage. With fewer images sent for classification, the number of compute engines required in the cloud, which run custom deep neural networks and machine learning algorithms, can be significantly reduced. This results in more efficient resource utilization and lowers overall computational cost compared to the previous scenario, in which all 172,800 images, or 121 GB of data, were transmitted for just one camera.
In our current project, we reduced our data, computing, and storage requirements by roughly 96.6% from a peak of 3 TB per day.