Working at the Frontier of Machine Learning and Computer Vision
Ahmed Alghadani
Co-founder & CEO @byanat | Building the technology that will enable better infrastructure in telecom, energy, and utilities. From cell towers to power grids.
In this article, I would like to share an exceptional experience from my work as a research assistant at the Embedded and Interconnected Vision Systems lab at Sultan Qaboos University from June 2020 to December 2020, where I gained invaluable hands-on experience in machine learning and computer vision through a range of real-world projects. These projects spanned Knowledge Transfer Partnerships (KTPs), governmental projects, private consultancy, and commercial tenders.
Overview
The projects encompassed detecting and estimating the area of oil spills from drone and satellite images, quality control and inspection on production lines for smart factories using industrial cameras, and identifying possible COVID-19 cases from a combination of thermal analysis and analysis of symptoms such as fever, coughing, and fatigue. While working on these projects I learnt a lot about being a team player and about planning strategically to meet tight deadlines during the COVID-19 pandemic. Below are some of the images from the oil spill project, where a simulated image is processed and the spill extracted.
In terms of technical knowledge, this experience was a treasure: camera calibration; correcting radial and tangential distortion using methods such as Zhang's and Bouguet's; developing GUI and visualization tools in Python; and translating coordinates from an RGB image to a thermal image of the same scene (a challenging research area worthy of a PhD in its own right, as with almost every other challenge here). On top of that, I experimented with many image processing techniques such as morphological transformations, adaptive mean thresholding, and adaptive Gaussian thresholding.
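To give a flavour of the math behind the distortion correction mentioned above, here is a minimal NumPy sketch of the radial part of the distortion model used in Zhang-style calibration. The function name and coefficient values are illustrative, not from the actual lab code; real pipelines would use a library such as OpenCV, which also handles the tangential terms.

```python
import numpy as np

def apply_radial_distortion(points, k1, k2):
    """Apply the two-coefficient radial distortion model.

    points: (N, 2) array of normalized image coordinates, with the
    origin at the principal point. k1 and k2 are the radial
    distortion coefficients estimated during calibration.
    """
    r2 = np.sum(points**2, axis=1, keepdims=True)  # squared radius per point
    factor = 1.0 + k1 * r2 + k2 * r2**2            # radial scaling term
    return points * factor

# Points near the optical centre barely move; points far from it are
# displaced more, producing the familiar barrel/pincushion effect.
pts = np.array([[0.01, 0.01], [0.8, 0.6]])
distorted = apply_radial_distortion(pts, k1=-0.2, k2=0.05)
```

Undistorting an image inverts this mapping, which is what the calibration step estimates k1 and k2 for.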
In the image above we see parts of the thermal camera calibration process for the COVID-19 project.
Exponential Learning Curve
The most challenging and rewarding part was the wide range of Artificial Neural Networks (ANNs) I worked with, including YOLOv4 for object and action recognition, 3D ResNet, and the Two-Stream Inflated 3D (I3D) CNN. This was the first time I encountered 3D CNNs and their ability to recognize spatio-temporal features (amazing!). Even better, the Two-Stream I3D CNN takes two inputs: a stream of RGB frames and their corresponding optical flow. It was also my first time dealing with optical flow, which estimates per-pixel motion between consecutive RGB frames, highlighting the pixels where an action happened. That's a lot of math, and it was the only code I wrote in C++; the rest of the machine learning work was all Python!
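The intuition behind the flow stream can be sketched with plain NumPy. True optical flow (e.g. the Farnebäck or TV-L1 algorithms typically used to feed I3D's flow stream) solves for a per-pixel (dx, dy) motion vector; the crude stand-in below only flags *where* motion happened via a temporal difference, which conveys the same core idea. The function name and synthetic frames are illustrative assumptions.

```python
import numpy as np

def motion_mask(frame_a, frame_b, threshold=0.1):
    """Flag pixels whose intensity changed between consecutive frames.

    A simplified stand-in for optical flow: it marks where motion
    occurred, without estimating the direction or magnitude.
    """
    diff = np.abs(frame_b.astype(float) - frame_a.astype(float))
    return diff > threshold

# Synthetic grayscale frames: a bright 2x2 "object" moves one pixel right.
a = np.zeros((6, 6)); a[2:4, 1:3] = 1.0
b = np.zeros((6, 6)); b[2:4, 2:4] = 1.0
mask = motion_mask(a, b)
# Only the trailing and leading edges of the moving object light up;
# the static background and the unchanged overlap column stay dark.
```

A real flow field would additionally tell the network that the motion was one pixel to the right, which is the extra signal the second I3D stream exploits.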
Below is an example of running I3D CNN inference on RGB video for COVID-19 symptoms.
COVID-19 symptom detection is treated as a Human Activity Recognition (HAR) problem. HAR is a time-series problem, and many researchers rely on analysing sensor data such as accelerometer readings. Be that as it may, attaching a sensor such as an accelerometer to a large group of people is not feasible for large-scale HAR; the alternative, in this case, is visual analysis. A human activity takes time to unfold, hence the temporal dimension in spatio-temporal three-dimensional (3D) kernels. Guess what? 3D CNNs proved to be challenging and computationally demanding. You see the image right above? That is a scene in RGB alongside its optical flow across both the x-axis and the y-axis. Beautiful, isn't it? It highlights only the pixels where activity is present. Math! Seriously, a lot of it in the optical flow alone.
The image above really simplifies the difference between a 2D and a 3D kernel for convolution in 2D and 3D CNNs. The increase in complexity is substantial: a kernel of side k grows from k² to k³ weights, and the convolution additionally slides along the temporal axis.
It Wasn't a Smooth Road!
You can see the calibration board on the laptop screen and the drone in the background; you might need to squint to spot the drone, as it blends in well.
Died for science!
When you need your tools the most :) If I remember correctly, this happened after I messed something up while installing Ubuntu in dual-boot mode alongside Windows 10. Alas, my laptop died while training YOLOv4 for more than 40 hours continuously, when the GPU burned out. RIP 2013-2020; it was a beast of a laptop.
Comes a new workstation to the rescue
My personal laptop's death marked the transition to a desktop PC, which was a nice change. This machine carried almost all of the experiments mentioned above. Given the diverse selection of ANNs, it was important to choose an ecosystem that could house the training and inference of these different architectures along with the corresponding frameworks and libraries. Everything ran on Ubuntu Linux, with the hardware and software ecosystem detailed in the table below. This configuration was selected based on the compatibility matrices published by the TensorFlow developers for GPU configuration and by the NVIDIA developers for the CUDA framework and its toolkits.
Reflection
Being a research assistant is a one-of-a-kind experience: you are at the frontier, pioneering R&D. You get the chance to experiment, figure out what works and what doesn't, document your results, and network with other researchers working on the same problem at international conferences. The best part? Your work gets to become a solution for someone else's problem!