Open-Source Library for Detecting Image Faults

Sankalp Varshney

Computer Vision Researcher @Siemens | A.I & D.L | Cassandra | Tensorflow | Edge Devices | Ex Efkon | Ex C-DAC

Published: May 7, 2023

In computer vision, one of the most challenging and time-consuming tasks is validating an image dataset and detecting issues in it before training a deep learning model. Detecting these issues typically requires manual inspection, which becomes slow and demanding when the dataset is large.

With the CleanVision library, we can automate this process and complete it in significantly less time.

The CleanVision library helps detect the following issues in an image dataset:

Dark or low-light images
Overly bright images
Blurry images
Odd aspect ratio images
Low-information images
Exact duplicate images
Near-duplicate images
Grayscale images
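
To make one of these checks concrete, here is a minimal sketch of how exact duplicates could be found by hand with only the Python standard library. It is an illustration, not CleanVision's implementation: the function name find_exact_duplicates is hypothetical, and the sketch compares raw file bytes, so the same picture saved in two different formats would not be matched.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def find_exact_duplicates(folder):
    """Group files in `folder` whose bytes are identical (hypothetical helper)."""
    groups = defaultdict(list)
    for path in sorted(Path(folder).iterdir()):
        if path.is_file():
            # Identical bytes always produce the same SHA-256 digest.
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            groups[digest].append(path.name)
    # Keep only the digests shared by two or more files.
    return [names for names in groups.values() if len(names) > 1]
```

Each inner list in the result holds the file names that are byte-for-byte copies of one another. Checks such as blur or near-duplicate detection need image-level analysis, which is where a dedicated library saves the most effort.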

Defects in a training image dataset can degrade the performance of a deep learning model and lead to poor results.

With just a few lines of code, we can automate this task and substantially reduce the manual effort the team spends on data validation and defect checks.

from cleanvision.imagelab import Imagelab

# Specify path to folder containing the image files in your dataset
imagelab = Imagelab(data_path="FOLDER_WITH_IMAGES/")

# Automatically check for a predefined list of issues within your dataset
imagelab.find_issues()

# Produce a neat report of the issues found in your dataset
imagelab.report()
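
If only some checks are of interest, the run can be narrowed. The sketch below assumes, based on the CleanVision README at the time of writing, that find_issues accepts a dictionary mapping issue names such as "dark" and "blurry" to per-issue settings, and that per-image results are exposed as imagelab.issues. The helper names build_issue_types and run_selected_checks are hypothetical, so verify the details against your installed version.

```python
def build_issue_types(names):
    """Map each issue name to an empty settings dict (hypothetical helper).

    An empty dict is assumed to mean "use the library defaults" for that check.
    """
    return {name: {} for name in names}


def run_selected_checks(data_path, names):
    """Run only the named checks on a folder of images (hypothetical helper)."""
    # Imported here so build_issue_types works without CleanVision installed.
    from cleanvision.imagelab import Imagelab

    imagelab = Imagelab(data_path=data_path)
    # Assumption: the first argument of find_issues is the issue-type dict.
    imagelab.find_issues(build_issue_types(names))
    imagelab.report()
    # Assumption: `issues` holds the per-image results as a table.
    return imagelab.issues
```

For example, run_selected_checks("FOLDER_WITH_IMAGES/", ["dark", "blurry"]) would check only for dark and blurry images, which can shorten the run on a large dataset.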

I also explored this library on image datasets from Hugging Face and TorchVision and documented the experiments in a Jupyter notebook. The complete project, including the notebook, is available at the following link.

https://github.com/sankalpvarshney/cleanvision

Credit for this library goes to Cleanlab, the team that developed CleanVision.

