Open-source library for detecting image faults
Sankalp Varshney
Computer Vision Researcher @Siemens | A.I & D.L | Cassandra | Tensorflow | Edge Devices | Ex Efkon | Ex C-DAC
In computer vision, one of the most challenging and time-consuming tasks is validating an image dataset and detecting issues in it before training a deep learning model. This typically requires human intervention or manual inspection, which becomes slow and demanding when the dataset is large.
Now, with the assistance of the CleanVision library, we can automate this process and complete it in significantly less time.
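The CleanVision library helps detect common issues in an image dataset, such as blurry, dark, overly bright, low-information, and odd-sized images, as well as exact and near duplicates.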
As we all know, if our training image dataset contains defects, it can adversely impact the performance of our deep learning model, leading to poor results.
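To give a sense of what even one such check involves when written by hand, here is a minimal sketch that finds exact duplicates by hashing file contents. It is only an illustration under simple assumptions (the function name `find_exact_duplicates` is hypothetical, and this is not how CleanVision implements its checks); it also misses near duplicates, blur, and brightness problems entirely.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def find_exact_duplicates(folder):
    """Group files under `folder` whose bytes are identical.

    Hypothetical helper for illustration; not part of CleanVision.
    """
    groups = defaultdict(list)
    for path in sorted(Path(folder).rglob("*")):
        if path.is_file():
            # Identical file contents always produce the same digest
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            groups[digest].append(str(path.relative_to(folder)))
    # Keep only digests shared by two or more files
    return [names for names in groups.values() if len(names) > 1]
```

Calling `find_exact_duplicates("FOLDER_WITH_IMAGES/")` returns a list of groups, each group holding the relative paths of byte-identical files. Every additional defect type would need its own hand-written check like this one.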
By adding just 4-5 lines of code, we can automate this task and significantly reduce the manual effort the team spends on data validation and defect checks.
from cleanvision.imagelab import Imagelab
# Specify path to folder containing the image files in your dataset
imagelab = Imagelab(data_path="FOLDER_WITH_IMAGES/")
# Automatically check for a predefined list of issues within your dataset
imagelab.find_issues()
# Produce a neat report of the issues found in your dataset
imagelab.report()
I have also invested time in exploring this library for detecting defects in image datasets, specifically focusing on datasets from Hugging Face and TorchVision. I meticulously documented my experiments in a Jupyter notebook. The complete project, including the Jupyter notebook, can be found at the following link.
Credit for this remarkable library goes to Cleanlab, the team that developed it.