Role of OCR in machine learning

Why read this?

OCR is used to recognize street signs (Google Street View) and to search through photos (Dropbox). If you would like to know how it works, this document helps.


Technical explanation

OCR, or optical character recognition, is one of the earliest computer vision tasks to be addressed, since in some respects it does not require deep learning. Below is a dataset of house numbers extracted from Google Street View.


[Figure: house numbers extracted from Google Street View]




OCR flow

The flow below shows how an image is processed to extract characters.

[Figure: OCR processing flow]
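
As a rough illustration only, a classical OCR flow is often described as binarization, then character segmentation, then recognition of each segment. The sketch below assumes that flow (it is not necessarily the one in the figure) and covers only the first two stages, in plain Python on a tiny hand-made grayscale image. The names `binarize` and `segment_columns` and the threshold of 128 are hypothetical choices for illustration.

```python
def binarize(gray, threshold=128):
    """Map a grayscale image (rows of 0-255 ints) to 1 = ink, 0 = background.

    Assumes dark text on a light background.
    """
    return [[1 if px < threshold else 0 for px in row] for row in gray]


def segment_columns(binary):
    """Split a binary image into character spans using a vertical projection.

    Returns (start, end) column ranges, end exclusive, one for each run of
    adjacent columns that contain at least one ink pixel.
    """
    width = len(binary[0])
    ink_per_column = [sum(row[x] for row in binary) for x in range(width)]
    spans, start = [], None
    for x, ink in enumerate(ink_per_column):
        if ink and start is None:
            start = x                     # a character begins
        elif not ink and start is not None:
            spans.append((start, x))      # a character ends
            start = None
    if start is not None:                 # character touching the right edge
        spans.append((start, width))
    return spans


# Tiny 3x7 "image": two dark blobs separated by a light gap.
gray = [
    [255, 10, 20, 255, 255, 30, 255],
    [255, 15, 255, 255, 255, 25, 255],
    [255, 12, 18, 255, 255, 28, 255],
]
binary = binarize(gray)
spans = segment_columns(binary)
print(spans)  # [(1, 3), (5, 6)]
```

Each span would then be cropped and passed to a recognizer. Real images usually also need noise removal and skew correction, which this sketch leaves out.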


OCR and CAPTCHA


Most of today’s text CAPTCHAs are not very hard to solve, especially if we don’t try to solve all of them at once.

[Figure: text CAPTCHA]


Applications
  • Capturing text from moving objects, for example vehicle number plates
  • Capturing text in street-level imagery (see Google Street View)
  • Extracting text from PDFs
  • Digitisation of handwritten books (see MNIST, a dataset of handwritten digits)


Challenges in OCR
  • Variety of letters: Letter forms in some alphabets are harder to recognize. For example, even printed Arabic characters are in cursive form, which makes character recognition a challenge.
  • Variety of font types and sizes
  • Look-alike characters: For example, it is hard to differentiate between the number “0” and the letter “O”
  • Handwritten text
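
One common way to reduce look-alike errors is to post-process the recognizer's output using context. The sketch below assumes a simple rule that is not taken from any particular OCR engine: inside a token whose unambiguous characters are mostly digits, "O" is read as "0", and inside a token that is mostly letters, "0" is read as "O". The function names `fix_lookalikes` and `fix_text` are hypothetical.

```python
def fix_lookalikes(token):
    """Resolve 'O' vs '0' inside one token from its unambiguous characters."""
    # Count only characters that cannot be confused with each other.
    digits = sum(1 for ch in token if ch.isdigit() and ch != "0")
    letters = sum(1 for ch in token if ch.isalpha() and ch != "O")
    if digits > letters:
        return token.replace("O", "0")   # numeric context
    if letters > digits:
        return token.replace("0", "O")   # alphabetic context
    return token                         # no clear context: leave unchanged


def fix_text(text):
    """Apply the rule to each whitespace-separated token."""
    return " ".join(fix_lookalikes(tok) for tok in text.split())


print(fix_text("HELL0 W0RLD 2O21"))  # HELLO WORLD 2021
```

A rule like this fails on genuinely mixed tokens such as vehicle numbers, where a format template or a dictionary would be a better source of context.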


OCR evolution
[Figure: OCR evolution]


ML algorithms


CRNN

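CRNN (convolutional recurrent neural network) uses convolutional layers to extract image features and recurrent layers to read those features as a sequence.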

[Figure: CRNN]
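
A CRNN produces one character prediction per horizontal slice of the image, and in the original CRNN formulation these per-slice predictions are collapsed into a string with CTC decoding. Below is a minimal sketch of greedy CTC decoding, assuming the network's per-step scores are already available as plain lists. The alphabet, the blank index 0, and the score values are made up for illustration.

```python
BLANK = 0
ALPHABET = ["", "a", "c", "t"]  # index 0 is the CTC blank symbol


def ctc_greedy_decode(scores, alphabet=ALPHABET, blank=BLANK):
    """Collapse per-time-step scores into text with greedy CTC decoding.

    1. Take the highest-scoring class at each time step.
    2. Merge consecutive repeats of the same class.
    3. Drop blanks.
    """
    best_path = [max(range(len(step)), key=step.__getitem__) for step in scores]
    decoded = []
    previous = None
    for idx in best_path:
        if idx != previous and idx != blank:
            decoded.append(alphabet[idx])
        previous = idx
    return "".join(decoded)


# Six time steps whose best classes are: c, c, blank, a, t, t
scores = [
    [0.1, 0.1, 0.7, 0.1],
    [0.1, 0.1, 0.7, 0.1],
    [0.8, 0.1, 0.05, 0.05],
    [0.1, 0.7, 0.1, 0.1],
    [0.1, 0.1, 0.1, 0.7],
    [0.2, 0.1, 0.1, 0.6],
]
print(ctc_greedy_decode(scores))  # cat
```

The blank symbol is what lets the decoder keep genuinely doubled letters: the sequence a, blank, a decodes to "aa", while a, a decodes to "a".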


STN-net/SEE

[Figure: STN-net/SEE]


EAST

EAST, or Efficient and Accurate Scene Text Detector, is a deep learning model for detecting text in natural scene images.
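
Text detectors such as EAST typically emit many overlapping candidate boxes for the same piece of text, which are then filtered with non-maximum suppression (NMS). The sketch below is plain NMS on axis-aligned boxes, a simplification of the locality-aware, rotated-box variant described in the EAST paper. The box format `(x1, y1, x2, y2, score)`, the 0.5 IoU threshold, and the candidate values are assumptions for illustration.

```python
def iou(a, b):
    """Intersection over union of two axis-aligned boxes (x1, y1, x2, y2, ...)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, iou_threshold=0.5):
    """Keep the highest-scoring box of each group of overlapping boxes."""
    kept = []
    for box in sorted(boxes, key=lambda b: b[4], reverse=True):
        # Keep a box only if it does not overlap too much with a better one.
        if all(iou(box, k) <= iou_threshold for k in kept):
            kept.append(box)
    return kept


candidates = [
    (10, 10, 110, 40, 0.95),   # a word
    (12, 11, 112, 41, 0.80),   # near-duplicate of the same word
    (200, 50, 300, 80, 0.90),  # a different word
]
kept = nms(candidates)
print(kept)  # [(10, 10, 110, 40, 0.95), (200, 50, 300, 80, 0.9)]
```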


Python libraries

OpenCV provides a Python library (cv2) that can be used for the image-processing steps of OCR.


References
Thanks to these helping hands:
https://youtu.be/GA35F3N3i_I
https://mobidev.biz/blog/ocr-machine-learning-implementation
https://towardsdatascience.com/a-gentle-introduction-to-ocr-ee1469a201aa
https://medium.com/syncedreview/stn-ocr-a-single-neural-network-for-text-detection-and-text-recognition-220debe6ded4
https://research.aimultiple.com/ocr-technology/
https://images.app.goo.gl/85ABhFcLgnnxcu9LA
https://images.app.goo.gl/5ex2vHXavJmaRTW58