Extracting Text from Images Using Python: A Guide to OCR with EasyOCR

Introduction

Have you ever needed to extract text from an image (perhaps a street sign, a receipt, or a scanned document) without retyping everything by hand? Optical Character Recognition (OCR) solves this problem by converting images that contain text into machine-readable formats. While Tesseract is a popular OCR tool, other powerful alternatives can sometimes yield better results on more complex tasks.

In this article, we’ll explore how to extract text from images using EasyOCR, a Python-based OCR library that supports over 80 languages. EasyOCR is simpler to set up than Tesseract and performs better in some cases, particularly with images containing irregular fonts or complex layouts.

Let’s dive into what OCR is, the advantages of using EasyOCR, and how to implement it in Python.

What is OCR?

OCR, or Optical Character Recognition, is the process of identifying and extracting text from images. It’s widely used in various applications, such as:

  • Digitizing paper documents like invoices, receipts, or books.
  • Extracting text from street signs or license plates for autonomous vehicles.
  • Automating data entry for scanned forms or documents.

Advantages of Using EasyOCR

EasyOCR offers several benefits compared to traditional OCR tools like Tesseract:

  1. Supports over 80 languages: This includes complex scripts like Chinese, Japanese, Korean, and Arabic.
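  2. Often better accuracy: In some cases it outperforms Tesseract, particularly on distorted text or text photographed in natural scenes. Results on handwritten text vary, so test on your own images.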
  3. Easy to set up: Unlike Tesseract, which requires you to install external binaries, EasyOCR can be installed and run directly via Python.
  4. Handles complex layouts: EasyOCR can recognize text in various fonts, sizes, and orientations, making it ideal for documents with mixed formats.

Getting Started with EasyOCR in Python

To start using EasyOCR, you’ll first need to install the library. Here’s how to set it up:

Step 1: Install EasyOCR and Other Dependencies

You can install EasyOCR using pip:

pip install easyocr

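EasyOCR is built on top of PyTorch (torch). pip normally pulls it in, along with OpenCV, as a dependency of easyocr, but you can also install it explicitly, for example to choose a specific build. The script in Step 2 also uses matplotlib to display the result:

pip install torch torchvision matplotlib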
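
Before moving on, it can help to confirm that the packages are importable. The following is a minimal sketch that uses only the standard library. The helper name missing_packages is made up for this illustration, and the list of module names simply mirrors the imports used in the script in Step 2.

```python
from importlib.util import find_spec


def missing_packages(module_names):
    """Return the module names that cannot be found in this environment."""
    return [name for name in module_names if find_spec(name) is None]


# Import names used in this article (note: OpenCV is imported as cv2).
required = ["easyocr", "torch", "cv2", "matplotlib"]

missing = missing_packages(required)
if missing:
    print("Missing modules:", ", ".join(missing))
else:
    print("All required modules are available.")
```

If anything is reported as missing, install it with pip and run the check again.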

Step 2: Writing Python Code to Extract Text from Images

Let’s write a simple Python script to load an image and extract text using EasyOCR.

import easyocr
import cv2
import matplotlib.pyplot as plt

# Initialize the EasyOCR reader
reader = easyocr.Reader(['es'])  # Spanish model; change the language code if needed

# Path to the image (replace with your own file)
image_path = r'C:\Users\kevin\OneDrive\Desktop\youtube_scripts\Copia de Curso Inversion en Bolsa (1).png'

# Load the image with OpenCV so we can draw on it later
image = cv2.imread(image_path)
if image is None:
    raise FileNotFoundError(f"Could not read image: {image_path}")

# Run text detection and recognition on the image
results = reader.readtext(image_path)

# Print the results to the terminal
for (bbox, text, prob) in results:
    print(f"Detected text: {text} with confidence {prob:.4f}")

# Annotate the image with bounding boxes
for (bbox, text, prob) in results:
    # Unpack the bounding box (first and third corner points)
    top_left = tuple(int(val) for val in bbox[0])
    bottom_right = tuple(int(val) for val in bbox[2])

    # Draw the rectangle
    cv2.rectangle(image, top_left, bottom_right, (0, 255, 0), 2)

    # Write the detected text just above the box
    cv2.putText(image, text, (top_left[0], top_left[1] - 10), cv2.FONT_HERSHEY_SIMPLEX, 0.8, (0, 255, 0), 2)

# Display the annotated image (OpenCV uses BGR, matplotlib expects RGB)
plt.imshow(cv2.cvtColor(image, cv2.COLOR_BGR2RGB))
plt.axis('off')
plt.show()

Explanation of the Code

  1. EasyOCR Reader: We initialize the EasyOCR reader by loading the Spanish model (['es']). Use ['en'] for English, or pass several languages at once (e.g., ['en', 'fr', 'es']).
  2. Loading the Image: We load the image with OpenCV (cv2) so that we can draw on it later.
  3. Text Detection: The reader.readtext() function processes the image and returns a list of results, each containing a bounding box, the detected text, and a confidence score.
  4. Annotating the Image: We loop through the results and draw bounding boxes around the detected text. We also annotate the image with the extracted text.
  5. Displaying the Image: Using matplotlib, we visualize the image with the detected text marked.
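
To make the shape of the results concrete: each bounding box is a list of four [x, y] corner points. Using only the first and third corners, as the script does, works well for upright text. For tilted text, taking the minimum and maximum over all four corners is a safer way to get an enclosing rectangle. The sketch below illustrates this; sample_results is invented data in the shape readtext() returns (not real OCR output), and to_rectangle is a hypothetical helper.

```python
def to_rectangle(bbox):
    """Convert a four-point box into (top_left, bottom_right) integer corners."""
    xs = [point[0] for point in bbox]
    ys = [point[1] for point in bbox]
    return (int(min(xs)), int(min(ys))), (int(max(xs)), int(max(ys)))


# Invented sample in the same shape as readtext() output (not real OCR results).
sample_results = [
    ([[10, 20], [110, 20], [110, 50], [10, 50]], "Hello", 0.98),   # upright box
    ([[30, 80], [128, 70], [131, 98], [33, 108]], "world", 0.85),  # slightly rotated box
]

for bbox, text, prob in sample_results:
    top_left, bottom_right = to_rectangle(bbox)
    print(f"{text!r} ({prob:.2f}): {top_left} -> {bottom_right}")
```

The two tuples returned by to_rectangle can be passed to cv2.rectangle in place of top_left and bottom_right.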

Example Output

Here’s what the output looks like after running the code:

  • The terminal will display the detected text and the confidence score for each piece of text in the image.
  • The image will be displayed with bounding boxes drawn around the text, making it easy to visualize what the OCR tool has recognized.
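
If you want one block of text rather than separate fragments, the results can be sorted into a rough reading order and joined. This is a naive sketch: it sorts by the top-left corner of each box, which is enough for simple single-column images but not for multi-column layouts or lines whose fragments sit at slightly different heights. The helper results_to_text and the sample data are made up for illustration.

```python
def results_to_text(results):
    """Join detected fragments top-to-bottom, then left-to-right."""
    # Sort by the top-left corner of each box: y first (row), then x (column).
    ordered = sorted(results, key=lambda item: (item[0][0][1], item[0][0][0]))
    return "\n".join(text for _, text, _ in ordered)


# Invented sample in the same shape as readtext() output (not real OCR results).
sample_results = [
    ([[10, 60], [90, 60], [90, 85], [10, 85]], "Total: 12.50", 0.91),
    ([[10, 10], [120, 10], [120, 40], [10, 40]], "RECEIPT", 0.99),
    ([[100, 60], [140, 60], [140, 85], [100, 85]], "EUR", 0.42),
]

print(results_to_text(sample_results))
```

If you only need the strings and not the boxes or scores, readtext() also accepts detail=0, which returns just the detected text.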

Customizing EasyOCR for Better Results

EasyOCR can be customized in a few different ways to improve results depending on your specific use case.

  1. Multiple Languages: You can pass multiple languages to the reader, such as ['en', 'es'], to handle multilingual text. For example, if your images contain both English and Spanish text, EasyOCR can recognize both in a single pass. Not every combination of languages can be loaded together, although English is compatible with all of them.

reader = easyocr.Reader(['en', 'es'])

  2. Confidence Threshold: If you’re only interested in highly confident predictions, you can filter out results based on the confidence score.

for (bbox, text, prob) in results:
    if prob > 0.7:  # Only display results with confidence greater than 70%
        print(f"Detected text: {text} with confidence {prob:.4f}")

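  3. Improving Accuracy with Image Preprocessing: Like other OCR tools, EasyOCR benefits from clean, high-contrast images. You can preprocess your images (e.g., by converting to grayscale or increasing contrast) and pass the processed array to reader.readtext(), which may improve the accuracy of the OCR results.

# Convert image to grayscale
gray_image = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)

# Apply thresholding to make the text stand out (150 is a starting point; tune it per image)
_, binary_image = cv2.threshold(gray_image, 150, 255, cv2.THRESH_BINARY)

# readtext() also accepts a NumPy array, so the preprocessed image can be passed directly
results = reader.readtext(binary_image)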
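
To show what those two OpenCV calls compute, here is a pure-Python sketch on a tiny made-up image. It assumes the standard luma weights (0.299 R + 0.587 G + 0.114 B) that OpenCV documents for its BGR-to-gray conversion, and the THRESH_BINARY rule that pixels strictly above the threshold become the maximum value while everything else becomes 0. The helper names to_gray and binarize are hypothetical; in real code, use the OpenCV functions, which are far faster.

```python
def to_gray(bgr_pixel):
    """Weighted sum of the channels; note OpenCV stores pixels as (B, G, R)."""
    b, g, r = bgr_pixel
    return round(0.299 * r + 0.587 * g + 0.114 * b)


def binarize(gray_rows, threshold=150, max_value=255):
    """Pixels strictly above the threshold become max_value, the rest 0."""
    return [[max_value if p > threshold else 0 for p in row] for row in gray_rows]


# Made-up 2x3 image: dark "ink" pixels on a light, slightly uneven background.
tiny_bgr = [
    [(250, 250, 250), (30, 30, 30), (200, 210, 190)],
    [(40, 35, 45), (160, 170, 180), (20, 20, 20)],
]

gray = [[to_gray(px) for px in row] for row in tiny_bgr]
binary = binarize(gray)
for row in binary:
    print(row)
```

After thresholding, every pixel is either pure black or pure white, which is why a badly chosen threshold can erase faint text. If results get worse after binarization, try a different threshold or skip this step.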

Handling Challenges with OCR

OCR is a powerful tool, but it’s not without its challenges:

  1. Varied Text Styles: Text in different fonts, orientations, and styles is harder for OCR engines to detect accurately.
  2. Complex Layouts: Documents that mix text with images, tables, or graphs can confuse OCR engines, leading to lower accuracy.
  3. Image Quality: Poor lighting, noise, or low resolution in images can hinder OCR performance.

EasyOCR generally performs well with text in multiple fonts and complex layouts, but preprocessing steps such as binarization or image sharpening can further improve the results in difficult cases.

Conclusion

EasyOCR is a simple yet powerful tool for extracting text from images in Python. With its ability to handle multiple languages and complex layouts, it provides an excellent alternative to more traditional OCR tools like Tesseract. By integrating it into your workflow, you can automate the process of text extraction from images, saving both time and effort.

Whether you’re working on digitizing paper records, analyzing street signs, or extracting text from screenshots, EasyOCR can help you get the job done. Try it out today and see how it can transform the way you handle image-based text!


Are you looking for a data consultant?

Write to me at https://www.dhirubhai.net/in/kevin-meneses-897a28127/
