Extracting Text from Images Using Python: A Guide to OCR with EasyOCR
Kevin Meneses
SFMC Consultant | SAP CX Senior Consultant | SAP Sales and Service Cloud | CPI | CDC | Qualtrics | Data Analyst and ETL | Marketing Automation | SAP Marketing Cloud and Emarsys
Introduction
Have you ever needed to extract text from an image (perhaps a street sign, a receipt, or a scanned document) but didn’t want to retype everything manually? Optical Character Recognition (OCR) solves this problem by converting images that contain text into machine-readable formats. While Tesseract is a popular OCR tool, there are other powerful alternatives that can sometimes yield better results for more complex tasks.
In this article, we’ll explore how to extract text from images using EasyOCR, a Python-based OCR library that supports over 80 languages. EasyOCR is simpler to set up than Tesseract and performs better in some cases, particularly with images containing irregular fonts or complex layouts.
Let’s dive into what OCR is, the advantages of using EasyOCR, and how to implement it in Python.
What is OCR?
OCR, or Optical Character Recognition, is the process of identifying and extracting text from images. It’s widely used in applications such as digitizing paper records, reading street signs and receipts, and extracting text from screenshots.
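Advantages of Using EasyOCR
EasyOCR offers several benefits compared to traditional OCR tools like Tesseract: it supports over 80 languages, it is simpler to set up, and in some cases it handles irregular fonts and complex layouts better.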
Getting Started with EasyOCR in Python
To start using EasyOCR, you’ll first need to install the library. Here’s how to set it up:
Step 1: Install EasyOCR and Other Dependencies
You can install EasyOCR using pip:
pip install easyocr
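EasyOCR is built on top of PyTorch. pip normally installs torch and torchvision automatically as EasyOCR dependencies, but you can also install them explicitly, for example if you need a specific build: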
pip install torch torchvision
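Before moving on, it can help to confirm that the modules used later in this article are importable. The following is a minimal sketch using only the standard library; the helper name missing_packages is hypothetical, and the module list simply mirrors the imports in the script below (cv2 is the import name of OpenCV, and matplotlib may need its own pip install).

```python
import importlib.util


def missing_packages(module_names):
    """Return the module names that cannot be found in the current environment."""
    return [name for name in module_names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Top-level import names used by the script in this article.
    required = ["easyocr", "torch", "cv2", "matplotlib"]
    missing = missing_packages(required)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All required modules are importable.")
```

If anything is reported as missing, install it with pip before running the OCR script.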
Step 2: Writing Python Code to Extract Text from Images
Let’s write a simple Python script to load an image and extract text using EasyOCR.
import easyocr
import cv2
import matplotlib.pyplot as plt

# Initialize the EasyOCR reader
reader = easyocr.Reader(['es'])  # Change the language if needed

# Path to the image
image_path = r'C:\Users\kevin\OneDrive\Desktop\youtube_scripts\Copia de Curso Inversion en Bolsa (1).png'

# Load the image (cv2.imread returns None if the file cannot be read)
image = cv2.imread(image_path)
if image is None:
    raise FileNotFoundError(f"Could not read image: {image_path}")

# Run text detection and recognition on the image
results = reader.readtext(image_path)

# Print the results to the terminal
for (bbox, text, prob) in results:
    print(f"Detected text: {text} with confidence {prob:.4f}")

# Annotate the image with bounding boxes
for (bbox, text, prob) in results:
    # Unpack the bounding box: corner 0 is top-left, corner 2 is bottom-right
    top_left = tuple(int(val) for val in bbox[0])
    bottom_right = tuple(int(val) for val in bbox[2])
    # Draw the rectangle
    cv2.rectangle(image, top_left, bottom_right, (0, 255, 0), 2)
    # Write the recognized text just above the box
    cv2.putText(image, text, (top_left[0], top_left[1] - 10), cv2.FONT_HERSHEY_SIMPLEX, 0.8, (0, 255, 0), 2)

# Show the annotated image (OpenCV uses BGR, matplotlib expects RGB)
plt.imshow(cv2.cvtColor(image, cv2.COLOR_BGR2RGB))
plt.axis('off')
plt.show()
Explanation of the Code
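Each item returned by readtext is a (bbox, text, prob) tuple, where bbox lists the four corner points of the detected text region. The script uses corners 0 and 2 as the top-left and bottom-right points, which works well for upright text. The sketch below, with a hypothetical helper bbox_to_rect and made-up sample values, shows one way to compute an enclosing rectangle from all four corners so that slightly tilted text is still fully boxed.

```python
def bbox_to_rect(bbox):
    """Return (x_min, y_min, x_max, y_max) enclosing all four corner points."""
    xs = [int(point[0]) for point in bbox]
    ys = [int(point[1]) for point in bbox]
    return min(xs), min(ys), max(xs), max(ys)


# Made-up sample in the same shape as one readtext() result.
sample_result = ([[189, 75], [469, 75], [469, 165], [189, 165]], "Hola", 0.98)
bbox, text, prob = sample_result
x_min, y_min, x_max, y_max = bbox_to_rect(bbox)
print(f"{text!r} spans ({x_min}, {y_min}) to ({x_max}, {y_max})")
```

The points (x_min, y_min) and (x_max, y_max) could then be passed to cv2.rectangle in place of top_left and bottom_right.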
Example Output
Here’s what the output looks like after running the code:
Customizing EasyOCR for Better Results
EasyOCR can be customized in a few different ways to improve results depending on your specific use case.
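1. Multiple Languages: If your images contain text in more than one language, you can pass several language codes when creating the reader.
reader = easyocr.Reader(['en', 'es'])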
2. Confidence Threshold: If you’re only interested in highly confident predictions, you can filter out results based on the confidence score.
for (bbox, text, prob) in results:
    if prob > 0.7:  # Only display results with confidence greater than 70%
        print(f"Detected text: {text} with confidence {prob:.4f}")
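The same filter can be wrapped in a small reusable function. This is a minimal sketch, assuming results has the (bbox, text, prob) shape shown earlier; the function name filter_by_confidence and the sample data are made up for illustration, so the example runs without an image.

```python
def filter_by_confidence(results, min_confidence=0.7):
    """Keep only the (bbox, text, prob) tuples whose confidence exceeds the threshold."""
    return [item for item in results if item[2] > min_confidence]


# Made-up results in the same shape as the output of reader.readtext().
sample_results = [
    ([[0, 0], [80, 0], [80, 20], [0, 20]], "Curso", 0.95),
    ([[0, 30], [80, 30], [80, 50], [0, 50]], "lnversi0n", 0.41),
    ([[0, 60], [80, 60], [80, 80], [0, 80]], "Bolsa", 0.88),
]

confident = filter_by_confidence(sample_results)
print(" ".join(text for _, text, _ in confident))  # prints: Curso Bolsa
```

A higher threshold drops more noise but can also discard correct text, so the right value depends on your images.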
3. Improving Accuracy with Image Preprocessing: Like other OCR tools, EasyOCR benefits from clean, high-contrast images. You can preprocess your images (e.g., by converting to grayscale or increasing contrast) to improve the accuracy of the OCR results.
# Convert image to grayscale
gray_image = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
# Apply thresholding to make the text stand out
_, binary_image = cv2.threshold(gray_image, 150, 255, cv2.THRESH_BINARY)
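To make the thresholding step concrete, here is a pure-Python illustration of the rule that cv2.THRESH_BINARY applies: pixels brighter than the threshold become the maximum value and all others become 0. It is only a sketch of the rule, not how OpenCV implements it; the function name threshold_binary and the tiny sample image are made up.

```python
def threshold_binary(gray_rows, thresh=150, max_value=255):
    """Map each pixel to max_value if it is greater than thresh, otherwise to 0."""
    return [[max_value if pixel > thresh else 0 for pixel in row] for row in gray_rows]


# A made-up 2x3 grayscale image as nested lists of 0-255 intensities.
tiny_gray = [
    [12, 149, 150],
    [151, 200, 255],
]
print(threshold_binary(tiny_gray))  # prints: [[0, 0, 0], [255, 255, 255]]
```

In the real pipeline, binary_image is a NumPy array; per the EasyOCR documentation, readtext accepts an image array as well as a file path, so the preprocessed image can be passed to it directly.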
Results
Handling Challenges with OCR
OCR is a powerful tool, but it’s not without its challenges.
EasyOCR generally performs well with text in multiple fonts and complex layouts, but preprocessing steps such as binarization or image sharpening can further improve the results in difficult cases.
Conclusion
EasyOCR is a simple yet powerful tool for extracting text from images in Python. With its ability to handle multiple languages and complex layouts, it provides an excellent alternative to more traditional OCR tools like Tesseract. By integrating it into your workflow, you can automate the process of text extraction from images, saving both time and effort.
Whether you’re working on digitizing paper records, analyzing street signs, or extracting text from screenshots, EasyOCR can help you get the job done. Try it out today and see how it can transform the way you handle image-based text!