How to Do Face Detection with dlib (HOG and CNN)

Dlib is an open-source software library primarily written in C++, with Python bindings available. It provides a wide range of tools and algorithms for machine learning, computer vision, and image processing tasks. Developed by Davis King, Dlib is widely known for its efficiency, portability, and ease of use. Dlib offers two main face detectors: one based on Histogram of Oriented Gradients (HOG) features and one based on a Convolutional Neural Network (CNN).

Introduction to HOG and CNN:

1. Histogram of Oriented Gradients (HOG): HOG is a feature descriptor technique used for object detection in computer vision. It works by calculating the distribution of gradient orientations in localized portions of an image. HOG divides an image into small cells, computes a histogram of gradient orientations within each cell, and then normalizes these histograms over larger, overlapping blocks of cells. The resulting feature vector represents the distribution of gradient orientations in the image, providing meaningful information about local object shape and texture. HOG has been widely used in pedestrian detection, face detection, and other object recognition tasks.

2. Convolutional Neural Networks (CNN): CNNs are deep learning models specifically designed to process structured grids of data, such as images. They consist of multiple layers, including convolutional layers, pooling layers, and fully connected layers. CNNs automatically learn hierarchical patterns and features from raw pixel data, enabling them to effectively extract relevant features for various tasks, including face detection. CNNs have achieved remarkable success in computer vision tasks, surpassing traditional feature-based approaches in many cases.

Working Principle for Face Detection:

1. HOG-based Face Detection:

- Feature Extraction: The image is divided into small cells, and the gradient magnitude and orientation are computed for each pixel within these cells.

- Histogram Calculation: Histograms of gradient orientations are constructed for each cell.

- Normalization: The histograms are normalized within blocks to ensure invariance to changes in lighting and contrast.

- Sliding Window Detection: A sliding window technique is employed to scan the entire image, applying the HOG descriptor to each window. At each position, a classifier (typically a linear SVM) is used to determine whether the window contains a face or not.

- Post-processing: Detected face regions may undergo additional refinement steps, such as non-maximum suppression, which keeps the highest-scoring box among a group of overlapping detections and discards the rest.
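The feature-extraction and histogram steps above can be sketched in plain Python. This is a toy illustration, not dlib's implementation (dlib uses its own optimized HOG variant). The function names `cell_histogram` and `l2_normalize`, the 9-bin unsigned-orientation layout, and the tiny synthetic cell are all assumptions chosen for readability. For simplicity the sketch normalizes a single cell, whereas real HOG normalizes over blocks of several cells.

```python
import math

def cell_histogram(cell, bins=9):
    """Histogram of unsigned gradient orientations (0-180 degrees) for one
    grayscale cell given as a list of rows. Each interior pixel votes for
    its orientation bin with a weight equal to its gradient magnitude."""
    hist = [0.0] * bins
    bin_width = 180.0 / bins
    rows, cols = len(cell), len(cell[0])
    for y in range(1, rows - 1):
        for x in range(1, cols - 1):
            gx = cell[y][x + 1] - cell[y][x - 1]   # horizontal gradient
            gy = cell[y + 1][x] - cell[y - 1][x]   # vertical gradient
            magnitude = math.hypot(gx, gy)
            angle = math.degrees(math.atan2(gy, gx)) % 180.0
            hist[int(angle // bin_width) % bins] += magnitude
    return hist

def l2_normalize(vector, eps=1e-6):
    """Scale the vector to unit length (eps avoids division by zero)."""
    norm = math.sqrt(sum(v * v for v in vector) + eps * eps)
    return [v / norm for v in vector]

# Synthetic 8x8 cell with a vertical edge: dark on the left, bright on the right.
cell = [[0, 0, 0, 0, 255, 255, 255, 255] for _ in range(8)]
hist = cell_histogram(cell)
features = l2_normalize(hist)
```

For this cell the gradient points horizontally, so all of the weight lands in the first orientation bin. A full detector would concatenate many such normalized histograms into the feature vector scored by the classifier.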

2. CNN-based Face Detection:

- Training: A CNN model is trained on a large dataset of labeled face images. During training, the CNN learns to automatically extract relevant features for face detection from raw pixel data.

- Feature Extraction: The input image is fed into the CNN, and feature maps are computed through convolutional and pooling layers.

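- Detection: In a generic CNN classifier, the final layers are often fully connected layers followed by softmax or sigmoid activation functions. A detection network instead produces a score for each candidate region of the image, indicating how likely that region is to contain a face. The pre-trained model that dlib ships (mmod_human_face_detector.dat) was trained with dlib's max-margin object detection (MMOD) loss.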

- Non-maximum Suppression: Similar to HOG-based detection, post-processing steps like non-maximum suppression may be applied to refine the detected face regions and eliminate overlapping detections.
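Non-maximum suppression, mentioned for both detectors above, can be sketched as a greedy loop over boxes sorted by score. This is a generic textbook version and not necessarily how dlib implements it internally. The function names, the `(left, top, right, bottom)` box layout, the 0.5 overlap threshold, and the sample boxes are assumptions made for illustration.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (left, top, right, bottom)."""
    inter_w = max(0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0

def non_max_suppression(boxes, scores, iou_threshold=0.5):
    """Return indices of the boxes to keep, highest score first.
    A box is dropped if it overlaps an already-kept box too strongly."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[k]) <= iou_threshold for k in keep):
            keep.append(i)
    return keep

# Two heavily overlapping boxes and one separate box (made-up values).
boxes = [(10, 10, 110, 110), (20, 20, 120, 120), (200, 200, 300, 300)]
scores = [0.9, 0.8, 0.7]
kept = non_max_suppression(boxes, scores)
```

Here the second box overlaps the first with an IoU of about 0.68, so it is suppressed and only the first and third boxes remain.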


Code:

import dlib
import cv2        

  • This code imports the required libraries dlib and cv2 (OpenCV).

##Detecting faces with HOG (Histogram of Oriented Gradients)

image = cv2.imread('Images/istockphoto-1362120018-612x612.jpg')        

  • This line reads an image named 'istockphoto-1362120018-612x612.jpg'.

#To Display image
if image is not None:
    # Display the image
    cv2.imshow('image', image)

    # Wait for a key press and close the window
    cv2.waitKey(0)
    cv2.destroyAllWindows()
else:
    print("Error: Unable to read the image.")        

  • This block of code checks if the image is successfully loaded. If it is, it displays the image in a window titled 'image' using cv2.imshow().
  • Then it waits for a key press (cv2.waitKey(0)) and closes all OpenCV windows when any key is pressed (cv2.destroyAllWindows()).
  • If the image loading fails, it prints an error message.


face_detector_hog = dlib.get_frontal_face_detector()        

  • This line initializes the face detector using the Histogram of Oriented Gradients (HOG) method. It returns a face detector object.

detections = face_detector_hog(image, 1)        

  • This line detects faces in the image by calling the face_detector_hog object directly, as if it were a function. It returns a collection of rectangles, one per detected face.
  • The second parameter 1 indicates that the image will be upscaled once to potentially detect smaller faces.
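If you also want a score for each detection, the HOG detector object has a `run` method that returns the rectangles together with their scores. The sketch below is one possible way to wrap it. The helper names `filter_by_score` and `detect_faces_hog` and the default values are hypothetical. The sketch also converts the image from OpenCV's BGR channel order to RGB before detection, on the assumption that dlib expects RGB (or grayscale) input.

```python
def filter_by_score(rects, scores, min_score):
    """Keep only the rectangles whose detection score is at least min_score."""
    return [rect for rect, score in zip(rects, scores) if score >= min_score]

def detect_faces_hog(image_path, upsample=1, adjust_threshold=0.0):
    """Run dlib's HOG face detector and return (rectangles, scores).
    Hypothetical helper; the parameter names are chosen for illustration."""
    # Imported lazily so filter_by_score can be used without these packages.
    import cv2
    import dlib

    image_bgr = cv2.imread(image_path)
    if image_bgr is None:
        raise FileNotFoundError(f"Unable to read image: {image_path}")
    # cv2.imread returns BGR; convert to RGB before passing the image to dlib.
    image_rgb = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2RGB)

    detector = dlib.get_frontal_face_detector()
    # run() returns rectangles, scores, and the index of the sub-detector that fired.
    rects, scores, _ = detector.run(image_rgb, upsample, adjust_threshold)
    return list(rects), list(scores)
```

A call such as `rects, scores = detect_faces_hog('Images/istockphoto-1362120018-612x612.jpg')` followed by `filter_by_score(rects, scores, 0.5)` would keep only the more confident detections. The 0.5 cut-off is an arbitrary example value.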

for face in detections:
  #print(face)
  #print(face.left())
  #print(face.top())
  #print(face.right())
  #print(face.bottom())
  l, t, r, b = face.left(), face.top(), face.right(), face.bottom()
  cv2.rectangle(image, (l, t), (r, b), (0, 255, 0), 2)
          

  • This loop iterates over each detected face and draws a green rectangle around it on the original image.
  • It uses the coordinates of the detected face to draw the rectangle using cv2.rectangle().
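dlib can return rectangles that extend slightly past the image border, so a coordinate may be negative or larger than the image size. Drawing with cv2.rectangle() tolerates this, but if you later crop the face with array slicing, a negative index would wrap around. A minimal sketch of a clamping helper follows. The name `clip_box` and the example dimensions are assumptions.

```python
def clip_box(left, top, right, bottom, width, height):
    """Clamp box coordinates so they lie inside a width x height image."""
    left = max(0, min(left, width - 1))
    top = max(0, min(top, height - 1))
    right = max(0, min(right, width - 1))
    bottom = max(0, min(bottom, height - 1))
    return left, top, right, bottom

# Example with made-up values: a box that sticks out on the left and right.
clipped = clip_box(-5, 10, 650, 400, width=612, height=408)
```

Inside the loop you could call `clip_box(l, t, r, b, image.shape[1], image.shape[0])` before cropping, since image.shape holds (height, width, channels) for a color image loaded with OpenCV.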

#To Display image
if image is not None:
    # Display the image
    cv2.imshow('image', image)

    # Wait for a key press and close the window
    cv2.waitKey(0)
    cv2.destroyAllWindows()
else:
    print("Error: Unable to read the image.")        

  • It displays the image with the detected faces and waits for a key press to close the window. If the image loading fails, it prints an error message.

##Detecting faces with CNN (Convolutional Neural Networks)

image = cv2.imread('Images/istockphoto-1362120018-612x612.jpg')
cnn_detector = dlib.cnn_face_detection_model_v1('Weights/mmod_human_face_detector.dat')        

  • This line reads an image named 'istockphoto-1362120018-612x612.jpg' using cv2.imread().
  • It also initializes the Convolutional Neural Network (CNN) based face detector using the dlib.cnn_face_detection_model_v1() function. The model file 'mmod_human_face_detector.dat' is provided as input to load the pre-trained model.

detections = cnn_detector(image, 1)
for face in detections:
  l, t, r, b, c = face.rect.left(), face.rect.top(), face.rect.right(), face.rect.bottom(), face.confidence
  print(c)
  cv2.rectangle(image, (l, t), (r, b), (255, 255, 0), 2)        

  • This line detects faces in the image using the CNN face detector initialized in the previous step.
  • The second parameter 1 indicates that the image will be upscaled once to potentially detect smaller faces.
  • This loop iterates over each detected face and draws a cyan rectangle around it on the original image. OpenCV uses BGR color order, so (255, 255, 0) is cyan, not blue.
  • It also prints the confidence score (c) associated with each detected face.
  • It uses the coordinates of the detected face to draw the rectangle using cv2.rectangle().
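Because each CNN detection carries both a rectangle (`face.rect`) and a confidence value (`face.confidence`), it can be convenient to convert the results into plain tuples and drop weak detections in one step. The sketch below is one way to do that. The helper name `to_boxes`, the 0.5 cut-off, and the stand-in objects used to exercise it without loading the model are all assumptions.

```python
from types import SimpleNamespace

def to_boxes(detections, min_confidence=0.0):
    """Convert detections exposing .rect and .confidence into
    (left, top, right, bottom, confidence) tuples, skipping weak ones."""
    boxes = []
    for det in detections:
        if det.confidence >= min_confidence:
            r = det.rect
            boxes.append((r.left(), r.top(), r.right(), r.bottom(), det.confidence))
    return boxes

def fake_detection(l, t, r, b, c):
    """Stand-in that mimics the attributes used above (for illustration only)."""
    rect = SimpleNamespace(left=lambda: l, top=lambda: t,
                           right=lambda: r, bottom=lambda: b)
    return SimpleNamespace(rect=rect, confidence=c)

sample = [fake_detection(10, 20, 110, 120, 1.05), fake_detection(5, 5, 50, 50, 0.30)]
strong = to_boxes(sample, min_confidence=0.5)
```

With the real detector you would pass the result of `cnn_detector(image, 1)` to `to_boxes` instead of the stand-in list.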

#To Display image
if image is not None:
    # Display the image
    cv2.imshow('image', image)

    # Wait for a key press and close the window
    cv2.waitKey(0)
    cv2.destroyAllWindows()
else:
    print("Error: Unable to read the image.")        

  • This code block is similar to the ones seen before. It displays the image with the detected faces and waits for a key press to close the window. If the image loading fails, it prints an error message.


In summary, both the HOG and CNN approaches detect faces in images, but they differ in how they extract features and classify regions. HOG relies on handcrafted features and a traditional machine learning classifier, which keeps it lightweight. CNNs learn their features from data, which makes them more adaptable to different datasets and potentially more accurate, at a higher computational cost. Dlib provides implementations of both methods, allowing users to choose the one that best suits their requirements and constraints.


Full Code: https://github.com/TejasShastrakar/Computer_Vision.git



Hafsa Moontari Ali

Data Science, Machine/Deep learning enthusiastic || Looking for collaborations

2 months ago

Shouldn't face frontalization be performed prior to applying dlib?
