Real-time Face detection
What is Face Detection?
The goal of face detection is to determine whether there are any faces in an image or video. If multiple faces are present, each face is enclosed by a bounding box, which tells us where the faces are located.
Human faces are difficult to model because many variables can change, for example facial expression, orientation, lighting conditions and partial occlusions such as sunglasses, a scarf or a mask. The result of the detection gives the face location parameters, which may be required in various forms: for instance, a rectangle covering the central part of the face, the eye centers, or landmarks including the eyes, nose and mouth corners, eyebrows, nostrils, etc.
Face Detection Methods
There are two main approaches for Face Detection:
Feature-Based Approach
Objects are usually recognized by their unique features. A human face has many features that distinguish it from other objects. This approach locates faces by extracting structural features such as the eyes, nose and mouth, and then uses them to detect a face. Typically, a statistical classifier is trained to separate facial from non-facial regions. In addition, human faces have particular textures which can be used to differentiate between a face and other objects. Moreover, the edges of features can help to tell a face apart from other objects. In the coming section, we will implement a feature-based approach using OpenCV.
Image-Based Approach
In general, image-based methods rely on techniques from statistical analysis and machine learning to find the relevant characteristics of face and non-face images. The learned characteristics take the form of distribution models or discriminant functions, which are then used for face detection. This approach uses algorithms such as neural networks, HMM, SVM and AdaBoost learning. In the coming section, we will see how we can detect faces with MTCNN, or Multi-Task Cascaded Convolutional Neural Network, which is an image-based approach to face detection.
Face detection algorithm
One of the popular algorithms that uses a feature-based approach is the Viola-Jones algorithm, and here I am going to discuss it briefly. If you want to know about it in detail, I would suggest going through this article, Face Detection using Viola Jones Algorithm.
The Viola-Jones algorithm is named after two computer vision researchers, Paul Viola and Michael Jones, who proposed the method in 2001 in their paper “Rapid Object Detection using a Boosted Cascade of Simple Features”. Although it is an older framework, Viola-Jones is still quite powerful and remains well suited to real-time face detection. The algorithm is painfully slow to train but can detect faces in real time with impressive speed.
Given an image (the algorithm works on grayscale images), the algorithm looks at many smaller subregions and tries to find a face by looking for specific features in each subregion. It needs to check many different positions and scales because an image can contain many faces of various sizes. Viola and Jones used Haar-like features to detect faces in this algorithm.
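To make the idea of a Haar-like feature more concrete, here is a minimal sketch using NumPy. It assumes the simplest kind of feature, the difference between the pixel sums of two adjacent rectangles, and uses a summed-area table (integral image) so that each rectangle sum costs only four lookups. The function names are hypothetical and this is an illustration of the idea, not OpenCV's internal implementation.

```python
import numpy as np


def integral_image(gray):
    """Return a summed-area table with an extra leading row and column of zeros."""
    ii = np.zeros((gray.shape[0] + 1, gray.shape[1] + 1), dtype=np.int64)
    ii[1:, 1:] = gray.astype(np.int64).cumsum(axis=0).cumsum(axis=1)
    return ii


def rect_sum(ii, x, y, w, h):
    """Sum of the pixels in the rectangle with top-left (x, y), width w, height h."""
    return int(ii[y + h, x + w] - ii[y, x + w] - ii[y + h, x] + ii[y, x])


def two_rect_feature(ii, x, y, w, h):
    """Horizontal two-rectangle feature: right half minus left half."""
    half = w // 2
    left = rect_sum(ii, x, y, half, h)
    right = rect_sum(ii, x + half, y, half, h)
    return right - left


# A 4x4 toy "image": dark left half, bright right half
toy = np.array([[0, 0, 9, 9]] * 4, dtype=np.uint8)
ii = integral_image(toy)
print(two_rect_feature(ii, 0, 0, 4, 4))  # 72
```

A large positive or negative value means the two halves differ strongly in brightness, which is the kind of contrast pattern (for example, eyes darker than cheeks) that such features respond to.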
Face Recognition
The terms face detection and face recognition are often used interchangeably, but they are quite different. In fact, face detection is just one part of face recognition.
Face recognition is a method of identifying or verifying the identity of an individual using their face. There are various algorithms that can do face recognition, but their accuracy varies.
The article Face Recognition Python shows how to implement face recognition using deep learning.
Face Detection using OpenCV
In this section, we are going to use OpenCV to do real-time face detection from a live stream via our webcam.
As you know, videos are basically made up of frames, which are still images. We perform face detection on each frame of a video, so there is not much difference between detecting a face in a still image and detecting a face in a real-time video stream.
We will be using the Haar Cascade algorithm, also known as the Viola-Jones algorithm, to detect faces. It is a machine learning object detection algorithm that is used to identify objects in an image or video. OpenCV ships with several trained Haar Cascade models, which are saved as XML files. Instead of creating and training a model from scratch, we use one of these files. We are going to use the “haarcascade_frontalface_alt2.xml” file in this project. Now let us start coding this up.
The first step is to find the path to the “haarcascade_frontalface_alt2.xml” file. We do this by using the os module of Python language.
import cv2
import os
# Build the path to the cascade file that ships inside the OpenCV package
cascPath = os.path.dirname(
    cv2.__file__) + "/data/haarcascade_frontalface_alt2.xml"
The next step is to load our classifier. The path to the above XML file goes as an argument to CascadeClassifier() method of OpenCV.
faceCascade = cv2.CascadeClassifier(cascPath)
After loading the classifier, let us open the webcam using this simple OpenCV one-liner.
video_capture = cv2.VideoCapture(0)
Next, we need to get the frames from the webcam stream, which we do using the read() function. We call it in an infinite loop to keep getting frames until we want to close the stream.
while True:
    # Capture frame-by-frame
    ret, frame = video_capture.read()
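The read() function returns two values: a return code and the video frame that was read.
The return code tells us whether a frame was actually read. It becomes False when we run out of frames, which will happen if we are reading from a file. With a webcam we can record indefinitely, so we ignore it here, although it can also be False if the camera cannot be read.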
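Although the return code can be ignored for a quick webcam demo, checking it is a cheap safeguard. The sketch below shows one way to stop the loop as soon as read() reports failure. The read_frames helper and the FakeCapture class are hypothetical names introduced only for illustration; FakeCapture stands in for cv2.VideoCapture so the example runs without a camera.

```python
def read_frames(capture):
    """Yield frames from a capture object until read() reports failure."""
    while True:
        ret, frame = capture.read()
        if not ret:
            break
        yield frame


class FakeCapture:
    """Hypothetical stand-in for cv2.VideoCapture, used only for illustration."""

    def __init__(self, frames):
        self._frames = list(frames)

    def read(self):
        if self._frames:
            return True, self._frames.pop(0)
        return False, None


frames = list(read_frames(FakeCapture(["frame0", "frame1"])))
print(frames)  # ['frame0', 'frame1']
```

With a real webcam, the same generator could be given the video_capture object instead of the fake one.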
For this specific classifier to work, we need to convert the frame into greyscale.
gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
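For intuition, the grayscale conversion is a weighted sum of the blue, green and red channels. The sketch below reproduces that weighting with NumPy, assuming the standard luma weights 0.299 (red), 0.587 (green) and 0.114 (blue); bgr_to_gray is a hypothetical helper, and its results may differ from OpenCV's by one gray level because of rounding.

```python
import numpy as np


def bgr_to_gray(frame):
    """Weighted sum of the B, G, R channels using the standard luma weights."""
    weights = np.array([0.114, 0.587, 0.299])  # order matches B, G, R
    gray = frame.astype(np.float64) @ weights
    return np.clip(np.rint(gray), 0, 255).astype(np.uint8)


# One blue, one green, one red and one white pixel, in BGR order
pixels = np.array([[[255, 0, 0], [0, 255, 0], [0, 0, 255], [255, 255, 255]]],
                  dtype=np.uint8)
print(bgr_to_gray(pixels))  # [[ 29 150  76 255]]
```

Green contributes most to the result, which is why a pure green pixel ends up much brighter in grayscale than a pure blue one.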
The faceCascade object has a method detectMultiScale(), which receives a frame (image) as an argument and runs the classifier cascade over the image. The term MultiScale indicates that the algorithm looks at subregions of the image at multiple scales in order to detect faces of varying sizes.
faces = faceCascade.detectMultiScale(gray,
                                     scaleFactor=1.1,
                                     minNeighbors=5,
                                     minSize=(60, 60),
                                     flags=cv2.CASCADE_SCALE_IMAGE)
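Let us go through the arguments of this function:
gray: the grayscale image in which to search for faces.
scaleFactor: how much the image size is reduced at each scale step. A value of 1.1 means a 10% reduction per step; smaller values check more scales, which is slower but can find more faces.
minNeighbors: how many overlapping candidate detections a region needs before it is kept as a face. Higher values give fewer false positives but may miss some faces.
minSize: the smallest face size, in pixels, that will be reported. Here anything smaller than 60x60 is ignored.
flags: a legacy option that only applies to cascades in the old format; it is not used for newer cascades.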
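To make the scaleFactor and minSize arguments more concrete, the sketch below lists the face sizes a multi-scale search would cover if it started at minSize and grew by scaleFactor at each step. This is a simplified illustration under that assumption, not OpenCV's actual implementation, and window_sizes is a hypothetical helper.

```python
def window_sizes(image_size, min_size=60, scale_factor=1.1):
    """List the square window sizes a multi-scale search would visit."""
    sizes = []
    size = float(min_size)
    while size <= image_size:
        sizes.append(int(round(size)))
        size *= scale_factor
    return sizes


sizes = window_sizes(image_size=480)
print(len(sizes), sizes[:4])  # 22 [60, 66, 73, 80]

# A larger scale factor visits fewer sizes, so it is faster but coarser
print(len(window_sizes(image_size=480, scale_factor=1.3)))
```

This is the trade-off behind scaleFactor: values close to 1 examine many closely spaced sizes, while larger values skip sizes and may miss faces that fall between two steps.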
The variable faces now contains all the detections for the target image. Detections are saved as pixel coordinates. Each detection is defined by the coordinates of its top-left corner and the width and height of the rectangle that encompasses the detected face.
To show the detected face, we will draw a rectangle over it. OpenCV’s rectangle() draws rectangles over images, and it needs to know the pixel coordinates of the top-left and bottom-right corners. Each coordinate pair gives the column (x) and row (y) of a pixel in the image. We can easily get these coordinates from the variable faces.
for (x, y, w, h) in faces:
    cv2.rectangle(frame, (x, y), (x + w, y + h), (0, 255, 0), 2)
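The conversion from a detection (x, y, w, h) to the two corners that rectangle() expects can be sketched as a small helper. The name box_to_corners is hypothetical, and the clipping to the frame borders is an added precaution for illustration rather than something the original code does.

```python
def box_to_corners(box, frame_width, frame_height):
    """Convert (x, y, w, h) to clipped (top_left, bottom_right) pixel corners."""
    x, y, w, h = box
    x1, y1 = max(0, x), max(0, y)
    x2, y2 = min(frame_width - 1, x + w), min(frame_height - 1, y + h)
    return (x1, y1), (x2, y2)


print(box_to_corners((100, 50, 80, 80), 640, 480))   # ((100, 50), (180, 130))
print(box_to_corners((600, 440, 80, 80), 640, 480))  # ((600, 440), (639, 479))
```

The second call shows a box that would extend past the bottom-right edge of a 640x480 frame and is clipped to the last valid pixel.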
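rectangle() accepts the following arguments: the image to draw on, the top-left corner, the bottom-right corner, the color of the rectangle as a (B, G, R) tuple, and the line thickness in pixels.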
Next, we display the resulting frame and set up a way to exit the infinite loop and close the video feed. Pressing the ‘q’ key exits the script.
cv2.imshow('Video', frame)
if cv2.waitKey(1) & 0xFF == ord('q'):
    break
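The key check works by keeping only the lowest 8 bits of the value returned by waitKey() and comparing them with the character code of 'q'. The following sketch isolates that logic in a hypothetical helper, is_quit_key, so it can be tried without opening a window.

```python
def is_quit_key(key_code, quit_char="q"):
    """Return True if the low byte of key_code matches quit_char."""
    return (key_code & 0xFF) == ord(quit_char)


print(is_quit_key(ord("q")))             # True
print(is_quit_key(-1))                   # False: -1 means no key was pressed
print(is_quit_key(0x100000 | ord("q")))  # True: extra high bits are masked off
```

The mask matters because on some platforms the returned key code can carry extra bits above the low byte, which would otherwise make the comparison fail.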
The next two lines clean up: they release the webcam and close the display window.
video_capture.release()
cv2.destroyAllWindows()
Here is the full code.
import cv2
import os
cascPath = os.path.dirname(
    cv2.__file__) + "/data/haarcascade_frontalface_alt2.xml"
faceCascade = cv2.CascadeClassifier(cascPath)
video_capture = cv2.VideoCapture(0)
while True:
    # Capture frame-by-frame
    ret, frame = video_capture.read()
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    faces = faceCascade.detectMultiScale(gray,
                                         scaleFactor=1.1,
                                         minNeighbors=5,
                                         minSize=(60, 60),
                                         flags=cv2.CASCADE_SCALE_IMAGE)
    for (x, y, w, h) in faces:
        cv2.rectangle(frame, (x, y), (x + w, y + h), (0, 255, 0), 2)
    # Display the resulting frame
    cv2.imshow('Video', frame)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break
video_capture.release()
cv2.destroyAllWindows()
How to do Real-time detection?
We can modify the code from the first section so that each detected face is also passed to a model. We detect the faces in each frame using the method shown in the first section, preprocess them, and then pass them to our model.
import cv2
import os
from tensorflow.keras.preprocessing.image import img_to_array
from tensorflow.keras.models import load_model
from tensorflow.keras.applications.mobilenet_v2 import preprocess_input
import numpy as np
The first few lines are exactly the same as the first section. The only thing that is different is that we have assigned our pre-trained mask detector model to the variable model.
cascPath = os.path.dirname(
    cv2.__file__) + "/data/haarcascade_frontalface_alt2.xml"
faceCascade = cv2.CascadeClassifier(cascPath)
model = load_model("mask_recog1.h5")
video_capture = cv2.VideoCapture(0)
while True:
    # Capture frame-by-frame
    ret, frame = video_capture.read()
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    faces = faceCascade.detectMultiScale(gray,
                                         scaleFactor=1.1,
                                         minNeighbors=5,
                                         minSize=(60, 60),
                                         flags=cv2.CASCADE_SCALE_IMAGE)
Next, we define a list. The faces_list will hold all the faces detected by the faceCascade model.
faces_list=[]
Since the faces variable contains the top-left corner coordinates, height and width of the rectangle encompassing each face, we can use it to crop the face out of the frame and then preprocess that crop so that it can be fed into the model for prediction. The preprocessing steps are the same as those followed when training the model in the second section. For example, the model is trained on RGB images, so we convert the image to RGB here.
for (x, y, w, h) in faces:
    # Crop the face region and preprocess it the same way as the training images
    face_frame = frame[y:y+h, x:x+w]
    face_frame = cv2.cvtColor(face_frame, cv2.COLOR_BGR2RGB)
    face_frame = cv2.resize(face_frame, (224, 224))
    face_frame = img_to_array(face_frame)
    face_frame = np.expand_dims(face_frame, axis=0)  # add a batch dimension
    face_frame = preprocess_input(face_frame)
    faces_list.append(face_frame)
    # Predict on this face only; a Python list of arrays would be read as multiple model inputs
    preds = model.predict(face_frame)
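When several faces appear in one frame, an alternative to predicting face by face is to stack the crops into a single batch array. The sketch below shows the shapes involved using NumPy only. It assumes the MobileNetV2-style preprocessing that maps pixel values from [0, 255] to [-1, 1], and the helper names scale_to_unit_range and make_batch are hypothetical.

```python
import numpy as np


def scale_to_unit_range(face_rgb):
    """Map pixel values from [0, 255] to [-1, 1]."""
    return face_rgb.astype(np.float32) / 127.5 - 1.0


def make_batch(face_crops):
    """Stack equally sized face crops into one array of shape (N, H, W, 3)."""
    return np.stack([scale_to_unit_range(face) for face in face_crops], axis=0)


# Two dummy 224x224 RGB crops standing in for resized face regions
crops = [np.zeros((224, 224, 3), dtype=np.uint8),
         np.full((224, 224, 3), 255, dtype=np.uint8)]
batch = make_batch(crops)
print(batch.shape)  # (2, 224, 224, 3)
```

A batch of this shape could then be scored with one predict call, giving one row of predictions per face.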
After getting the predictions, we draw a rectangle over the face and put a label above it.
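# Assumption: the model returns two scores per face, in the order (mask, withoutMask)
(mask, withoutMask) = preds[0]
label = "Mask" if mask > withoutMask else "No Mask"
color = (0, 255, 0) if label == "Mask" else (0, 0, 255)
label = "{}: {:.2f}%".format(label, max(mask, withoutMask) * 100)
cv2.putText(frame, label, (x, y - 10),
            cv2.FONT_HERSHEY_SIMPLEX, 0.45, color, 2)
cv2.rectangle(frame, (x, y), (x + w, y + h), color, 2)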
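How a prediction could be turned into the label text and box color can be sketched as a small function. It assumes the model outputs two scores per face, one for "mask" and one for "no mask", which the article does not spell out; make_label and the class names are hypothetical.

```python
def make_label(mask_score, no_mask_score):
    """Return (text, BGR color) for the more likely class."""
    if mask_score > no_mask_score:
        name, color = "Mask", (0, 255, 0)      # green
    else:
        name, color = "No Mask", (0, 0, 255)   # red in BGR order
    text = "{}: {:.2f}%".format(name, max(mask_score, no_mask_score) * 100)
    return text, color


print(make_label(0.9, 0.1))    # ('Mask: 90.00%', (0, 255, 0))
print(make_label(0.25, 0.75))  # ('No Mask: 75.00%', (0, 0, 255))
```

The returned text and color can be passed directly to cv2.putText() and cv2.rectangle().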
The rest of the steps are the same as the first section.
cv2.imshow('Video', frame)
if cv2.waitKey(1) & 0xFF == ord('q'):
    break
video_capture.release()
cv2.destroyAllWindows()
This brings us to the end of this article, where we learned how to detect faces in real time.
Face detection on a movie character