Face Detector with VisionKit and SwiftUI

The algorithms are all provided out of the box by Apple's Vision and VisionKit Frameworks, and I have added the following capabilities to the sample app:

1. Detect and visualise the bounding box

2. Detect and visualise face landmarks

3. Determine image capture quality

4. Determine head position

What is a Bounding Box?

The bounding box is used when we want to track a person's entire face as a single object. We obtain it directly from the detection algorithm and can use it to visualise, for instance, a green box around the face.
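As a minimal sketch, and assuming the frame arrives as a CVPixelBuffer, the bounding boxes can be obtained with Vision's VNDetectFaceRectanglesRequest; the function name and the fixed orientation below are illustrative, not the sample app's actual API:

```swift
import Vision
import CoreGraphics

// Minimal sketch: run a face-rectangles request on a single frame and
// return the bounding boxes. Names and the fixed orientation are
// illustrative assumptions, not the sample app's API.
func detectFaceBoundingBoxes(in pixelBuffer: CVPixelBuffer) throws -> [CGRect] {
    let request = VNDetectFaceRectanglesRequest()
    // .leftMirrored is a common choice for the front camera in portrait.
    let handler = VNImageRequestHandler(cvPixelBuffer: pixelBuffer,
                                        orientation: .leftMirrored,
                                        options: [:])
    try handler.perform([request])

    // Each observation carries a bounding box in normalised image
    // coordinates (origin in the lower-left corner, values 0.0–1.0).
    return (request.results ?? []).map { $0.boundingBox }
}
```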

The example app does exactly this. You can adapt the bounding box's visual representation to suit your use case. Perhaps you need a different colour, or dashed rather than solid lines? With SwiftUI, and the way I have separated the detection algorithm from its visualisation, this is a simple task.
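As a rough illustration of that separation, the sketch below draws a single normalised bounding box as a SwiftUI overlay; the view name is hypothetical, and the colour or dash pattern can be changed in one place:

```swift
import SwiftUI

// Sketch of a SwiftUI overlay for one detected face. The view name is
// hypothetical; `boundingBox` is assumed to be in Vision's normalised
// coordinates (origin bottom-left), flipped here into SwiftUI space.
struct BoundingBoxOverlay: View {
    let boundingBox: CGRect

    var body: some View {
        GeometryReader { geometry in
            let size = geometry.size
            let rect = CGRect(x: boundingBox.minX * size.width,
                              y: (1 - boundingBox.maxY) * size.height,
                              width: boundingBox.width * size.width,
                              height: boundingBox.height * size.height)

            // Change the colour or add a dash pattern here to restyle the box.
            Rectangle()
                .path(in: rect)
                .stroke(Color.green, style: StrokeStyle(lineWidth: 2))
        }
    }
}
```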


How do Face Landmarks work?

Face landmarks let us focus on individual elements of the face rather than treating the entire face as a single unit. What we get back from the detection algorithms are sets of coordinates representing facial features such as the mouth, nose, eyes, and so on.
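Assuming the observation comes from a VNDetectFaceLandmarksRequest, a few of those regions could be read like this (the function name is illustrative):

```swift
import Vision

// Sketch: pull a few landmark regions out of a face observation produced
// by a VNDetectFaceLandmarksRequest. The function name is illustrative.
func landmarkPoints(for face: VNFaceObservation) -> [String: [CGPoint]] {
    guard let landmarks = face.landmarks else { return [:] }

    // Each region is a list of points normalised to the face's bounding
    // box. The API also exposes eyebrows, pupils, the face contour, etc.
    var regions: [String: [CGPoint]] = [:]
    regions["leftEye"] = landmarks.leftEye?.normalizedPoints
    regions["rightEye"] = landmarks.rightEye?.normalizedPoints
    regions["nose"] = landmarks.nose?.normalizedPoints
    regions["outerLips"] = landmarks.outerLips?.normalizedPoints
    return regions
}
```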


Capture Quality: What is it?

The capture quality metric is a single number between 0.0 and 1.0 that indicates how suitable a captured photo is for detection: the higher the value, the better the quality.

When you have several photos of the same subject, this is extremely helpful for choosing the best one for further processing.
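For example, a loop like the sketch below could score a handful of frames with VNDetectFaceCaptureQualityRequest and keep the highest-rated one; the function name and the idea of passing in pixel buffers are assumptions for illustration:

```swift
import Vision

// Sketch: rate a set of frames with VNDetectFaceCaptureQualityRequest and
// keep the one Vision scores highest. Names here are illustrative.
func bestFrame(from pixelBuffers: [CVPixelBuffer]) -> CVPixelBuffer? {
    var best: (buffer: CVPixelBuffer, quality: Float)?

    for buffer in pixelBuffers {
        let request = VNDetectFaceCaptureQualityRequest()
        let handler = VNImageRequestHandler(cvPixelBuffer: buffer, options: [:])
        try? handler.perform([request])

        // faceCaptureQuality lies between 0.0 and 1.0; higher is better.
        guard let quality = request.results?.first?.faceCaptureQuality else { continue }
        if quality > (best?.quality ?? 0) {
            best = (buffer, quality)
        }
    }
    return best?.buffer
}
```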

What variations of head positions are there?

Head position is an intriguing additional metric. In fact, we receive three values: one for roll, one for yaw, and one for the pitch of the head. The image below illustrates the difference between them.

[Image: roll, yaw, and pitch of the head]
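Assuming an iOS 15 or later deployment target (where the pitch property is available), the three angles can be read straight off the face observation; the helper below is purely illustrative:

```swift
import Vision

// Sketch: read the head-pose angles from a face observation. Roll, yaw
// and pitch are reported in radians as optional NSNumbers; pitch needs
// a recent request revision and iOS 15 or later.
func headPose(for face: VNFaceObservation) -> (roll: Double, yaw: Double, pitch: Double)? {
    guard let roll = face.roll?.doubleValue,
          let yaw = face.yaw?.doubleValue else { return nil }

    // Fall back to 0 when pitch is not reported.
    let pitch = face.pitch?.doubleValue ?? 0
    return (roll, yaw, pitch)
}
```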

Knowing what the sample app does in advance will help us understand how it is organised to make use of these capabilities.

App Architecture and Code Organization

The example app includes three major processing steps:

1. Capturing an image sequence

2. Running the detection algorithms

3. Visualising the result

In an application built around the "Massive View Controller" pattern, nearly all of these stages are carried out by a single large class. This tends to happen to UIViewControllers in UIKit projects, hence the name, and I have been guilty of it myself.

The primary objective of this project was to create a simple and concise code structure by separating the concerns of capture, detection, and visualisation, and linking them with a pipeline mechanism.
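One way such a pipeline could be wired is sketched below using Combine; the class name and the choice of Combine are my assumptions for illustration, not necessarily what the sample app does:

```swift
import Combine
import Vision

// Hypothetical sketch of the capture → detection → visualisation pipeline,
// wired with Combine. The sample app's actual mechanism may differ.
final class FaceDetectionPipeline: ObservableObject {
    // Published so a SwiftUI view can redraw whenever new faces arrive.
    @Published var faces: [VNFaceObservation] = []

    // The capture stage feeds frames into this subject.
    let frames = PassthroughSubject<CVPixelBuffer, Never>()

    init() {
        frames
            .receive(on: DispatchQueue.global(qos: .userInitiated))
            .compactMap { buffer -> [VNFaceObservation]? in
                // Detection stage: run Vision on each incoming frame.
                let request = VNDetectFaceRectanglesRequest()
                let handler = VNImageRequestHandler(cvPixelBuffer: buffer, options: [:])
                try? handler.perform([request])
                return request.results
            }
            .receive(on: DispatchQueue.main)
            .assign(to: &$faces)
    }
}
```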

The Xcode project is structured as follows:


FaceDetectorApp: The entry point of our application; it holds the application delegate, in which the various functional components of the app are created and wired together.
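For illustration, and assuming the standard SwiftUI app lifecycle, the entry point might look roughly like this (the AppDelegate type and its contents are placeholders):

```swift
import SwiftUI
import UIKit

// Rough sketch of a SwiftUI entry point that hangs on to a UIKit-style
// application delegate. The AppDelegate contents are placeholders.
@main
struct FaceDetectorApp: App {
    @UIApplicationDelegateAdaptor(AppDelegate.self) var appDelegate

    var body: some Scene {
        WindowGroup {
            ContentView()
        }
    }
}

final class AppDelegate: NSObject, UIApplicationDelegate {
    func application(_ application: UIApplication,
                     didFinishLaunchingWithOptions launchOptions: [UIApplication.LaunchOptionsKey: Any]? = nil) -> Bool {
        // Create the capture session, face detector and views here and
        // wire them together before the first scene appears.
        return true
    }
}
```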

ContentView: The top-most view for the application.

CameraView: SwiftUI does not yet provide a native camera view, so we need this helper class to wrap the UIKit-native video preview layer.
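A common way to build such a wrapper, sketched here under the assumption that the AVCaptureSession is created and owned elsewhere:

```swift
import SwiftUI
import AVFoundation

// Sketch of a UIViewRepresentable that hosts AVCaptureVideoPreviewLayer,
// assuming the AVCaptureSession is created and owned elsewhere.
struct CameraView: UIViewRepresentable {
    let session: AVCaptureSession

    func makeUIView(context: Context) -> PreviewView {
        let view = PreviewView()
        view.videoPreviewLayer.session = session
        view.videoPreviewLayer.videoGravity = .resizeAspectFill
        return view
    }

    func updateUIView(_ uiView: PreviewView, context: Context) {}

    // UIView whose backing layer is the preview layer itself.
    final class PreviewView: UIView {
        override class var layerClass: AnyClass { AVCaptureVideoPreviewLayer.self }
        var videoPreviewLayer: AVCaptureVideoPreviewLayer { layer as! AVCaptureVideoPreviewLayer }
    }
}
```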

CaptureSession: As the name implies, this class is responsible for capturing the video stream or image sequence.
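A minimal sketch of what such a class can look like, assuming the front camera and a video-data output that forwards frames to the detector (the internals are my guesses, not the sample app's code):

```swift
import AVFoundation

// Minimal capture-session sketch: front camera in, video frames out.
// The internals are assumptions, not the sample app's actual code.
final class CaptureSession: NSObject, AVCaptureVideoDataOutputSampleBufferDelegate {
    let session = AVCaptureSession()
    private let output = AVCaptureVideoDataOutput()
    private let queue = DispatchQueue(label: "capture.session.queue")

    func start() {
        session.beginConfiguration()
        guard let camera = AVCaptureDevice.default(.builtInWideAngleCamera,
                                                   for: .video,
                                                   position: .front),
              let input = try? AVCaptureDeviceInput(device: camera),
              session.canAddInput(input),
              session.canAddOutput(output) else {
            session.commitConfiguration()
            return
        }
        session.addInput(input)
        output.setSampleBufferDelegate(self, queue: queue)
        session.addOutput(output)
        session.commitConfiguration()
        session.startRunning()
    }

    func captureOutput(_ output: AVCaptureOutput,
                       didOutput sampleBuffer: CMSampleBuffer,
                       from connection: AVCaptureConnection) {
        // Hand each frame (the sample buffer's pixel buffer) to the face detector.
    }
}
```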

FaceDetector: This is where the magic happens! This class invokes all of the detection algorithms.
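As a rough idea of what invoking all of the algorithms in a single pass can look like (the method shape is an assumption, not the sample app's exact implementation):

```swift
import Vision

// Rough sketch: run all of the face requests on one frame in a single
// pass. The method shape is an assumption, not the sample app's code.
final class FaceDetector {
    func detect(in pixelBuffer: CVPixelBuffer) -> [VNFaceObservation] {
        // Landmarks and capture quality need their own requests; the
        // rectangles request also reports roll, yaw and pitch.
        let requests: [VNRequest] = [
            VNDetectFaceRectanglesRequest(),
            VNDetectFaceLandmarksRequest(),
            VNDetectFaceCaptureQualityRequest()
        ]
        let handler = VNImageRequestHandler(cvPixelBuffer: pixelBuffer, options: [:])
        try? handler.perform(requests)

        // Collect whatever observations came back from each request.
        return requests.flatMap { ($0.results as? [VNFaceObservation]) ?? [] }
    }
}
```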

AVCaptureVideoOrientation: UIDeviceOrientation values must be converted to AVCaptureVideoOrientation for the visualisation to be oriented correctly; this file handles that conversion.
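The mapping itself is straightforward; a sketch of such a conversion (the initialiser name is my choice) could look like this:

```swift
import AVFoundation
import UIKit

// Sketch of the orientation mapping. Note that the device's landscapeLeft
// corresponds to the video connection's landscapeRight, and vice versa.
extension AVCaptureVideoOrientation {
    init?(deviceOrientation: UIDeviceOrientation) {
        switch deviceOrientation {
        case .portrait:           self = .portrait
        case .portraitUpsideDown: self = .portraitUpsideDown
        case .landscapeLeft:      self = .landscapeRight
        case .landscapeRight:     self = .landscapeLeft
        default:                  return nil
        }
    }
}
```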

Visit Us: corp.infogen-labs.com

Social Media: Instagram | Facebook | LinkedIn | YouTube | Twitter
