Face Detector with VisionKit and SwiftUI

The algorithms are all provided out of the box by Apple's Vision and VisionKit Frameworks, and I have added the following capabilities to the sample app:

1. Detect and visualise the bounding box

2. Detect and visualise face landmarks

3. Determine image capture quality

4. Determine head position

What is a Bounding Box?

The bounding box is used when we want to track a person's entire face as a single object. We obtain it directly from the detection algorithm and can use it to visualise, for instance, a green box around the face.
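As a minimal sketch, and assuming the frame arrives as a CVPixelBuffer, the bounding boxes can be obtained with Vision's VNDetectFaceRectanglesRequest; the function name and the fixed orientation below are illustrative, not the sample app's actual API:

```swift
import Vision
import CoreGraphics

// Minimal sketch: run a face-rectangles request on a single frame and
// return the bounding boxes. Names and the fixed orientation are
// illustrative assumptions, not the sample app's API.
func detectFaceBoundingBoxes(in pixelBuffer: CVPixelBuffer) throws -> [CGRect] {
    let request = VNDetectFaceRectanglesRequest()
    // .leftMirrored is a common choice for the front camera in portrait.
    let handler = VNImageRequestHandler(cvPixelBuffer: pixelBuffer,
                                        orientation: .leftMirrored,
                                        options: [:])
    try handler.perform([request])

    // Each observation carries a bounding box in normalised image
    // coordinates (origin in the lower-left corner, values 0.0–1.0).
    return (request.results ?? []).map { $0.boundingBox }
}
```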

The example app does exactly this. You can adapt the bounding box's visual representation to suit your use case. Perhaps you need a different colour, or dashed rather than solid lines? With SwiftUI, and the way I have separated the detection algorithm from its visualisation, this is a simple task.
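As a rough illustration of that separation, the sketch below draws a single normalised bounding box as a SwiftUI overlay; the view name is hypothetical, and the colour or dash pattern can be changed in one place:

```swift
import SwiftUI

// Sketch of a SwiftUI overlay for one detected face. The view name is
// hypothetical; `boundingBox` is assumed to be in Vision's normalised
// coordinates (origin bottom-left), flipped here into SwiftUI space.
struct BoundingBoxOverlay: View {
    let boundingBox: CGRect

    var body: some View {
        GeometryReader { geometry in
            let size = geometry.size
            let rect = CGRect(x: boundingBox.minX * size.width,
                              y: (1 - boundingBox.maxY) * size.height,
                              width: boundingBox.width * size.width,
                              height: boundingBox.height * size.height)

            // Change the colour or add a dash pattern here to restyle the box.
            Rectangle()
                .path(in: rect)
                .stroke(Color.green, style: StrokeStyle(lineWidth: 2))
        }
    }
}
```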


How do Face Landmarks work?

Face landmarks let us focus on individual elements of the face rather than treating the entire face as a single unit. What we get back from the detection algorithms are sets of coordinates representing facial features such as the mouth, nose, eyes, and so on.
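Assuming the observation comes from a VNDetectFaceLandmarksRequest, a few of those regions could be read like this (the function name is illustrative):

```swift
import Vision

// Sketch: pull a few landmark regions out of a face observation produced
// by a VNDetectFaceLandmarksRequest. The function name is illustrative.
func landmarkPoints(for face: VNFaceObservation) -> [String: [CGPoint]] {
    guard let landmarks = face.landmarks else { return [:] }

    // Each region is a list of points normalised to the face's bounding
    // box. The API also exposes eyebrows, pupils, the face contour, etc.
    var regions: [String: [CGPoint]] = [:]
    regions["leftEye"] = landmarks.leftEye?.normalizedPoints
    regions["rightEye"] = landmarks.rightEye?.normalizedPoints
    regions["nose"] = landmarks.nose?.normalizedPoints
    regions["outerLips"] = landmarks.outerLips?.normalizedPoints
    return regions
}
```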


Capture Quality: What is it?

The capture quality metric is a single number between 0.0 and 1.0 that indicates how suitable a captured photo is for detection: the higher the value, the better the quality.

When you have several photos of the same subject, this is extremely helpful for choosing the best one for further processing.
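For example, a loop like the sketch below could score a handful of frames with VNDetectFaceCaptureQualityRequest and keep the highest-rated one; the function name and the idea of passing in pixel buffers are assumptions for illustration:

```swift
import Vision

// Sketch: rate a set of frames with VNDetectFaceCaptureQualityRequest and
// keep the one Vision scores highest. Names here are illustrative.
func bestFrame(from pixelBuffers: [CVPixelBuffer]) -> CVPixelBuffer? {
    var best: (buffer: CVPixelBuffer, quality: Float)?

    for buffer in pixelBuffers {
        let request = VNDetectFaceCaptureQualityRequest()
        let handler = VNImageRequestHandler(cvPixelBuffer: buffer, options: [:])
        try? handler.perform([request])

        // faceCaptureQuality lies between 0.0 and 1.0; higher is better.
        guard let quality = request.results?.first?.faceCaptureQuality else { continue }
        if quality > (best?.quality ?? 0) {
            best = (buffer, quality)
        }
    }
    return best?.buffer
}
```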

What variations of head positions are there?

Head position is an intriguing additional metric. In fact, we receive three values: one for roll, one for yaw, and one for the pitch of the head. The image below illustrates the difference between them.

[Image: roll, yaw, and pitch of the head]
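Assuming an iOS 15 or later deployment target (where the pitch property is available), the three angles can be read straight off the face observation; the helper below is purely illustrative:

```swift
import Vision

// Sketch: read the head-pose angles from a face observation. Roll, yaw
// and pitch are reported in radians as optional NSNumbers; pitch needs
// a recent request revision and iOS 15 or later.
func headPose(for face: VNFaceObservation) -> (roll: Double, yaw: Double, pitch: Double)? {
    guard let roll = face.roll?.doubleValue,
          let yaw = face.yaw?.doubleValue else { return nil }

    // Fall back to 0 when pitch is not reported.
    let pitch = face.pitch?.doubleValue ?? 0
    return (roll, yaw, pitch)
}
```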

Knowing what the sample app does in advance will help us understand how it is organised to make use of these capabilities.

App Architecture and Code Organization

The example app includes three major processing steps:

1. Capturing an image sequence

2. Running the detection algorithms

3. Visualising the result

In an application built around the "Massive View Controller" pattern, nearly all of these stages are carried out by a single large class. This tends to happen to UIViewControllers in UIKit projects, hence the name, and I have been guilty of it myself.

The primary objective of this project was to create a simple and concise code structure by separating the concerns of capture, detection, and visualisation, and linking them with a pipeline mechanism.
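One way such a pipeline could be wired is sketched below using Combine; the class name and the choice of Combine are my assumptions for illustration, not necessarily what the sample app does:

```swift
import Combine
import Vision

// Hypothetical sketch of the capture → detection → visualisation pipeline,
// wired with Combine. The sample app's actual mechanism may differ.
final class FaceDetectionPipeline: ObservableObject {
    // Published so a SwiftUI view can redraw whenever new faces arrive.
    @Published var faces: [VNFaceObservation] = []

    // The capture stage feeds frames into this subject.
    let frames = PassthroughSubject<CVPixelBuffer, Never>()

    init() {
        frames
            .receive(on: DispatchQueue.global(qos: .userInitiated))
            .compactMap { buffer -> [VNFaceObservation]? in
                // Detection stage: run Vision on each incoming frame.
                let request = VNDetectFaceRectanglesRequest()
                let handler = VNImageRequestHandler(cvPixelBuffer: buffer, options: [:])
                try? handler.perform([request])
                return request.results
            }
            .receive(on: DispatchQueue.main)
            .assign(to: &$faces)
    }
}
```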

The Xcode project is structured as follows:


FaceDetectorApp: The entry point of our application; it holds the application delegate, in which the various functional components of the app are created and wired together.
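For illustration, and assuming the standard SwiftUI app lifecycle, the entry point might look roughly like this (the AppDelegate type and its contents are placeholders):

```swift
import SwiftUI
import UIKit

// Rough sketch of a SwiftUI entry point that hangs on to a UIKit-style
// application delegate. The AppDelegate contents are placeholders.
@main
struct FaceDetectorApp: App {
    @UIApplicationDelegateAdaptor(AppDelegate.self) var appDelegate

    var body: some Scene {
        WindowGroup {
            ContentView()
        }
    }
}

final class AppDelegate: NSObject, UIApplicationDelegate {
    func application(_ application: UIApplication,
                     didFinishLaunchingWithOptions launchOptions: [UIApplication.LaunchOptionsKey: Any]? = nil) -> Bool {
        // Create the capture session, face detector and views here and
        // wire them together before the first scene appears.
        return true
    }
}
```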

ContentView: The top-most view for the application.

CameraView: SwiftUI does not yet provide a native camera view, so we need this helper class to wrap the UIKit-native video preview layer.
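A common way to build such a wrapper, sketched here under the assumption that the AVCaptureSession is created and owned elsewhere:

```swift
import SwiftUI
import AVFoundation

// Sketch of a UIViewRepresentable that hosts AVCaptureVideoPreviewLayer,
// assuming the AVCaptureSession is created and owned elsewhere.
struct CameraView: UIViewRepresentable {
    let session: AVCaptureSession

    func makeUIView(context: Context) -> PreviewView {
        let view = PreviewView()
        view.videoPreviewLayer.session = session
        view.videoPreviewLayer.videoGravity = .resizeAspectFill
        return view
    }

    func updateUIView(_ uiView: PreviewView, context: Context) {}

    // UIView whose backing layer is the preview layer itself.
    final class PreviewView: UIView {
        override class var layerClass: AnyClass { AVCaptureVideoPreviewLayer.self }
        var videoPreviewLayer: AVCaptureVideoPreviewLayer { layer as! AVCaptureVideoPreviewLayer }
    }
}
```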

CaptureSession: As the name implies, this class is responsible for capturing the video stream or image sequence.
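A minimal sketch of what such a class can look like, assuming the front camera and a video-data output that forwards frames to the detector (the internals are my guesses, not the sample app's code):

```swift
import AVFoundation

// Minimal capture-session sketch: front camera in, video frames out.
// The internals are assumptions, not the sample app's actual code.
final class CaptureSession: NSObject, AVCaptureVideoDataOutputSampleBufferDelegate {
    let session = AVCaptureSession()
    private let output = AVCaptureVideoDataOutput()
    private let queue = DispatchQueue(label: "capture.session.queue")

    func start() {
        session.beginConfiguration()
        guard let camera = AVCaptureDevice.default(.builtInWideAngleCamera,
                                                   for: .video,
                                                   position: .front),
              let input = try? AVCaptureDeviceInput(device: camera),
              session.canAddInput(input),
              session.canAddOutput(output) else {
            session.commitConfiguration()
            return
        }
        session.addInput(input)
        output.setSampleBufferDelegate(self, queue: queue)
        session.addOutput(output)
        session.commitConfiguration()
        session.startRunning()
    }

    func captureOutput(_ output: AVCaptureOutput,
                       didOutput sampleBuffer: CMSampleBuffer,
                       from connection: AVCaptureConnection) {
        // Hand each frame (the sample buffer's pixel buffer) to the face detector.
    }
}
```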

FaceDetector: This is where the magic happens! This class invokes all of the detection algorithms.
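As a rough idea of what invoking all of the algorithms in a single pass can look like (the method shape is an assumption, not the sample app's exact implementation):

```swift
import Vision

// Rough sketch: run all of the face requests on one frame in a single
// pass. The method shape is an assumption, not the sample app's code.
final class FaceDetector {
    func detect(in pixelBuffer: CVPixelBuffer) -> [VNFaceObservation] {
        // Landmarks and capture quality need their own requests; the
        // rectangles request also reports roll, yaw and pitch.
        let requests: [VNRequest] = [
            VNDetectFaceRectanglesRequest(),
            VNDetectFaceLandmarksRequest(),
            VNDetectFaceCaptureQualityRequest()
        ]
        let handler = VNImageRequestHandler(cvPixelBuffer: pixelBuffer, options: [:])
        try? handler.perform(requests)

        // Collect whatever observations came back from each request.
        return requests.flatMap { ($0.results as? [VNFaceObservation]) ?? [] }
    }
}
```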

AVCaptureVideoOrientation: UIDeviceOrientation values must be converted to AVCaptureVideoOrientation for the visualisation to be oriented correctly; this file handles that conversion.
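The mapping itself is straightforward; a sketch of such a conversion (the initialiser name is my choice) could look like this:

```swift
import AVFoundation
import UIKit

// Sketch of the orientation mapping. Note that the device's landscapeLeft
// corresponds to the video connection's landscapeRight, and vice versa.
extension AVCaptureVideoOrientation {
    init?(deviceOrientation: UIDeviceOrientation) {
        switch deviceOrientation {
        case .portrait:           self = .portrait
        case .portraitUpsideDown: self = .portraitUpsideDown
        case .landscapeLeft:      self = .landscapeRight
        case .landscapeRight:     self = .landscapeLeft
        default:                  return nil
        }
    }
}
```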

Visit Us: corp.infogen-labs.com

Social Media: Instagram | Facebook | LinkedIn | YouTube | Twitter
