Navigating the Future: Spatial AR Development with AI for Object Labeling and Placement - Part 2
Preston McCauley
Director of Emerging Technologies at Tonic3 | Executive Leader in AI, AR/VR, & UX | Driving Innovative #AI Solutions / XR Labs -Teacher, Speaker, Educator - expertise in agentic solutions like #crewai #ML
Welcome to part two of my three-part article on prototyping with AR and AI. I encourage you to go back and read part one to understand the entire process.
To begin, we are going to break down the approaches we used to move the POC 1 validations through the next gateway.
With the help of my brilliant developer, we confidently moved forward to the next stage of our work. The second proof of concept in the CIC approach builds on rapid design and development cycles to further validate the techniques, tools, and methods and to surface issues. These are broken down into quadrants.
This stage was initially focused on identifying and categorizing an image, which was a micro goal. I had gained significant practical experience with spatial devices, which allowed me to map out a potential flow that could quickly lead us to the spatial classification phase.
Though working around privacy concerns presented challenges, we remained undaunted and continued our concept. After all, we could only plan based on what we knew and were determined to succeed.
The first micro objective was establishing a working spatial Unity project to understand the spatial map and map the room planes (light work on a HoloLens and with ARKit). However, it was still challenging on the Meta Quest.
Leveraging my previous spatial work experience, I knew how to accomplish this goal. Working with my fabulous UX designer, we discussed this spatial mapping functionality and drafted some highly conceptual UI mocks that we could use to build the proof of concept (POC) out further. As you can see, the interface is meant to be extremely minimal, because we needed to keep the spatial visual areas manageable; if we didn't, it would cause issues with the identification process and clutter the camera scene. You always want to manage your holographic visual data like any other code structure. This meant we wanted to be able to toggle all of the UI, and potentially any existing labeled objects, between visible and non-visible.
While I realized the end product would likely not achieve the level of polish shown in the mocks, they provided a helpful north star for the experience we wanted to aim toward while keeping an eye on how the UX would evolve.
In parallel, we were working on another quadrant of the iteration.
First, our developer and I worked to ensure we could run the spatial app experience. It didn't need to be perfect; it was enough to be functional on the Quest 3. This meant detecting the environment and room data, testing the camera, and taking the screenshot.
I was incredibly excited because the Quest 3's color pass-through made it the perfect candidate device. This advancement is a massive milestone in cost versus capability for more affordable AR pass-through devices.
The next thing to note was that, much like what I shared in article one for POC 1, we needed a method to "highlight" the environmental target areas. As previously mentioned, I would place shapes around my images in tools I built in Python and Figma and generate pseudo-highlights around objects; a rough sketch of that idea follows below.
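To give a flavor of that POC 1 tooling, here is a minimal Python sketch (using Pillow) of drawing pseudo-highlights over regions of an image. The file names and box coordinates are placeholder assumptions, not values from the actual tool.

```python
from PIL import Image, ImageDraw

# Hypothetical bounding boxes for objects of interest (left, top, right, bottom).
# In the real workflow these came from manually placed shapes, not from code.
HIGHLIGHT_BOXES = [
    (120, 200, 340, 420),
    (400, 180, 560, 360),
]

def draw_pseudo_highlights(src_path: str, dst_path: str) -> None:
    """Draw translucent highlight rectangles over target areas of an image."""
    base = Image.open(src_path).convert("RGBA")
    overlay = Image.new("RGBA", base.size, (0, 0, 0, 0))
    draw = ImageDraw.Draw(overlay)

    for box in HIGHLIGHT_BOXES:
        # Semi-transparent fill plus a solid outline, similar to a highlight shader.
        draw.rectangle(box, fill=(0, 255, 128, 60), outline=(0, 255, 128, 255), width=4)

    Image.alpha_composite(base, overlay).convert("RGB").save(dst_path)

if __name__ == "__main__":
    draw_pseudo_highlights("room_capture.jpg", "room_capture_highlighted.jpg")
```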
We needed a method to replicate this process and highlight objects in the physical space. The obvious choice was to create a resizable poly and give it a shader with the same look as in POC 1. This step required a few tests to correct the lighting after the Quest took a photo; if the image was too dark or the contrast was off, the AI could not interpret it. After a few attempts, we found a decent balance, as you can see in the video and images. We also had to play around with the permissions to determine what was possible with image capture.
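I won't reproduce our exact correction here, but conceptually the check looked something like this sketch: measure the capture's brightness and nudge it toward a usable range before handing it to the AI. The threshold values are illustrative assumptions, not our tuned numbers.

```python
from PIL import Image, ImageEnhance, ImageStat

# Illustrative thresholds only; the real values were found by trial and error.
MIN_BRIGHTNESS = 70     # mean luminance below this reads as "too dark" for the AI
TARGET_BRIGHTNESS = 110

def normalize_capture(src_path: str, dst_path: str) -> None:
    """Brighten a pass-through capture if it is too dark for reliable analysis."""
    img = Image.open(src_path).convert("RGB")
    mean_luma = ImageStat.Stat(img.convert("L")).mean[0]

    if mean_luma < MIN_BRIGHTNESS:
        factor = TARGET_BRIGHTNESS / max(mean_luma, 1.0)
        img = ImageEnhance.Brightness(img).enhance(factor)
        img = ImageEnhance.Contrast(img).enhance(1.1)  # small contrast lift

    img.save(dst_path)
```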
Image capture was also an essential CIC pass/fail gateway, and permissions made it challenging. Could we take a picture of the highlighted object, use the same method from POC 1, and verify that the image was saved from the device and loaded back into memory to be analyzed? Thankfully, the answer was yes.
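Since this article doesn't cover the specific vision service we called, the endpoint, key, and payload shape below are placeholders; the sketch only shows the general pattern of loading the saved capture back into memory and submitting it for analysis.

```python
import base64
import requests

# Placeholder endpoint and key: stand-ins for whatever multimodal API the POC called.
VISION_API_URL = "https://example.com/v1/vision/analyze"
API_KEY = "YOUR_KEY_HERE"

def analyze_capture(image_path: str, prompt: str) -> dict:
    """Load a saved headset capture back into memory and submit it for analysis."""
    with open(image_path, "rb") as f:
        image_b64 = base64.b64encode(f.read()).decode("utf-8")

    response = requests.post(
        VISION_API_URL,
        headers={"Authorization": f"Bearer {API_KEY}"},
        json={"prompt": prompt, "image_base64": image_b64},
        timeout=60,
    )
    response.raise_for_status()
    return response.json()
```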
I noticed a few things along the way that were important. Firstly, the pass-through camera's resolution was lower than I had expected. However, this was not an issue for our case since I didn't necessarily require high-quality images. I tested image fidelity in POC1 to find a baseline.
Despite these challenges, we had solid forward momentum, which was significant. But another challenge appeared, and this one required intense discussion.
From the start, I had worked a lot with multi-dimensional spatial anchors; I knew how to use them and even had patents on techniques with them. We would be in great shape if we could link the spatial ID to our pseudo-highlight, but how would we approach this?
Over a series of conversations, we broke the problem down. The prompt modeling had to change, but to what?
In addition, if multiple polys needed to be labeled in an image, how could we ensure that the highlighted image areas, which represented a one-to-many relationship, were linked to the right spatial property when the response came back from the API with the object data payload?
This conversation led us down a new path and to a decision, when my AI developer asked:
"What if we could parse the object systematically?"
We discussed a few approaches to how the AI might read the image. This meant examining the existing instructions to ensure the AI understood them, and writing new instructions so that it would analyze the photos systematically and repeatably.
I quickly returned to POC one and started revising my prompt structure to see whether we could find a reliable pattern for instructing the AI on how to analyze the image content. In a few hours, we had a new working revision.
After more discussion, the approach was to read the image consistently from left to right and assign the object highlights sequential IDs (1, 2, 3, and so on) in a JSON-like object. Of course, there are still further refinements to make; polys that overlap or touch near the edges can confuse the system.
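As a rough illustration of the idea, assuming the model is instructed to enumerate the highlighted regions left to right and return JSON keyed by those IDs, the client side can sort its own highlight polys the same way so the returned labels land on the right spatial anchors. The prompt text, field names, and data shapes here are illustrative, not our production prompt.

```python
# Illustrative instruction; the real prompt went through several revisions.
EXAMPLE_INSTRUCTION = (
    "Scan the highlighted regions strictly from left to right. "
    "Label them 1, 2, 3, ... and return JSON: "
    '[{"id": 1, "label": "...", "description": "..."}, ...]'
)

def assign_ids_left_to_right(highlights: list[dict]) -> dict[int, dict]:
    """Sort local highlight polys by their left edge so client IDs match the AI's IDs."""
    ordered = sorted(highlights, key=lambda h: h["bounds"][0])  # bounds = (x, y, w, h)
    return {i + 1: h for i, h in enumerate(ordered)}

def merge_labels(id_to_highlight: dict[int, dict], ai_objects: list[dict]) -> list[dict]:
    """Attach each returned label to the spatial anchor of the matching highlight."""
    merged = []
    for obj in ai_objects:
        highlight = id_to_highlight.get(obj["id"])
        if highlight is None:
            continue  # overlapping or edge-touching polys can still break the ordering
        merged.append({"anchor_id": highlight["anchor_id"], "label": obj["label"]})
    return merged
```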
We now had images, object highlighting, spatial identifiers, spatial vision-linked markers, and all the AI behind it working with a real camera. The last part was to enable the popup-like tooltip bubbles as the data came streaming back (like in the video). Using these steps together, we can identify and generate new tool possibilities that could be leveraged on-demand in spatial systems. So, where does this lead us next?
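The tooltip rendering itself lives on the Unity side, but the consuming loop is conceptually simple: surface each labeled object as soon as it arrives rather than waiting for the full set. The generator and display call below are stand-ins, not our actual pipeline.

```python
from typing import Iterator

def labeled_objects_stream() -> Iterator[dict]:
    """Stand-in for results arriving incrementally from the vision pipeline."""
    yield {"anchor_id": "anchor-001", "label": "coffee mug"}
    yield {"anchor_id": "anchor-002", "label": "monitor"}

def show_tooltip(anchor_id: str, label: str) -> None:
    """Placeholder for spawning a tooltip bubble at the anchored highlight."""
    print(f"[tooltip] {anchor_id}: {label}")

for item in labeled_objects_stream():
    # Each result is surfaced as soon as it arrives.
    show_tooltip(item["anchor_id"], item["label"])
```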
Please join me for my next article, where I'll share more about the CIC model (the foundation of an innovation model for prototyping in the age of AI), this particular project, and where we go from here!