Grounding the World Around Us — Inference on the Edge
Generated using Microsoft Designer


Original Post on Medium: Grounding the World Around Us — Inference on the Edge | by Sam Bobo | Jan, 2025 | Medium

Take a moment and look around the room you are in; notice the colors, sounds, people, and objects. They say a picture is worth a thousand words, correct? Well, try to compute the number of thoughts, stimuli, and pieces of information passing through the neurons of your brain during this exercise.

The Artificial Intelligence community is clamoring about the lack of information and data available to train the newest multi-billion-parameter models. On a completely orthogonal path resides an effort dedicated to improving model response "intelligence" by applying reinforcement learning or sophisticated prompting techniques to improve reasoning at the tradeoff of response time, call it "thinking time" if you will. After listening to a podcast recently about Meta's AR plans, the realization finally occurred to me and prompted this article: the physical world around us will serve as model grounding unique to our individual needs. Let me explain.

Go back to my earliest example of Augmented Reality and Artificial Intelligence: the language-learning application I described in Augmenting Reality with the Overlay of AI.

My concept fused the augmented reality-powered Google Glass with the software functionality of Google Translate. Note: this was a silver-level project, so the concept manifested as a clickthrough demonstration. The concept entailed a user wearing Google Glass at the top of a ski slope, looking out into the horizon. The wearer utters the phrase "Hey Google, quiz me in Spanish." Instantly, rectangular bubbles with question marks overlay ordinary concepts such as "snow," "tree," "mountain," "skis," and so on. The user points to an object within his or her view and utters "nieve" (in Spanish) to guess the Spanish term, getting instant feedback on whether the uttered word was correct or incorrect. Effectively, this was an early concept using machine translation and augmented reality to build a language-learning application.

Simply put, the visual input from the glasses' camera (ergo, your eyes) acted as grounding data for any inquiry posed to the underlying model. Simply uttering "how do you say this in Spanish?" could trigger visual recognition of "snow" and return "nieve." In the vision video produced for Microsoft's Build conference, the engineer working on a machine could ask questions generically and get responses grounded in the world around them.
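To make that loop concrete, here is a minimal sketch of the quiz interaction, assuming the object label arrives from the glasses' recognition pipeline and using a small hardcoded dictionary in place of a translation service; none of this reflects an actual Google Glass or Google Translate API.

```python
# Toy sketch of the language-learning loop described above. The visual
# recognition step and translation service are stubbed out for illustration.

# Hypothetical output of the glasses' object-recognition pipeline.
RECOGNIZED_LABEL = "snow"

# Minimal English -> Spanish lookup standing in for a translation service.
TRANSLATIONS = {
    "snow": "nieve",
    "tree": "árbol",
    "mountain": "montaña",
    "skis": "esquís",
}

def quiz(recognized_label: str, spoken_guess: str) -> str:
    """Compare the learner's spoken guess against the translation of the
    object they are pointing at, returning instant feedback."""
    expected = TRANSLATIONS.get(recognized_label)
    if expected is None:
        return "I don't know that object yet."
    if spoken_guess.strip().lower() == expected:
        return f"Correct! '{recognized_label}' is '{expected}'."
    return f"Not quite. '{recognized_label}' is '{expected}'."

print(quiz(RECOGNIZED_LABEL, "nieve"))  # Correct! 'snow' is 'nieve'.
```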

Let's get technical for a moment. For reference, grounding in the AI world refers to information passed to the model as truth. Historically, this was done during the supervised machine learning task of building a model; in the era of foundation models, however, the concept of grounding has shifted to supportive information injected at runtime. Commonly, within Retrieval Augmented Generation ("RAG") patterns, the model will reference grounding information, perform the task at hand, and return the result. One common example is asking a question of a Generative AI bot, which references an FAQ webpage and returns a response summarizing the results. The same goes for asking questions against, say, a contract.
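As a rough illustration of that RAG pattern, the sketch below retrieves the most relevant FAQ passages with a naive keyword overlap and injects them into a prompt as grounding. The passages, the scoring, and the stubbed model call are assumptions made for illustration, not any particular provider's SDK.

```python
# Minimal sketch of a RAG pattern: retrieve relevant grounding passages,
# inject them into the prompt, and hand the assembled prompt to a model.
# The model call itself is left as a stub since no provider is specified.

FAQ_PASSAGES = [
    "Refunds are processed within 5 business days of approval.",
    "Support is available Monday through Friday, 9am to 5pm ET.",
    "Enterprise plans include a dedicated account manager.",
]

def retrieve(question: str, passages: list[str], top_k: int = 2) -> list[str]:
    """Rank passages by naive keyword overlap with the question."""
    q_words = set(question.lower().split())
    scored = sorted(
        passages,
        key=lambda p: len(q_words & set(p.lower().split())),
        reverse=True,
    )
    return scored[:top_k]

def build_prompt(question: str, grounding: list[str]) -> str:
    """Assemble a prompt that treats the retrieved passages as truth."""
    context = "\n".join(f"- {g}" for g in grounding)
    return (
        "Answer using only the grounding information below.\n"
        f"Grounding:\n{context}\n\n"
        f"Question: {question}"
    )

question = "How long do refunds take?"
prompt = build_prompt(question, retrieve(question, FAQ_PASSAGES))
print(prompt)  # In practice this prompt would be sent to a foundation model.
```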

This trend of real-time information capture for natural language-based models started recently, and only now have I started to piece the information together. Let's explore a few scenarios:

· Microsoft Recall — Recall is a feature available on Copilot+ PCs whereby one can ask the native Copilot model on the machine about anything (emphasis on anything) that happened on the computer and get a response. For example, asking questions about a tech strategy article from a known analyst visited in the morning, or the marketecture diagram discussed with your team during an online meeting. This is done by taking frequent snapshots of your screen and stitching together a "timeline" of events to pass as grounding into the model when a question is asked.

· Pins — A new market is emerging for AI-backed pins. These pins are basic hardware that performs one basic function: recording audio, either constantly or between wake and sleep words, and streaming it to (hopefully secure) cloud storage. Thereafter, within the corresponding application, one can ask questions against anything that transpired during the day and was captured by the pin. For example, if you want to recall a specific point from a conversation you and your brother had earlier in the day, you can effectively pull that information back, using the recordings as grounding (a generic sketch of this capture-then-ground pattern follows below).
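To illustrate the pattern shared by both scenarios, here is a generic sketch: timestamped captures from an edge device (OCR'd screenshots or audio transcript segments) are stored in a timeline, and the most relevant ones are pulled back as grounding when a question is asked. The data model and matching logic are invented for illustration and do not reflect how Recall or any pin product is actually implemented.

```python
# Generic capture-then-ground pattern: store timestamped captures from an
# edge device, then retrieve the relevant ones as grounding for a question.

from dataclasses import dataclass
from datetime import datetime

@dataclass
class Capture:
    timestamp: datetime
    text: str  # OCR of a screenshot, or an audio transcript segment

TIMELINE: list[Capture] = [
    Capture(datetime(2025, 1, 15, 9, 5), "Analyst article: 2025 tech strategy shifts toward edge AI."),
    Capture(datetime(2025, 1, 15, 11, 30), "Team meeting: reviewed marketecture diagram for the platform."),
    Capture(datetime(2025, 1, 15, 13, 0), "Call with brother: agreed to meet Saturday at noon."),
]

def ground_question(question: str, timeline: list[Capture], top_k: int = 2) -> list[Capture]:
    """Return the captures that best match the question, to be injected
    into the model prompt as grounding."""
    q_words = set(question.lower().split())
    return sorted(
        timeline,
        key=lambda c: len(q_words & set(c.text.lower().split())),
        reverse=True,
    )[:top_k]

for capture in ground_question("What did the analyst say about tech strategy?", TIMELINE):
    print(capture.timestamp, capture.text)
```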

Created using Napkin.AI

Now layer on smart glasses that visually record the world around you constantly. Do you see the trend? The world is becoming grounding data layered on top of foundation models (or, presumably, specialty models in the case of the Microsoft video). Effectively, information captured on the edge is augmenting foundation models for personal use. One might even take this a step further and claim that the information being captured can serve as a tap into the constant flow of new information for training newer models. Specifically, as adoption of these IoT edge devices scales, so does the data.

That brings me to the final point in this post: hesitations around the adoption of AR and other edge computing devices, namely data privacy. While I have not read through all of the terms and conditions of these features and devices, one can only be skeptical that, in a world of scarcity (of data, at this point), tradeoffs might be made to acquire additional resources. This parallels scraping YouTube videos for video-generation models, or other information generally available on the web, and certainly reinforces the points made by data privacy activists. I simply encourage that (1) solution providers do not skirt the fundamentals of data privacy when faced with an ever-abundant flow of new data and (2) they are transparent about how that information is treated. Furthermore, customers should exercise caution when adopting these devices in the short term, until trust is built among the general populace.


Overall, marrying AI with augmented reality is a massive opportunity (I am personally experimenting with start-up ideas that combine these two technologies), and grounding models in the world around us makes the value they provide that much more immense. We should also ensure that this information is used responsibly and kept secure, to foster adoption and help bring our collective visions to reality.
