Personal AI — A 5-Layer Grounding Framework for Personal AI Models
Original post on Medium: "Personal AI — A 5-Layer Grounding Framework for Personal AI Models," Sam Bobo, June 2024.
Stepping into the Client Experience Center at IBM Watson’s HQ in NYC, I was presented with a long flatscreen spanning at least 30 feet in width. Suddenly, high-quality 2-dimensional human figures appeared, cascading down the screen, each representing a different career: doctor, lawyer, human resources, financial advisor, etc. One by one, we stepped into the shoes of these individuals, understanding the pain points of today and the futuristic, problem-solved universe of “tomorrow.” From a customer perspective, the empathy built for each of these personas resonated with at least someone in the audience enough to proceed with sales discussions.
The Watson Client Experience Center was an impressive AI-forward tour and a qualitative demonstration of the power of AI. These avatars represented real human beings with real stories, in a real world where AI could make a positive difference. During my time at IBM Watson, I had the opportunity to work with innovators tackling many of these problems, from streamlining cumbersome prior-art research for patents to responding to both positive and negative reviews in the travel industry. I have long shared that the power of AI is real, citing the classic example of “solving” breast cancer detection by feeding an image model hundreds of thousands of pictures annotated with indications of breast cancer and nearly the same number without. With Large Language Models now powering new modalities of conversation and a more conversational interface to backend systems, AI democratizes access to information and opens the aperture to new innovations.
In a recent conversation with a medical practitioner, I discovered a truly unbearable pain point: paperwork! The gentleman with whom I conversed was a highly regarded primary care doctor. He shared his passion for treating patients, making that human connection, and providing answers and remedies (as applicable). His primary complaint: paperwork, 2–3 hours a night! Two examples stood out. The first was patient visit summaries, typically handled by a nurse or a physician assistant, for which there are already highly regarded AI products in market, namely Nuance Dragon Ambient eXperience (DAX), now owned and managed by Microsoft. The second was renewing prescriptions, which, he continued, requires checking with the pharmacy to see if the prescription was picked up, reviewing prior history, and confirming eligibility for a refill. Immediately the notion of an Autonomous Agent with a Human-in-the-Loop (HITL) came to mind: in a futuristic example, the doctor could initiate the request for a prescription refill, the agent would perform the sourcing and checking of information, and the results would come back to the doctor to approve or reject.
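To make the human-in-the-loop pattern concrete, here is a minimal sketch in Python. The helper functions (check_pharmacy, fetch_history, check_eligibility) and the IDs are hypothetical stand-ins for the pharmacy and EHR systems an agent would actually query; the point is simply that the agent gathers the evidence while the doctor keeps the final approve-or-reject decision.

```python
# Minimal human-in-the-loop (HITL) sketch: the agent collects evidence,
# the doctor makes the call. All functions below are hypothetical stand-ins.
from dataclasses import dataclass

@dataclass
class RefillCheck:
    picked_up: bool      # was the last fill picked up at the pharmacy?
    prior_history: str   # relevant prescription history from the EHR
    eligible: bool       # is the patient eligible for a refill?

def check_pharmacy(prescription_id: str) -> bool:
    """Hypothetical pharmacy lookup."""
    return True

def fetch_history(patient_id: str) -> str:
    """Hypothetical EHR query for prior prescription history."""
    return "No adverse events; last refill 30 days ago."

def check_eligibility(patient_id: str, prescription_id: str) -> bool:
    """Hypothetical eligibility/insurance check."""
    return True

def refill_agent(patient_id: str, prescription_id: str) -> RefillCheck:
    """The agent only gathers information; it never submits the refill itself."""
    return RefillCheck(
        picked_up=check_pharmacy(prescription_id),
        prior_history=fetch_history(patient_id),
        eligible=check_eligibility(patient_id, prescription_id),
    )

if __name__ == "__main__":
    result = refill_agent("patient-123", "rx-456")
    print(result)
    decision = input("Approve refill? [y/n] ")  # the human-in-the-loop gate
    print("Refill submitted." if decision.lower() == "y" else "Refill rejected.")
```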
The second anecdote: education. Teachers, primarily in K-12 education, face a number of challenges including student absenteeism, funding, cheating, skill gaps, and more! Creating lesson plans tailored to a cohort of students of varying skills, learning modalities, and motivation is a pedagogical art, and the sheer quantity of work required to craft syllabi, detailed lesson plans, and assessments while abiding by school, local, state, and federal policies is not an easy task. Navigating these hurdles while maintaining a hyper-focus on the end goal of student learning outcomes can be quite daunting. New tools such as Khanmigo are available to teachers and students to aid in the learning process, as I referenced in “Education Updates at Major Developer Conferences.”
Lastly, for computer scientists and software engineers, breaking the flow of programming to write README files, set up code scaffolding, fix bugs, and manage code is time consuming. Furthermore, those entering the field of computer science, or looking to program to solve a problem, could find the modern app directory and file structure intimidating. Millions of developers and associated organizations trust Microsoft GitHub Copilot and Copilot Workspace to streamline development efforts and act as an expert pair programmer.
These three examples hit on the essential marketing claims of AI, namely that AI systems:
As my readers know, I am immensely passionate about the field of Artificial Intelligence and the impact the technology can have on society. The foundation exists to achieve such a vision, with autonomous agents and visual recognition (healthcare), content translation and generation (education), and coding assistance (engineering), to name a few, but society as a whole needs to shift its locus of obsession from Artificial General Intelligence (AGI) towards solving industry-specific needs. I will start to build a framework for optimizing AI practices to realize such a dream.
Traditionally, there were three layers to Artificial Intelligence Systems:
- Foundational layer: a basic understanding of the world (the general-purpose foundation model)
- Industry layer: institutional training on the knowledge of a given field
- Use case layer: occupation-, company-, or task-specific information
Today, I am proposing two additional layers to the model:
- Likeness layer: grounding in oneself
- Autonomy layer: the ability to take action on one's behalf as needed
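As a rough illustration, and only that, the sketch below stacks each layer's grounding context ahead of a user's task before it would be handed to a model. The layer descriptions and the build_grounded_prompt helper are assumptions for the sake of example, not a prescribed implementation.

```python
# Illustrative sketch: each layer contributes grounding context that is
# stacked on top of the general-purpose foundation model.
LAYERS = {
    "foundational": "General world knowledge (the base model itself).",
    "industry": "Institutional training for the field, e.g., primary care.",
    "use_case": "Task-specific knowledge, e.g., the prescription refill workflow.",
    "likeness": "The individual's own preferences, phrasing, and history.",
    "autonomy": "The actions the model may take, each gated by human approval.",
}

def build_grounded_prompt(task: str) -> str:
    """Stack layer context, most general first, ahead of the user's task."""
    context = "\n".join(f"[{name}] {detail}" for name, detail in LAYERS.items())
    return f"{context}\n\nTask: {task}"

if __name__ == "__main__":
    print(build_grounded_prompt("Draft tonight's patient visit summaries."))
```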
These five layers build the model expert that extends oneself into the “limitless” ether and helps us as humans achieve more, fulfilling that promise. The stack provides a basic understanding of the world (foundational layer), institutional training (industry layer), occupation-, company-, or task-specific information (use case layer), grounding in oneself (likeness layer), and the ability to take action as needed (autonomy layer). The true question is, how is this achieved? TRUST.
Trust is by far the largest battle to be won within AI. I recently referenced the massive security credentials required to participate in Artificial Intelligence solutions:
Where both companies thrive is in a new architectural pattern emerging to combat the trust obstacle and optimize for latency: the hybrid cloud. For both Microsoft and Apple, sensitive computing occurs on-device with a portfolio of machine learning models (for example, Microsoft's Phi models and Apple's proprietary models) to reduce latency (a critical factor in user experience) and maintain privacy (well… so long as the information is encrypted). Workloads that are much larger in nature, such as long document summaries and non-personal open-ended questions such as a search query, are outsourced to cloud-hosted and third-party models, OpenAI for both companies.
Both Microsoft and Apple had to earn the right to make AI more intrusive in one's personal life through security and trust, which is quite an undertaking given the rhetoric in society around data privacy and security and how AI plays into both. The article also mentions the hybrid cloud infrastructure, using small language models on device and large language models in the cloud: personal context (SLMs) is employed on device where needed, and general knowledge (LLMs) fills the gaps. (I would be remiss if I did not reference the Expert Leader Agent Model as my plea to focus on industry models.) In this framework, I think only the former, Small Language Models, are the correct model modality given the specific, targeted use case one would be tackling with a complete 5-layer AI solution. Small language models perform well, can be locally integrated into personal computing devices, and can handle more robust encryption with integrated players (e.g., Apple Silicon).
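A toy sketch of that routing decision might look like the following; the sensitivity heuristic and the model tiers are assumptions for illustration, not how Microsoft or Apple actually implement the split.

```python
# Illustrative on-device SLM vs. cloud LLM routing. The heuristic below is an
# assumption for illustration only.
SENSITIVE_HINTS = ("my calendar", "my messages", "my health", "my photos")

def is_sensitive(prompt: str) -> bool:
    """Very rough check for personal context that should stay on device."""
    return any(hint in prompt.lower() for hint in SENSITIVE_HINTS)

def route(prompt: str) -> str:
    """Pick a model tier: local SLM for personal/small work, cloud LLM otherwise."""
    if is_sensitive(prompt) or len(prompt) < 500:
        return "on-device SLM (personal context stays local, low latency)"
    return "cloud LLM (large summaries, open-ended general questions)"

if __name__ == "__main__":
    print(route("Summarize my messages from this morning."))
    print(route("Summarize this long report: " + "lorem ipsum " * 100))
```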
Finally, capitalizing on vertical integration, understanding one's actions and the tasks required to complete them is essential for the autonomy layer. Microsoft with Windows and Apple with iOS own the operating system layer and thus the GUI and underlying APIs that developers build upon. This is a tremendous opportunity to mimic Large Action Models (LAMs), coined by Rabbit and its R1 device, to understand how to invoke the systems, tasks, etc. required by the 5-layer AI solution to achieve that level of autonomy and replicate the steps the human would otherwise perform.
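To sketch what that could look like in practice, the snippet below replays a learned sequence of application-level actions on the user's behalf, with each step gated by human approval. The action names and the execute function are invented for illustration and do not correspond to any real operating system API.

```python
# Illustrative Large Action Model-style plan: a sequence of application-level
# steps replayed for the user. Action names and execute() are invented; they
# are not real OS APIs.
from typing import Callable

ActionPlan = list[tuple[str, dict]]

# A plan the model might derive from observing how the task is normally done.
FOLLOW_UP_PLAN: ActionPlan = [
    ("open_app", {"name": "Calendar"}),
    ("find_slot", {"duration_minutes": 30}),
    ("create_event", {"title": "Follow-up appointment"}),
]

def execute(action: str, args: dict) -> None:
    print(f"executing {action} with {args}")  # stand-in for a real GUI/API call

def run_plan(plan: ActionPlan, approve: Callable[[str], bool]) -> None:
    """Replay each step, but only after the human approves it."""
    for action, args in plan:
        if approve(f"{action} {args}"):
            execute(action, args)
        else:
            print(f"skipped {action}")

if __name__ == "__main__":
    run_plan(FOLLOW_UP_PLAN, approve=lambda step: input(f"Run '{step}'? [y/n] ") == "y")
```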
Imagine a futuristic world in which all 5 layers of the AI solution evolve in harmony:
Some day, we might be able to walk into the Watson Experience Center and see ourselves on that wall, this time with the problems solved and new challenges of progress to resolve!