Why the Large Action Model (LAM) Will Shape the Future of Tech
Steve Jobs believed that devices would someday become a "bicycle for the mind," but their effect on some of us is closer to that of smoking or junk food.
We're addicted to our screens; the average person spends 6 hours and 58 minutes per day on screens connected to the internet.
On average, people check their phones 58 times per day. And almost 52% of phone checks (30 per day) occur during work hours.
All of us have, at one point, questioned whether the device inside our pocket has made us more productive or lazier.
When Jesse Lyu got on stage to present rabbit inc.'s new device, he made a statement that resonated with me. He said, "Our smart devices have become the best way to kill time instead of saving it."
If you examine the list of the most downloaded applications for 2023, over half are designed with a singular purpose: to kill your time. Of course, these apps will claim otherwise.
But more recently, a host of new platforms have started leveraging artificial intelligence to make us more productive.
OpenAI might have opened Pandora's box as far as ease of finding information is concerned, but the Large Action Model (LAM) will finally close the lid on it.
What is LAM?
The Large Action Model (LAM) is designed to understand how humans interact with computer programs. Unlike previous approaches, LAM learns directly how different programs work and what users do in them, without needing text as an intermediate step.
The question I've asked myself is why a Large Action Model (LAM) is needed when we have made so many recent advances in natural language models and computer vision.
Neural language models have enabled machines to better understand and respond to human language. Speech recognition and synthesis technologies have also improved, making it possible to build machines that understand human intentions deeply and contextually in real time.
This progress has led to a new way of interacting with devices using spoken language rather than touch. It started with smart speakers and has expanded to AI chatbots and operating systems with natural language interfaces.
However, designing these devices poses challenges, such as the lack of application programming interfaces (APIs) from major service providers.
To overcome this, platforms like rabbit inc. use neuro-symbolic programming to learn user interactions directly without relying on rigid APIs.
The Large Action Model (LAM) aims to better understand human intentions expressed through actions on computers and, by extension, in the physical world. The emphasis is on learning and interpreting user actions rather than relying on predefined interfaces.
What Problem is LAM Trying to Solve?
The way people interact with computers is different from how they use natural language or vision. An application's interface is more structured than an image, yet more detailed and messier than a sentence or a paragraph.
Rabbit's Large Action Model (LAM) needed different qualities compared to a model that only understands language or vision.
For example, while it's fine for a smart chatbot to be creative, actions learned by LAM on applications should be very regular, simple, stable (not changing too much), and easy to explain. This approach aligns with Occam's razor, which suggests that simpler explanations are often better.
Let's consider a specific example of an action performed on a computer application. Imagine a user asking a photo editing app to improve a photo:
When the same request is put to a chatbot, it is expected to be creative and come up with a unique suggestion, just as the user asked.
The LAM-learned action, by contrast, is highly regular and minimalistic. Instead of improvising, it sticks to a straightforward, predictable step, such as adjusting brightness. This aligns with the idea that actions on applications learned by LAM should be regular, minimalistic, and stable.
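To make the contrast concrete, here is a minimal, purely illustrative Python sketch. The UIAction class, the hard-coded brightness value, and the canned chatbot reply are all hypothetical stand-ins rather than Rabbit's actual types or behavior; the point is only that the LAM-style output is a small, fixed, inspectable step, while the chatbot's output is open-ended text.

```python
from dataclasses import dataclass

# Hypothetical, simplified sketch: a LAM-style action is a small, fixed,
# inspectable step, in contrast to a chatbot's open-ended text response.

@dataclass(frozen=True)
class UIAction:
    """One deterministic step against an application's interface."""
    target: str      # the UI element the action operates on
    operation: str   # what to do to that element
    value: float     # the parameter of the operation

def chatbot_response(request: str) -> str:
    # A chatbot is free to be creative; its output is unconstrained prose.
    return ("You could warm up the tones, lift the shadows slightly, "
            "and add a touch of vignette for a cozy evening feel.")

def lam_action(request: str) -> UIAction:
    # A LAM-learned action is regular and minimalistic: the same request
    # maps to the same small, predictable step every time.
    return UIAction(target="brightness_slider", operation="set", value=0.7)

if __name__ == "__main__":
    request = "Make this photo look better"
    print(chatbot_response(request))  # open-ended suggestion
    print(lam_action(request))        # stable step, easy to inspect and replay
```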
Language models face challenges in understanding applications when presented with raw text.
Even the most advanced large language models, with their current tokenizers, struggle to fit a raw-text representation of an application within their context window.
In simpler terms, these models find it hard to fully grasp the content and structure of applications when they are in raw text format, like the HTML of a webpage.
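To see roughly why, consider a back-of-the-envelope sketch. The four-characters-per-token ratio is a common rule of thumb, and the page size and context-window size below are illustrative assumptions, not measurements of any particular model or site.

```python
# Rough illustration of why raw application markup strains an LLM's context
# window. The ~4 characters-per-token ratio and the page/window sizes below
# are assumptions for illustration only.

CHARS_PER_TOKEN = 4              # common rule of thumb for English-like text
PAGE_HTML_CHARS = 2_000_000      # a heavy web app's DOM can reach megabytes
CONTEXT_WINDOW_TOKENS = 128_000  # a generous window for a modern LLM

estimated_tokens = PAGE_HTML_CHARS / CHARS_PER_TOKEN
print(f"Estimated tokens for the raw page: {estimated_tokens:,.0f}")
print(f"Fits in a {CONTEXT_WINDOW_TOKENS:,}-token window: "
      f"{estimated_tokens <= CONTEXT_WINDOW_TOKENS}")
# ~500,000 tokens for one page: it does not fit, before the model has even
# begun to reason about the page's structure or the actions it affords.
```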
How Does the Large Action Model (LAM) Work?
LAM's way of modeling is based on imitation or learning by demonstration. It watches how a person uses an interface and aims to accurately replicate the process, even if the interface changes a bit.
Unlike a black-box model that outputs actions without control, LAM's approach is more transparent. Once it learns from a demonstration, it directly applies the learned routine to the target application without the need for continuous observation or adaptation.
This makes the process more understandable, and any technically trained person can inspect and understand the "recipe" or steps involved.
As LAM keeps learning from demonstrations, it builds a comprehensive understanding of every aspect of an application's interface. It essentially creates a "conceptual blueprint" of the underlying service provided by the application.
In simpler terms, LAM acts like a bridge, connecting users to the services offered by an application through its interface.
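As a rough illustration of the "recipe" mentioned above, here is a minimal Python sketch. The step names, the recorded routine, and the toy replay function are hypothetical stand-ins, not Rabbit's actual implementation; they only show how a demonstration can be stored as readable, symbolic steps and replayed against an application.

```python
from typing import Callable

# Minimal sketch of the "recipe" idea: a demonstration is recorded as a list
# of symbolic steps that a technically trained person can read, and later
# replayed against the target application. Everything here is illustrative.

Recipe = list[dict]

def record_demonstration() -> Recipe:
    # In practice these steps would be captured while watching a user;
    # here they are hard-coded for illustration.
    return [
        {"step": "open_app", "app": "music_player"},
        {"step": "search",   "query": "lo-fi beats"},
        {"step": "click",    "element": "first_result"},
        {"step": "press",    "element": "play_button"},
    ]

def replay(recipe: Recipe, perform: Callable[[dict], None]) -> None:
    """Apply the learned routine step by step to the target application."""
    for step in recipe:
        perform(step)

if __name__ == "__main__":
    # A stand-in "application driver" that just prints each step.
    replay(record_demonstration(), perform=lambda s: print("performing:", s))
```

Because the routine is just data, anyone can read it, audit it, and see exactly what will be performed before it runs, which is what makes the approach more transparent than a black box.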
Why the Large Action Model (LAM) May Shape the Future of Tech
We've all been sold on the promise of AI assistants and, more recently, AI workers, be it a physical device or a Chrome-based extension that can serve as your personal secretary. However, neither has lived up to our expectations.
When Humane made its announcement last year, it finally felt like a device with serious potential. However, the pricing put it out of reach for many of the people who were considering it.
The subscription on top was a complete bummer: you want us to pay a recurring fee for a device we're not even sure we actually need.
But the idea of an AI companion or AI worker is promising; if LAM can fill this void, it could disrupt the market. Large Action Models could perhaps be trained to perform complex tasks that span multiple platforms.
When Jesse Lyu was presenting the Rabbit R1, it reminded me of the presentation Dag Kittlaus gave at TechCrunch Disrupt seven years ago. Dag and his team were building Viv Labs after selling Siri to Apple.
However, Viv was later sold to Samsung Electronics and became Bixby. When Dag was asked why they decided to sell to Samsung, he replied, "They ship 500 million devices a year. You asked me onstage about our real goal, and I said ubiquity."
Of course, Samsung Electronics has spent the last few years improving Bixby, but it is nowhere near what Dag's team imagined it would become.
You can only hope that Jesse Lyu has learned from his experience of selling Raven Tech to Baidu, considering that fulfilling the promise of AI workers or AI assistants requires powerful hardware and software.
If you manage to fulfill that promise, you're possibly looking at a $14.77 billion market, maybe even larger, considering that the use cases keep evolving.
Message for the Reader: AI workers and assistants are pretty clever, but discerning human emotions is not their strong suit. Hence, I'm reaching out to you, the discerning reader, with a humble request: if you found this article delightful, consider giving it a thumbs up or sharing it.
Your human touch is the true test of its likability!