Web AI Monthly #24: 2025 special! Web AI powered Agents running client side, medical applications, and real time diffusion with WebNN

Enjoy the content? Help me help you by sharing it with your colleagues in your weekly team updates to get more eyes on great work in the Web AI field. Made something cool? Tag Jason Mayes so it can be featured in a future edition. We have subscribers ranging from decision makers (think C-level, VPs, and Directors) to folk on the frontlines using this stuff day to day (SWEs, web engineers, and researchers). You never know who may see your creations.

2025 - the year of AI Agents in Web AI - what will you make?

AI Agents are the topic of 2025, but what if they could work in a Chrome Extension, or client side, on any website to do useful work for you in a privacy preserving manner? Well, it happened: we finally have agentic behaviors in the browser running entirely locally!

Web 4.0: AI Agentic behaviors using Web AI - client side smarts for advanced user experiences

Engineers, creatives, designers - lend me your ears, and eyes, to check out my first tech demo of 2025: a #WebAI Agent running client side in the browser that's capable of controlling a flight search webpage to get the job done!

This is my gift to all of you to kick off 2025 and take our community into the realm of AI Agents and beyond. It's a starting point, not the end, so come join me on this adventure.

I want to share with you a vision for the future of the internet itself - a reality where AI agents and humans coexist in the browser on almost every website to make you more productive. I got tired of waiting for the future to arrive, so I made a prototype I think you're going to like...

This demo allows a user to interact with a fictional flights website in a natural way, using just their voice, and get useful work done:

Web AI Agents in action to allow a user to speak to a website to get work done!
Web AI Agents working in the browser powered by Google Gemma 2, 2B, by Jason Mayes 2025

This means that when you visit a Web AI enabled website like this, you do not need to learn a new UX for every site you go to. Instead you can just tell it what you want to do and it will do it for you, automatically asking you for any extra information it needs along the way.

How does it work?

Unlike demos by Anthropic that pretend to be a human using a mouse to click and interact with a GUI (which I believe is not the correct way to go about things at scale, even if very cool technically speaking), in this implementation the developer exposes function definitions to the agent, which it can call as needed to perform whatever tasks are asked of it.

This, I believe, is the best way forward today, as it puts the developer in charge of which functions an agent can call. This is important, as there may be legal reasons why you need to know whether you are being used by a human or an AI agent, e.g. accepting terms and conditions or making a payment. In this case the AI agent can do 90% of the work and then leave the user to click the final payment button or similar. Until we have better standards in place, I think this is a more responsible way to use AI agents at scale.
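To make the idea concrete, here is a minimal TypeScript sketch of a developer-controlled function registry. It is not the demo's actual code: the names `AgentTool`, `searchFlights`, `payForBooking`, and `requiresHuman`, as well as the JSON shape the model is assumed to reply with, are all hypothetical and chosen only for illustration.

```typescript
// Hypothetical sketch of function calling with a developer-owned allow list.
type ToolArgs = Record<string, string>;

interface AgentTool {
  description: string;    // shown to the model so it knows when to call the tool
  requiresHuman: boolean; // true for steps a person must confirm (e.g. payment)
  run: (args: ToolArgs) => string;
}

// The developer decides exactly which functions the agent may call.
const tools = new Map<string, AgentTool>([
  ["searchFlights", {
    description: "Search flights between two airports on a given date",
    requiresHuman: false,
    run: (args) => `Searching ${args.from} to ${args.to} on ${args.date}`,
  }],
  ["payForBooking", {
    description: "Pay for the selected flight",
    requiresHuman: true,
    run: () => "Payment submitted",
  }],
]);

interface DispatchResult {
  status: "done" | "needs_human" | "rejected";
  message: string;
}

// Assumes the model was prompted to answer with JSON such as
// {"name": "searchFlights", "args": {"from": "LHR", "to": "SFO", "date": "2025-03-01"}}.
function dispatch(modelOutput: string): DispatchResult {
  let parsed: unknown;
  try {
    parsed = JSON.parse(modelOutput);
  } catch {
    return { status: "rejected", message: "Model output was not valid JSON" };
  }
  if (typeof parsed !== "object" || parsed === null) {
    return { status: "rejected", message: "Expected a JSON object" };
  }
  const call = parsed as { name?: unknown; args?: unknown };
  const tool = typeof call.name === "string" ? tools.get(call.name) : undefined;
  if (!tool) {
    // Anything outside the registry is refused, however the model phrases it.
    return { status: "rejected", message: "Unknown function requested" };
  }
  if (tool.requiresHuman) {
    // The agent stops here and hands the final step back to the user.
    return { status: "needs_human", message: `Please confirm: ${tool.description}` };
  }
  const args = (typeof call.args === "object" && call.args !== null
    ? call.args
    : {}) as ToolArgs;
  return { status: "done", message: tool.run(args) };
}
```

The point of the design is that the model only ever proposes a call; the page's own code decides whether to run it, refuse it, or pause for a human.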

The demo is powered by Google's Gemma 2 Web AI model with 2 billion parameters, which I'm running using the MediaPipe Web LLM library. Kudos to Tyler Mullen who ported this newer model to the library and put it up on Kaggle to download. The performance of this new 2B model is roughly on par with the old 7B model from our Gemma 1 conversion, which is quite amazing in itself.

Model inference happens entirely locally on the client machine, in the web browser, using WebGPU via JavaScript. I'm sure you are curious to learn more, so I made a video for your viewing pleasure that walks you through the creation in under 5 minutes:

How can I try it?

Once you have learnt how to use it from the video above, try it for yourself on CodePen!

Please note: This is an experiment to show the potential of AI Agents that I made in just 4 days, entirely from a blank canvas, so it likely needs more polish for production use cases (maybe even fine-tuning the model on domain-specific knowledge to get even better results for other industries).

Also note: The download is 2.5GB, so be on solid WiFi. This will be less of an issue once I implement caching (thanks Thomas Steiner, PhD) or switch to using Chrome's built-in prompt APIs in the future.
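For readers wondering what caching a large model download could look like, below is a minimal cache-first sketch in TypeScript. It is an assumption-laden illustration, not the demo's implementation: `fetchModelWithCache`, `CacheLike`, `ResponseLike`, and `FetchLike` are hypothetical names, and the small interfaces exist only so the logic can run outside a browser. In a browser, the object returned by the Cache API's `caches.open()` and the global `fetch` have matching `match`, `put`, `ok`, and `clone` members.

```typescript
// Minimal structural types so the sketch does not depend on browser typings.
interface ResponseLike {
  ok: boolean;
  clone(): ResponseLike;
}

interface CacheLike {
  match(url: string): Promise<ResponseLike | undefined>;
  put(url: string, response: ResponseLike): Promise<void>;
}

type FetchLike = (url: string) => Promise<ResponseLike>;

// Cache-first loading: only hit the network when the model is not stored yet.
async function fetchModelWithCache(
  url: string,
  cache: CacheLike,
  fetchFn: FetchLike,
): Promise<ResponseLike> {
  const cached = await cache.match(url);
  if (cached) {
    return cached; // repeat visits skip the multi-gigabyte download
  }
  const response = await fetchFn(url);
  if (!response.ok) {
    throw new Error(`Download failed for ${url}`);
  }
  // Store a clone, because a response body can only be read once.
  await cache.put(url, response.clone());
  return response;
}

// Browser usage (illustrative; the cache name and MODEL_URL are placeholders):
//   const cache = await caches.open("web-ai-models-v1");
//   const response = await fetchModelWithCache(MODEL_URL, cache, fetch);
```

Browsers apply storage quotas, so a production version would also want to handle a failed `put` gracefully.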

I've tested that the model works even on older machines with integrated GPUs, but it will run much faster on a modern device (even if using an integrated GPU).

Naturally you need Chrome to run this, as it leverages WebGPU for acceleration and only Chromium browsers support this at the time of writing.
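If you fork the demo, a page can check for WebGPU before starting the large download. The TypeScript sketch below shows one way to do that, assuming the standard `navigator.gpu.requestAdapter()` entry point; `hasWebGpu`, `NavigatorLike`, and `GpuLike` are hypothetical names introduced here so the check can be exercised without a browser.

```typescript
// Minimal structural types standing in for the browser's Navigator and GPU objects.
interface GpuLike {
  requestAdapter(): Promise<object | null>;
}

interface NavigatorLike {
  gpu?: GpuLike;
}

// Resolves to true only when WebGPU is exposed AND an adapter is actually available.
async function hasWebGpu(nav: NavigatorLike): Promise<boolean> {
  if (!nav.gpu) {
    return false; // browser does not expose WebGPU at all
  }
  try {
    const adapter = await nav.gpu.requestAdapter();
    return adapter !== null; // null means no suitable GPU was found
  } catch {
    return false;
  }
}

// In a browser you would pass the global `navigator` and show a friendly
// message instead of starting the model download when this resolves to false.
```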

It's going to be a very interesting year ahead and I am excited to see what you all create by forking this (do tag me if you use it and make something cool).

Hone: A free web app that runs an LLM prompt across multiple rows of a spreadsheet

Continuing on the theme of doing useful work on the client side using an LLM, check out this demo by Erik Hermansen that can fill out data in a spreadsheet. It can fill in each row with an answer to any question that you ask. Even better, it's easy to use in conjunction with Excel, Google Sheets, or other spreadsheet software. Give the post a like here on LinkedIn.

Automatically fill out spreadsheet columns with Web AI

Try it yourself: https://decentapps.net/hone/

GitHub code: https://github.com/erikh2000/hone
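The general pattern of running one prompt across every row of a sheet can be sketched in a few lines of TypeScript. This is not Hone's code: `fillTemplate`, `fillColumn`, the `{column}` placeholder syntax, and the `Llm` function type are assumptions made for illustration, and the `llm` argument stands in for whatever client-side model you plug in.

```typescript
type Row = Record<string, string>;
type Llm = (prompt: string) => Promise<string>;

// Replace {column} placeholders with the row's values; unknown names are left as-is.
function fillTemplate(template: string, row: Row): string {
  return template.replace(/\{(\w+)\}/g, (match: string, column: string) =>
    row[column] ?? match);
}

// Ask the model once per row and write each answer into a new column.
async function fillColumn(
  rows: Row[],
  template: string,
  outputColumn: string,
  llm: Llm,
): Promise<Row[]> {
  const results: Row[] = [];
  for (const row of rows) {
    // Sequential on purpose: a local model typically serves one prompt at a time.
    const answer = await llm(fillTemplate(template, row));
    results.push({ ...row, [outputColumn]: answer.trim() });
  }
  return results;
}

// Usage (illustrative): with rows like [{ city: "Paris" }] and the template
// "Which country is {city} in? Answer with one word.", each returned row gains
// an extra column holding the model's reply.
```

Returning new row objects rather than mutating the input keeps the original sheet data intact if a run is cancelled halfway.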

Web AI for medical applications

The healthcare industry has been an early adopter for Web AI demos going back all the way to 2019, but recently I was made aware of a new demo by Yury Rusinovich who made a custom vision model in collaboration with Universitätsklinikum Leipzig AöR and Charité - Universitätsmedizin Berlin using TensorFlow.js to predict disease outcomes using pedal angiograms.

What's a pedal angiogram?

Glad you asked, because I also didn't know. A "pedal angiogram" is a medical imaging procedure that uses X-rays and contrast dye to visualize the arteries in the foot. The results look something like this:

An example of what a pedal angiogram looks like

Essentially, Yury trained a custom vision model to understand and classify this sort of data, predicting whether a limb is likely to be salvaged or to need amputation.

Why is this cool?

Well, for starters, it outperforms the traditional assessment by a vascular specialist, as quoted here:

"Computer vision can analyze angiograms and predict disease outcomes, demonstrating a significant correlation between predicted and actual limb salvage rates, outperforming IM GLASS segmentation by a vascular specialist. It has the potential to provide immediate and precise treatment results during vascular interventions, tailored to (inter)institutional expertise, and enhance individualized decision-making"

The model achieved a validation accuracy of 95% and a test accuracy of 93% in differentiating salvaged limbs from amputations. Really great work, and I hope to see Web AI leveraged more in this field in the year to come to make our medical professionals more efficient and accurate than ever before.

Learn more in the original post here.

Realtime diffusion models running via WebNN in browser

What a year it's been! As if the above updates were not mind-blowing enough, Eyal Gruss shared this very impressive demo running Stable Diffusion Turbo image-to-image live on his webcam in Chrome, entirely client side, at 3 FPS on a 3070 Ti laptop:

Real-time stable diffusion running via WebNN in browser locally!

While 3 FPS may not sound high, this is a huge leap forward from the ~2 seconds per frame (0.5 FPS) I have seen in the past. Maybe in 1 more year we will be at 30 FPS ;-)
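The arithmetic behind that comparison, as a quick TypeScript check (the variable names are illustrative only; the figures are the ones quoted above):

```typescript
// Frames per second is the inverse of the time taken per frame.
const fpsFromSecondsPerFrame = (secondsPerFrame: number): number => 1 / secondsPerFrame;

const previousFps = fpsFromSecondsPerFrame(2); // ~2 s per frame = 0.5 FPS
const currentFps = 3;                          // the WebNN demo on a 3070 Ti laptop

const speedup = currentFps / previousFps;      // 3 / 0.5 = 6x faster
const frameBudgetMs = 1000 / currentFps;       // about 333 ms available per frame
const neededFor30Fps = 30 / currentFps;        // a further 10x to reach 30 FPS
```

So the jump so far is roughly 6x, and reaching 30 FPS would take another 10x on top of that.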

Show some love on X: https://x.com/eyaler/status/1848106256860602530

GitHub code: https://github.com/eyaler/webnn-developer-preview/

A throwback to 2024: JS Nation

Finally, I will leave you with this video from 2024 to reflect on how far we have come in just 1 year since it was recorded, and on living up to my answer about making things deemed impossible in JS on my weekends to prove that they can be done. Huzzah!

Look forward to the year ahead - may your inferences be accurate and stay private with Web AI :-)

See you next time!

If you're new to this space and want to learn Web AI, you can get started fast with my free Google Developers course here (no background in AI needed, just a love for JavaScript and curiosity for AI - I will teach you from zero). Or get inspired through our growing collection of Web AI talks on YouTube or via my Show & Tell - I got you either way!

See you next time with even more great content, and please do tag me (Jason Mayes) if you make or find something for future editions - I need your help to find the latest and greatest news, lovely #WebAI community, as things are moving so fast!

Cheers!

Jason Mayes (that Web AI guy).

Jerome Etienne

Your social media 10x faster with ContentMagick.ai

1 month ago

This looks awesome, Jason Mayes! Excited to see what's in store for #WebAI this year. Love the focus on real-time applications, this is great! I can't wait to dive into the articles and learn more. Keep up the great work, everyone! #AI #JavaScript

Gamei Chin

CMO at TrollWall AI | MBA | AI for Good | Community Lead @ WomenWhoCode

1 month ago

Love seeing use cases of Web AI for medical applications here! We definitely need more use cases to demonstrate the ease of adaptation (especially with Web AI!) in this highly regulated field. Excited for more #WebAI in 2025

Great agents demo/video and overall edition Jason! Cheers to 2025 and another exciting year for #WebAI

Jason Mayes

Web AI Lead @Google 13+yrs. Agent / LLM whisperer. On-device Artificial Intelligence / Machine Learning using Chrome | TensorFlow.js | MediaPipe. Web Engineering + innovation

1 month ago

Ram Iyengar, Jeanine Banks, Timothy Jordan, Paul Kinlan, Parisa Tabriz, Paige Bailey, Tris Warkentin: the next Web AI newsletter is out, featuring AI Agents client side with Gemma 2 web, medical applications, and real time diffusion

Yury Rusinovich

Dr. med | MHBA

1 month ago

Thank you so much, Jason Mayes! Your recognition means a lot to me. The work you and the Web AI team are doing is truly inspiring, and it's an honor to be a part of this innovative community. Looking forward to seeing how Web AI continues to evolve and empower creators worldwide!
