Web ML Monthly #16: 1 Billion downloads, Jason's crystal ball Gen AI predictions for 2024, Adobe goes Web AI, and hello south east Asia <3
Jason Mayes
Web AI Lead @Google 13+yrs. Research & Machine Learning | On-device Artificial Intelligence | Chrome | TensorFlow.js | MediaPipe. Web Engineering + innovation
Ahoy Tensors! Happy holidays and hello 2024. If 2023 was anything to go by, this year is going to move faster than ever. On that note, things are moving so fast for me right now that I did not have time to write the last 2 newsletters, with so many work projects going on, so this one is going to be a bit of a quarterly round up! If you enjoy the content, please do share it with friends, colleagues, family - everyone is welcome. Let's go!
A review of 2023 - what a year for growth!
2023 was a year of exponential growth in this community (I feel like a proud father seeing his child grow up), with a huge push from folk even beyond Google's core Web ML teams (TensorFlow.js, Chrome, MediaPipe Web). Companies like Hugging Face (Transformers.js and HuggingFace.js) and Microsoft (ONNX Runtime Web) are really embracing Web AI and investing in the web space, giving PyTorch users a way forward to embrace client side deployments in the browser too.
In fact looking at publicly available usage we can see healthy upticks in WEEKLY usage across the board in 2023 which is great for continued innovation in the space:
TensorFlow.js weekly usage:
ONNX Runtime Web weekly usage:
Transformers.js weekly usage:
Pretty impressive! One common theme between all of these graphs is that they all look set to keep growing in 2024. Nice job team JS.
I'm also pleased to announce that here at Google, across TensorFlow.js and MediaPipe Web solutions, we crossed 1 BILLION cumulative downloads of libraries and models for the first time - a historic world first!
But why did this happen and what could this space look like in 2024?
Predictions for 2024 - crystal ball time
Which brings me to my crystal ball moment of this edition.
"TLDR: 2024 will be the year of shrinking Gen AI models through techniques such as distillation, along with "hybrid AI" approaches. Especially for models that need to do a certain class of tasks well (e.g. code completion, summarization, translation, speech recognition etc). In this case smaller niche models will be made, falling back to server only when client not powerful enough to run that smaller model or if state of the art performance required (hybrid AI). You heard it here first." Jason Mayes, Dec 2023.
I predict that in 2024 AI in JavaScript, client side in the browser, using technologies like WebGPU, WebAssembly, WebGL et al, will continue to rise exponentially across the popular stacks mentioned above. We will also see more conversions of Gen AI models for specific niche tasks, which is where the strength of on device machine learning really can shine, as these smaller models can fit on a single GPU client side in the browser, reducing costs for all.
Currently there are many HUGE large language models out there from all the big names, that are costly to run on the server, resulting in many top companies now charging for access to their APIs as it's just too expensive to run for free.
Therefore, my big prediction is that 2024 will be a year of compacting, distilling, and generally shrinking Gen AI models so they can run comfortably on high power consumer devices (as a starting point) for more specific or niche tasks (e.g. code completion, summarization, translation etc). The web is the obvious place to deploy them, with over 4 billion browser enabled devices out there and everyone having a website they want to add smarts to, for seamless customer interactions or killer features in the web versions of their apps (like Adobe Photoshop below).
Now what about low power devices I hear you cry?!
Well, to solve that we shall likely see a "hybrid" approach evolve in 2024. This is where, if your client device has enough horsepower to run the ML model locally, it will download and cache the model to entirely remove the need for any server side inference. GG.
If it is a lower end device however, the model could be broken up between client and server, doing the more trivial things on device (like tokenizing sentences before passing to model), and the heavier processing server side, until hardware catches up. This breaking up of the model also helps to keep some of the benefits of Web AI, such as privacy, as you would be sending high dimensional encodings from lower layers of the model instead of the raw data itself (which is a little nicer than sending the raw audio or image data).
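To make the hybrid idea concrete, here is a minimal sketch of how a page might choose between the two paths described above. Everything named here is hypothetical and for illustration only: the `DeviceProfile` fields, the `planInference` function, the step labels, and the simple "GPU backend plus enough memory" rule are my assumptions, not an API from any of the libraries mentioned in this newsletter.

```typescript
// Hypothetical summary of the client device. In a real page these values
// would come from feature detection; here they are passed in directly.
interface DeviceProfile {
  hasGpuBackend: boolean; // assumption: a GPU backend such as WebGPU or WebGL is usable
  memoryGB: number;       // assumption: rough memory available to the model
}

// Which steps run where for a given request.
interface InferencePlan {
  mode: "client" | "split";
  clientSteps: string[];
  serverSteps: string[];
}

// Decide where inference runs. The rule is a deliberately simple assumption:
// run fully on device only when a GPU backend exists and the model fits in memory.
function planInference(device: DeviceProfile, modelMemoryGB: number): InferencePlan {
  const canRunLocally = device.hasGpuBackend && device.memoryGB >= modelMemoryGB;

  if (canRunLocally) {
    // Powerful client: no server side inference is needed at all.
    return {
      mode: "client",
      clientSteps: ["download and cache model", "tokenize input", "run full model"],
      serverSteps: [],
    };
  }

  // Lower end client: keep the trivial work on device and send encodings,
  // rather than the raw data, to the server for the heavy processing.
  return {
    mode: "split",
    clientSteps: ["tokenize input", "run lower layers"],
    serverSteps: ["run remaining layers on the encodings"],
  };
}

// Example: a 4 GB model on two hypothetical devices.
const laptopPlan = planInference({ hasGpuBackend: true, memoryGB: 16 }, 4);
const budgetPhonePlan = planInference({ hasGpuBackend: false, memoryGB: 3 }, 4);
```

In this sketch `laptopPlan` comes back in "client" mode with nothing left for the server, while `budgetPhonePlan` comes back in "split" mode. A real deployment would need a better capability check than two fields, but the shape of the decision stays the same.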
With time folks like Intel, AMD, Qualcomm et al will redesign their hardware such that even mid-low tier devices will have more shared memory available at lower price points so that more AI workloads can be offloaded to the client device entirely, using either the CPU or GPU, reducing server costs further, which will in turn drive up the profitability of the companies that offer server side solutions currently. The server will become a place to train models, or host the largest most cutting edge models, and all else will slowly migrate to client side inference, because it makes business sense if you want to actually make profits and scale to millions of users.
Right now, having enough memory is the main bottleneck, given most consumer devices have around 8GB - 16GB CPU RAM or 8GB dedicated VRAM on GPUs. Shared architectures like Apple's M1/M2 have proven to work very well for Web AI in the browser in our testing, only being beaten by the likes of NVIDIA 4080/4090s et al for raw AI client side performance in browser.
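A quick back-of-the-envelope sketch shows why memory is the bottleneck and why shrinking models matters. It assumes that weight storage is simply parameter count times bytes per parameter, counts 1 GB as 10^9 bytes, and ignores activations and any runtime overhead, so treat the numbers as a lower bound. The function names and the precision table are hypothetical helpers written for this example.

```typescript
// Bytes needed to store one weight at some common precisions.
const BYTES_PER_PARAM = { fp32: 4, fp16: 2, int8: 1, int4: 0.5 } as const;
type Precision = keyof typeof BYTES_PER_PARAM;

// Memory for the weights alone, in GB (1 GB = 1e9 bytes).
// Activations and runtime overhead are ignored, so this is a lower bound.
function weightMemoryGB(paramCount: number, precision: Precision): number {
  return (paramCount * BYTES_PER_PARAM[precision]) / 1e9;
}

// True when the weights alone fit in the memory budget.
function fitsInMemory(paramCount: number, precision: Precision, availableGB: number): boolean {
  return weightMemoryGB(paramCount, precision) <= availableGB;
}

// A 7 billion parameter model on a device with 8 GB available:
const fp16SizeGB = weightMemoryGB(7e9, "fp16"); // 14 GB, too big
const int4SizeGB = weightMemoryGB(7e9, "int4"); // 3.5 GB, fits
const fitsAtFp16 = fitsInMemory(7e9, "fp16", 8); // false
const fitsAtInt4 = fitsInMemory(7e9, "int4", 8); // true
```

Under these assumptions the same 7 billion parameter model goes from not fitting in 8GB to fitting with room to spare once it is stored at 4 bits per weight.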
Don't believe me? Keep reading to see that it's already happening...
Epic launches since last time
Adobe embraces Web AI
In September 2023 Adobe formally announced that Adobe Photoshop Web is out of beta and launched! The TensorFlow.js team have been working with Adobe since earlier in the year and it is great to see this launch - well done to Adobe!
Adobe are already embracing hybrid Web AI approaches it seems, running their epic Firefly Gen AI model in the cloud, while moving things like object selection tools that are also AI powered to the client side using Web ML models in the browser. Hybrid is essentially here already!
Web LLM
Want to use Llama 2 7B or 13B directly in the browser? Yeah, I thought you might, you Gen AI addict you. Well, thanks to this Web LLM project by Tianqi Chen and Hongyi Jin, you now can, and with ease too. Give it a go on their website and learn how to integrate it into your own projects:
Oh, and here is the GitHub in case you want to go straight to the code.
WebShap - explain any ML model in your browser
A JavaScript library that can explain any machine learning model on the Web. Yep, you heard me right. Check this demo for a visual example:
Read this direct quote for more juicy details though:
For better privacy and lower latency, researchers and developers have made strides in developing Web ML models that can run on clients' edge devices. To reflect the "transparency and explainability" ethical principle of Web ML, WebSHAP is the first JavaScript library that brings Kernel SHAP to the Web environment—enabling developers and end-users to explain any ML models anywhere.
Oh ok, pretty neat but what's that Kernel SHAP they mentioned? Well they explain that too:
Kernel SHAP is the state-of-the-art model-agnostic explainability technique to help ML practitioners and end-users understand how ML models make predictions. For any given ML model, Kernel SHAP computes the contribution score of each input feature. Developed with modern web technologies, such as WebGL and Web Workers, WebSHAP empowers users to run Kernel SHAP directly in their browsers, providing a ubiquitous, private, and interactive explanation experience.
Sounds promising. Give it a whirl on their website - this could be very useful to help with explainability of your models. You can also find the GitHub code here.
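If "contribution score of each input feature" feels abstract, here is a minimal sketch of the quantity involved. To be clear about what this is: it is not WebSHAP's API and it is not Kernel SHAP itself. It computes exact Shapley values by brute force over every subset of features, which is only practical for a handful of inputs; Kernel SHAP approximates the same values by sampling subsets and fitting a weighted linear model. The toy model, the input, and the all-zeros baseline used for "missing" features are all assumptions made for this example.

```typescript
type Model = (x: number[]) => number;

function factorial(n: number): number {
  let result = 1;
  for (let i = 2; i <= n; i++) result *= i;
  return result;
}

// Exact Shapley values by enumerating every subset of the other features.
// A feature that is "missing" from a subset takes its baseline value.
function shapleyValues(model: Model, input: number[], baseline: number[]): number[] {
  const n = input.length;

  // Evaluate the model with features in `mask` taken from the input
  // and all other features taken from the baseline.
  const evalSubset = (mask: number): number =>
    model(input.map((value, j) => (((mask >> j) & 1) === 1 ? value : baseline[j])));

  const phi = new Array<number>(n).fill(0);
  for (let i = 0; i < n; i++) {
    for (let mask = 0; mask < (1 << n); mask++) {
      if (((mask >> i) & 1) === 1) continue; // only subsets that exclude feature i

      let size = 0;
      for (let j = 0; j < n; j++) size += (mask >> j) & 1;

      // Shapley weight: |S|! * (n - |S| - 1)! / n!
      const weight = (factorial(size) * factorial(n - size - 1)) / factorial(n);

      // Marginal contribution of adding feature i to this subset.
      phi[i] += weight * (evalSubset(mask | (1 << i)) - evalSubset(mask));
    }
  }
  return phi;
}

// Hypothetical toy model: a simple weighted sum of three features.
const toyModel: Model = (x) => 2 * x[0] - 1 * x[1] + 0.5 * x[2];
const scores = shapleyValues(toyModel, [1, 2, 4], [0, 0, 0]);
```

For a weighted sum like this, each score works out to weight times (input minus baseline), so `scores` is approximately 2, -2 and 2 up to floating point rounding, and the scores add up to the difference between the model's output on the input and on the baseline.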
South East Asia embraces Web AI
One of the key reasons I missed writing the November/December editions of this newsletter was that I was teaching Web AI to folk over in south east Asia. It was a pleasure meeting you all IRL - over 3,000 of you in person! From Indonesia to Malaysia it was great to see such passion and so many JavaScript devs looking to level up their skills. The teachers of tomorrow are being made right now. Some highlights below.
AI/ML & Data Talks Podcast
Thanks to Poo Kuan Hoong, Ph.D for inviting me to his AI/ML & Data Talks podcast. Check out our interview here, which covers quite a bit, from my own background to future thoughts on Web AI:
If you enjoyed that he has plenty more interviews to check out too!
Indonesia - Yogyakarta
Malaysia - Kuala Lumpur
Malaysia - George Town
See you next time!
Finally, as always, if you've made something cool or seen a demo out in the wild, be sure to tag it with #WebML or #WebAI on LinkedIn / Twitter / social so we can find it for a chance to be in our newsletter, future events, or even our YouTube show and tell!
I shall leave you with this photo of my Web ML family here at Google at our 2nd ever summit this year. Can you spot me? Much love to you all and happy 2024!
If you are new to this space and want to learn Web AI, you can get started fast with my free Google Developers course here (no background in AI needed, just a love for JavaScript and curiosity for AI). I got you!
See you next time with even more great content. Cheers!
Jason Mayes