Web ML Monthly #14: India loves TensorFlow.js, 3 new demos, Meta AI runs segment anything in browser!
Jason Mayes
Web AI Lead @Google 13+yrs. Agent / LLM whisperer. On-device Artificial Intelligence / Machine Learning using Chrome | TensorFlow.js | MediaPipe. Web Engineering + innovation
Hey Tensors! August is upon us - time sure does fly, as does our weekly download growth, which just touched ~300K! If you enjoy the content, please do give us a share with friends, colleagues, family, dogs, and cats - everyone is welcome. Let's go!
Meta AI's Segment Anything (SAM) model runs in browser in real time!
It was only a matter of time before someone embraced Web ML over at Meta AI, and they have done an awesome job running their Segment Anything model (SAM) efficiently in the browser. SAM maps the image features and a set of prompt embeddings to a segmentation mask, and it seems to run in around 50ms in the browser on CPU. Check this out:
Judging by this quote from their write up, the model had to run client side in the browser to avoid cloud based latency - low latency being one of the many super powers of Web AI that you get by running client side:
"The model needs to run in real time on a CPU in a web browser to allow our annotators to use SAM interactively in real time to annotate efficiently"
This sort of technology could help folk who are collecting and annotating data for new ML models to perform their jobs faster than ever before. The possibilities in the future expand far and wide too as stated by Meta AI:
"SAM could become a component in larger AI systems for more general multimodal understanding of the world, for example, understanding both the visual and text content of a webpage"
I am excited to see how this plays out, or even how it could be mixed with WebXR in the future - imagine gazing at an object and being told information about it with a set of WebXR enabled AR glasses. This is totally possible - just replace the mouse hover in their demo with eye gaze! I look forward to seeing what other models Meta AI manages to port to run entirely in browser with the power of Web ML.
Try the web demo yourself: https://segment-anything.com/demo - Select an image and use your mouse to hover over objects in that image - all segmentation when hovering happens locally in browser on your machine - no cloud!
Learn more: Read the full write up from Meta AI here.
Segment Anything Homepage: https://segment-anything.com/
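For the curious, here is a rough sketch of one small piece a hover based demo like this needs: turning the mouse position over the displayed image into a prompt point in the model's own coordinate space. This is not taken from Meta AI's code. The helper name `hoverToPrompt` and the types are hypothetical, and the sketch assumes the model expects the longest side of the image to be scaled to 1024 pixels.

```typescript
// Hypothetical helper for illustration only, not Meta AI's demo code.
interface Size { width: number; height: number; }
interface Point { x: number; y: number; }

// Assumption: the model works on the image with its longest side scaled to 1024 px.
const MODEL_LONG_SIDE = 1024;

// Convert a mouse position over the displayed image into model input coordinates.
function hoverToPrompt(mouse: Point, displayed: Size, original: Size): Point {
  // Step 1: displayed (CSS) pixels -> original image pixels.
  const xOriginal = (mouse.x / displayed.width) * original.width;
  const yOriginal = (mouse.y / displayed.height) * original.height;

  // Step 2: original image pixels -> model pixels.
  const scale = MODEL_LONG_SIDE / Math.max(original.width, original.height);
  return { x: xOriginal * scale, y: yOriginal * scale };
}

// Example: a 2048x1024 photo shown at 400x200, with the mouse at (100, 50).
const prompt = hoverToPrompt(
  { x: 100, y: 50 },
  { width: 400, height: 200 },
  { width: 2048, height: 1024 },
);
console.log(prompt.x, prompt.y);
```

The resulting point would then be handed to the model as a prompt each time the mouse moves, which is why a fast in-browser model matters so much for this kind of interaction.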
New Web ML demos for your viewing pleasure
General purpose hand tracking for Three.js
The amazing @KMKota0 over on Twitter has created this very impressive demo that's designed to use your hands to interact with 3D scenes made in the popular Three.js library:
It looks super cool, is powered by our MediaPipe hand tracking model, and has tonnes of potential for the future too. Go give it some love - how would you use something like this in your next game or digital experience?
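If you want to experiment with the idea yourself, here is a minimal sketch of two building blocks such an interaction could use. It is not @KMKota0's code. It assumes you already have the 21 hand landmarks that MediaPipe hand tracking returns, with x and y normalized to the 0 to 1 range, where index 4 is the thumb tip and index 8 is the index finger tip. The function names and the 0.05 pinch threshold are hypothetical choices for illustration.

```typescript
// Illustrative sketch only. Landmark layout assumed: 21 points per hand,
// x and y normalized to [0, 1] relative to the video frame.
interface Landmark { x: number; y: number; z: number; }

const THUMB_TIP = 4;        // thumb tip index in the hand landmark list
const INDEX_FINGER_TIP = 8; // index finger tip index in the hand landmark list

// Treat the hand as "pinching" when thumb and index tips are close together.
// The default threshold is an arbitrary starting point, tune it for your camera.
function isPinching(hand: Landmark[], threshold: number = 0.05): boolean {
  const thumb = hand[THUMB_TIP];
  const index = hand[INDEX_FINGER_TIP];
  if (!thumb || !index) {
    return false; // incomplete landmark list
  }
  return Math.hypot(thumb.x - index.x, thumb.y - index.y) < threshold;
}

// Map a normalized landmark to normalized device coordinates (-1 to 1, y up),
// the range a 3D library such as Three.js uses for raycasting from the camera.
function toNdc(point: Landmark): { x: number; y: number } {
  return { x: point.x * 2 - 1, y: 1 - point.y * 2 };
}
```

A pinch could then act like a mouse click, with the `toNdc` output of the index finger tip used as the pointer position. Note that this sketch does not handle a mirrored webcam view, so you may need to flip x first.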
Keep yourself centered
We've all been there: in a meeting on a video call, moving around, and before you know it you're off to the far right of the camera and everyone is wondering why you keep moving so much... Well, Screen Run App is here to the rescue, releasing their new Web ML feature that keeps your face center of attention in any video call:
A solid upgrade to any web based video conferencing app - I hope to see more folk embrace Web ML in their web apps soon, *cough*, ZOOM, *cough* - any subscribers here work at ZOOM? Please tell their web team to have parity with native features ;-) Web ML is here and usable right now - no reason not to.
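How might a feature like this work? Screen Run has not shared their implementation here, so the following is just one plausible sketch: assume a face detection model gives you a bounding box in pixels for each video frame, then pick a crop window centered on that box and smooth it over time so the picture does not jitter. All names below are hypothetical, and the sketch assumes the crop is no larger than the frame.

```typescript
// One possible approach, not Screen Run's actual implementation.
interface Rect { x: number; y: number; width: number; height: number; }

function clamp(value: number, min: number, max: number): number {
  return Math.min(Math.max(value, min), max);
}

// Crop window of a fixed size, centered on the face as far as the frame edges allow.
// Assumes cropWidth <= frameWidth and cropHeight <= frameHeight.
function centeredCrop(
  face: Rect,
  frameWidth: number,
  frameHeight: number,
  cropWidth: number,
  cropHeight: number,
): Rect {
  const faceCenterX = face.x + face.width / 2;
  const faceCenterY = face.y + face.height / 2;
  const x = clamp(faceCenterX - cropWidth / 2, 0, frameWidth - cropWidth);
  const y = clamp(faceCenterY - cropHeight / 2, 0, frameHeight - cropHeight);
  return { x, y, width: cropWidth, height: cropHeight };
}

// Exponential smoothing: move a fraction (alpha) of the way to the target each frame.
function smoothCrop(previous: Rect, target: Rect, alpha: number = 0.2): Rect {
  const mix = (a: number, b: number): number => a + (b - a) * alpha;
  return {
    x: mix(previous.x, target.x),
    y: mix(previous.y, target.y),
    width: mix(previous.width, target.width),
    height: mix(previous.height, target.height),
  };
}
```

Each frame, the smoothed rectangle could be passed as the source region to the canvas `drawImage` call that renders the video, so the outgoing stream follows your face as you move.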
Whisper Web - ML-powered speech recognition directly in your browser
The infamous Joshua Lochner (original creator of Transformers.js), now at Hugging Face, has released a legendary demo that can transcribe your audio recordings super fast right in the browser powered by Web ML tech! This one you have got to try for yourself as it's really awesome:
Imagine being able to transcribe a video call that is powered by WebRTC right in the browser to get meeting notes while maintaining your privacy. Now you can. Or what about speaking to a game and passing the text transcription to an LLM for interpretation. You can now also do that. What will you make?
Try it for yourself here: https://huggingface.co/spaces/Xenova/whisper-web
India tour - 4 talks reaching ~1500 people IRL!
If you were wondering why my socials were a bit slow over the last month, well that's because I was out in the field speaking at 4 amazing events over in India across the country spreading the love for Web ML generally, along with TensorFlow.js, MediaPipe, and Visual Blocks!
1. Google I/O Connect Bengaluru
First up, the official Google I/O Connect event was hosted at the end of June, with around 2800 attendees. I gave a talk on the latest advances in the Google Web ML space and hosted a very popular demo booth with my colleagues, which was a huge hit with attendees.
It was so busy I do believe my colleague and I had no voice left by the end of the day. It was a real honour to meet so many of you who I had crossed paths with here on LinkedIn or through my Google Developers course where many of you took your first steps in learning Web ML - grateful to have put some faces to the screen names!
Thanks for all the questions and passion (and kudos to the folk who travelled from North India just to meet in person for this event). I hope to see you again next year.
2. Mysuru - Vidyavardhaka College of Engineering
Up next I headed on to meet university students over in Mysuru at Vidyavardhaka College of Engineering to educate them on the importance of AI/ML and how Web ML can play a role in their future projects, ideas, and creations.
Thank you to Usha C S for hosting me and helping to spread the love for TensorFlow.js!
3. Dev Days Hyderabad
My third talk was hosted by Swecha Telangana for #DevDaysHyd where we had hundreds of folk join for a full day session covering a whole host of topics including Web ML of course. A sneak peek from the stage view as folk were settling in:
Thank you to Ranjith Raj Vasam for helping to organize the event and getting such a great turn out - full house!
4. GDG New Delhi
My final talk in India was to GDG New Delhi. Despite the huge flooding due to monsoons (some attendees took over 3 hours to get to the venue - now that's dedication), we asked the audience at the end of the talk what they thought about Web ML / web tech these days and the response was overwhelmingly "Awesome!" - so much love!
See you next time!
Finally, if you've made something cool or seen a demo out in the wild, be sure to tag it with #WebML or #WebAI on LinkedIn / Twitter / social so we can find it for a chance to be in our newsletter, future events, or even our YouTube show and tell!
See you next month with even more great #WebML content that's #MadeWithTFJS and beyond. Cheers!
Jason