SundAI, your weekly overdose of artificial intelligence news: week 51

SundAI, your weekly overdose of artificial intelligence news: week 51

Welcome back to SundAI !

This week in AI was a dream on growth hormones, laced with crack cocaine and topped-up with amphetamines, and it was faster, bigger, and weirder than ever. DeepMind decided to shine with their new Gemini Flash 2.0 playthingy, and a mind-bending video generator, and OpenAI continued its “12 Days of Overhyped Features”, with video input in ChatGPT. And, Microsoft’s quietly snuck Phi-4 into the mix of models, with which it proved that smaller models can punch like heavyweights. And just when you thought things couldn’t get more unreal, them folks at Google announced their quantum computing chip, Willow, which makes a septillion-year feel like a quick walk in the park.

So, plug in your VPN, or strap-on your Mixnet (Hi Aleksey Malankin ??), and let’s get into this chaotic AI week 51.

Let’s rock ‘n roll.


TL;DR for the ones with attention deficit issues

  • DeepMind’s Gemini Flash 2.0 leveled up the multimodal game. It proved that smaller, faster models are the future of AI efficiency.
  • OpenAI added video features to ChatGPT Advanced Voice Mode. They are making sure that their chatbot can now do everything except wash your undies.
  • Microsoft’s Phi-4 showed off with its STEM-savvy reasoning. It was outperforming some of the biggest players in AI. And the cool thang is that it did all of AND while keeping it compact and efficient.
  • Google’s Willow quantum chip casually solved problems older than the universe in under five minutes. No biggie.
  • Cohere, Pika Labs, and Apple also threw their hats into the ring and treated us with some updates ranging from NLP models to generative emojis (yeey, Apple!).

Shorter than short: AI’s end-of-year sprint is progressing full speed, and it is impossible to keep up without a second (or third) can of Monster.


More after the commercial brake:


  1. Comment, or share the article; that will really help spread the word ??
  2. Connect with me on Linkedin ??
  3. Subscribe to TechTonic Shifts to get your daily dose of tech ??
  4. TechTonic Shifts has a new blog, full of rubbish you will like !


Lalalala, blablablaaaa hmm hhmm gave to me, two blablablabla, three blablablibla.....and a pear treeee?

I couln’t find the name of this Christmas rhyme, so if anyone can help out, you win a custom made Dall-E picture of me.

But what I do know about the holidays is that it is a time for giving, receiving, and... getting scammed

Scammers have upped their game this year, and they are using AI to ram fake ads up your…, launch sketchy shopping sites, and send your mom or dad masses of texts that steal their credit card details. Banks are already reporting a spike in fraud, so maybe double-check that “50% off designer lingerie for men” ad before clicking.


The AI job market is through the roof

Oh, here’s a feel-good one. The AI job market is set to explode in 2025. Companies are apparently scrambling to hire folks who actually KNOW HOW to make these models work. So first they ditch all their engineers, and now they double back to rehire them, who - in the mean time - have reskilled and resupplied themselves, and are asking twice as much.

Machine learning, AI implementation, and transformation roles are in high demand, but finding the right qualified candidates is harder than getting access to Sora on launch day. Non-tech companies are jumping on the AI bandwagon too, so if you’ve got skills, 2025 will be your year.

Source


Gemini Flash 2.0 and Veo 2: DeepMind’s answer to OpenAI

DeepMind was not fooling around this week. Gemini Flash 2.0 just went on stage, and it’s kicking the butts of its predecessors quite hard. Flash 2.0 is a leaner cousin of Gemini 1.0, but with better Oompa Loomps and faster inference than it’s competitor’s oversized models. This new baby flushed benchmarks like it was faxing a turd to Putin, with scores like:

  • MMMU Image Understanding: 70.7% (up from 59.4% last year).
  • MMLU Pro: 76.4% Who needs subtlety if you can outscore the competition?

And when you thought that DeepMind would take a breather, they launched their new V2-rocket: Veo 2 right into the stratosphere. Veo 2 is a text-to-video model that is generating 4K video that is so real that it might trick your eyes into thinking that you’re living in a simulation (by the way, there’s more research on the simulation front as well, but that’s for another time). Veo 2 even simulates physics, which makes it perfect for anyone needing realistic videos of calculating bullet trajectories or whatever else you concoct in your dreams.

Source

Oh, before I forget, the company also announced?Deep Research. That’s a tool for researching complex topics within Gemini advanced. This for me, is a good reason to ditch unnecessary subscriptions and buy into this pearl of a jewel of a gem.

Google’s Trillium AI accelerator chip

Aaaaand Google also introduced the Trillium AI accelerator chip this week. It’s kinda like Willow’s but less flashy and ridiculously efficient (translate: they have good Oompas). It has been assaulting benchmarks like an addict on Narcsn, and it is making Google look like the Usain Bolt of AI hardware.


OpenAI keeps the hype alive with ChatGPT video features

OpenAI continued with its 12 Days of AI Christmas with a ChatGPT Advanced Voice Mode update that adds video input and screen sharing. Basically, your chatbot can now act like a FaceTime buddy, except that this one identifies objects in your background and judges your IKEA bookshelf. It is of course exclusive to Plus (or Premium, I forgot, excusez moi) and Pro users for now, but this feature is another notch in OpenAI’s plan to make ChatGPT your digital ummm everything.

Source

Oh, yeah, let’s not forget OpenAI o3

OpenAI decided it had enough with its 12 Days of AI Christmas and without any publicity, they launched o3, and that is the sequel to o1 - if you did not learn how to count in grade school - but o3 is apparently better at math, science, and step-by-step problem-solving. A bit like everyone else, besides me, I guess. It is designed to tackle complex tasks that will make Google Gemini sweat hard under its twin armpits.

Aaand this is the overview of what they ejected during their 12 days of Xmas (hope Elon doesn’t pick up on this pun, else he’ll steal Christmas as well, like the Grinch he is):

  1. ChatGPT Pro and o1 release: Pay to play, people ! 200 bucks a month.
  2. Sora video generation model: Turn your dreams into 1080p nightmares.
  3. Canvas development tool: Perfect for overachievers and control freaks.
  4. Apple Intelligence integration: Siri gets a brain transplant but still delivers crap.
  5. Advanced voice mode, now with video & santa mode: Ho-ho-horrifying chats.
  6. Projects in ChatGPT: Finally, a way to organize your procrastination into neat little folders.
  7. ChatGPT search: Real-time Googling, now with judgment from your chatbot.
  8. Holiday treats for developers: API tweaks and "festive" debugging over eggnog.
  9. 1-800-CHATGPT: A hotline to talk to your AI therapist (no casting couch required).
  10. Work with apps: ChatGPT is in your apps now. Resistance is futile.
  11. Early access for safety testing: AKA guinea pig mode.
  12. Finale: The fireworks of announcements through hours and hours of streaming dread

And a dystopian future in a pear tree….


Microsoft’s welterweight belt model punches above its division

Microsoft’s Phi-4 is proof that size doesn’t always matter.

Uhhh…

Continuing… It has (just shy of) 14 billion parameters, and with it, this model is smoking bigger competitors like GPT-4 on topics like math and science tasks and it still has some room left for dessert. Phi-4 is coming soon to HuggingFace as well, so get ready to see what happens when you mix synthetic data with a dash of machine learning.

And yes, it was fully trained with synthehol! (Star Trek pun, ya dweeps)

Source


Apple’s Siri gets an umbilical to ChatGPT

Apple finally decided that Siri needed a walking cane and screwed a bit of ChatGPT integration in it. If you were so bold enough to run the latest iOS update, AND you do NOT live in the EU or China, you now get Siri to dabble in generative AI. And oh yeah, Apple has finally joined the big league of genAI players when it introduced Genmoji (custom AI emojis) and Image Playground. At least they’re showing up to the AI party: Apple Intelligence is late to the AI party and brought us… a new set of emoji’s

Source


Google spits qbits in the face of researchers

Want a quantum dominatrix? Rent Willow for a couple of hours and she will run rings around you. Apparently 10 reptilian years in 5 minutes, although I think a gag ball is much cheaper.

Google’s quantum chip Willow is making every other computer on the planet look like a dried up snail. This chip completed a task in 5 minutes that would take a supercomputer 10 septillion years. I even had to look it up, and that is apparently longer than the universe has existed. The implications are huge, but let’s be honest, the real question is: how soon until someone uses this for meme videos. Google tears a hole in the fabric of space-time and proves we live in a multiverse where everything still sucks

Source


Grok Is now free for all X abusers

Elon Musk just made Grok free for X’s non-premium users. Yeeeeey. I played with it when it was NOT free, and I could NOT see any value in it. Just another FOMO AI. Though X is free, it still is suffocating the life out of you with their 10 messages every two hours cap.

It has got its limitations, and this model is as mediocre as you can get, but with Grok going wide, Elon’s chatbot might finally have a shot at stealing a teensy weensy market share from ChatGPT. Whether it’s enough to compete with OpenAI, Google, and Microsoft remains to be seen. I think not. But hey, that’s why you have courts, aight…Welcome to the soap opera called "Open"-AI, starring Sam, Elon, and Mark, with heckling by Statler and Waldorf.

Source


SoftBank’s $100 billion investment in …. AI

Da fuck, was my first reaction when I read this in my RSS feed. You know $10 billion investment isn’t cool anymore, but blowing the bank with $100 billion is. SoftBank’s CEO Masayoshi Son announced this commitment to invest $100 billion in U.S. (alas) tech. It is for the most part AI-focused, and it is spanning about four years. It is said (not me!) that this mega-investment will create 100,000 jobs, so if you’re in the States, it’s time to digitize that paper CV and start dreaming big. But don’t take my word for it!

Source


The hottest AI News you weren’t looking for

  1. Google’s Gemini 2.0 Flash: Multimodal, multilingual, and built for agentic applications. Basically, this model wants to be the all- you-can-eat buffet of AI.
  2. OpenAI’s Sora goes Remi: Know Remi - alone in the world? Look it up! OpenAIght created text-to-video in 1080p glory with Sora, it’s newest standalone product, and watermarks included.
  3. Cohere command R7B: Cool name, and also the fastest, smallest member of Cohere’s lineup, and that is perfect for enterprises that don’t want to break the bank on massive models.
  4. OpenAI Projects: ChatGPT gets folders! Finally! Now you can organize…chats? and also files, if you don’t have a file system on your computer.
  5. Pika Labs 2.0: AI video production just got a tad sharper, with new scene Ingredients for better storytelling. It is said that this rivals Sora. I have tried it. Skip it.


Why should you even care about all these lame developments?

DeepMind’s Gemini Flash 2.0 is kind of a game-changer, but not because it is breaking benchmarks (again), but because it’s doing so on smaller, faster models that cost less to run. Less energy and less water means more time for us to breath clean air. And lower latency and higher efficiency also mean that these models are primed for real-time, agentic applications, which is the next hype of 2025. Yeeey ! AI assistants on your phone, browser, and wearables, as if I don’t have enough shit to worry about.

Oh, before I forget, OpenAI’s and Microsoft’s updates means that they are emphasizing a tad more on pesky things like accessibility and usability. The competition is warming up for a second round of agentic fun in 2025, and that will mean more innovation at breakneck speed.


Reads/vids to pretend you’re working, while secretly dozing off

  1. The epic history of LLMs: A movie about how we got from simple RNNs to the ChatGPTs of today. As if we haven’t all seen a video about it the last few years.
  2. Multimodal RAG applications: With this video, you learn how to retrieve both text and images using vector stores. One modality is for loosers.
  3. How to actually build useful AI products: Good question, and the best read out there. Forget integrating AI in toilet paper to revolutionize wiping your butt. This article lays out what makes AI products actually useful.
  4. Run Gemini via OpenAI API: Yes, Google’s model works with OpenAI’s framework, and here’s the code to prove it.
  5. AI Tooling for Software Engineers in 2024: It is actually quite a usefull reality check on which tools are working, which ones are failing, and what’s just hype, and all in the grand effort to replace developers and creatives with zeroes and ones.


Repositories, tools, copy *.*

  1. MarkItDown: Convert files to Markdown like a pro. Why? Dunno.
  2. HunyuanVideo: A framework for large-scale video generation.
  3. DeepSeek-VL2: Vision-language models with MoE magic.
  4. TEN Agent: Your new conversational AI buddy, integrating Gemini’s Live API.
  5. Loveable. A generative AI development tool to help you go from Minimal Viable Product to Minimal Loveable Product. No coding skills required. Only English. Love it!


Research papers of the week

  1. Phi-4 technical report
  2. ReFT: Representation Finetuning for Language Models (here you go!)
  3. Training Large Language Models to reason in a continuous impotent space (sorry, but didn’t think you would get this far anyway)
  4. GenEx: Generating an explorable world
  5. FlashAttention on a napkin


Quicky links

  1. Harvard and Google’s public-domain AI library
  2. Meta Motivo: Making avatars dance
  3. Pika Labs 2.0: Sharper AI video tools


So, that’s it for this week. It was yet another wild ride in the AI rollercoaster. So I’ll be seeing you coming week, if the bots haven’t taken over TTS by then!

Signing off from the trenches of AI, where Siri pretends to help, Willow runs the show, and supercomputers cry in a corner,

Marco


Well, that’s a wrap for today. Tomorrow, I’ll have a fresh episode of TechTonic Shifts for you. If you enjoy my writing and want to support my work, feel free to buy me a coffee ??

Think a friend would enjoy this too? Share the newsletter and let them join the conversation. LinkedIn appreciates your likes by making my articles available to more readers.



To keep you doomscrolling ??


  1. Brace, brace brace! AI takes the stick at Heathrow’s air traffic control center | LinkedIn
  2. AI is a compulsive liar | LinkedIn
  3. In 2025, AI needs to put up or just shut up! | LinkedIn
  4. A 17 yo brat created a $1M/month app. Here’s how he did it. | LinkedIn
  5. This is a eulogy for chegg. Gone but not forgotten (unless you’re a student, then definitely otten) | LinkedIn
  6. Musk wants to make games great again | LinkedIn
  7. The great tech wake-up call: Developers, meet the dystopia you helped build | LinkedIn
  8. Flamethrower dogs, kamikaze cars, and bomb-planting humanoids. | LinkedIn
  9. Objection! Your honor, ChatGPT made me do it | LinkedIn
  10. A cautionary tale about an AI unicorn that turns into a fraudulent little pwny | LinkedIn
  11. Meet Daisy, the AI Granny who’s here to waste scammers’ lives | LinkedIn
  12. AI Search Engine Optimization | LinkedIn
  13. I’ve seen the dark side of AI, and you need to know about it | LinkedIn


Divanshu Anand

Enabling businesses increase revenue, cut cost, automate and optimize processes with algorithmic decision-making | Founder @Decisionalgo | Head of Data Science @Chainaware.ai | Former MuSigman

2 个月

This week's AI news was indeed mind-blowing! From DeepMind's Gemini Flash 2.0 pushing multimodal boundaries to Google's Willow quantum chip solving tasks in minutes, the AI landscape is evolving faster than ever. Also, OpenAI's new video input features and Microsoft’s Phi-4 model prove that innovation is accelerating at a breakneck pace. Get ready for 2025, because the AI market is booming with more groundbreaking technologies! The future is happening now!

Marco van Hurne

AI & ML advisory | Author of The Machine Learning Book of Knowledge | Building AI skills & organizations | Data Science | Data Governance | AI Compliance Officer | AI Governance

2 个月

Welcome friends! Always happy to see your faces here again, especially this time of year, when it’s dark at four, this feels like a warm homecoming!

要查看或添加评论,请登录

Marco van Hurne的更多文章

社区洞察

其他会员也浏览了