SundAI, your weekly overdose of artificial intelligence news: week 51

Marco van Hurne

AI & ML advisory | Author of The Machine Learning Book of Knowledge | Building AI skills & organizations | Data Science | Data Governance | AI Compliance Officer | AI Governance

Published: December 22, 2024

Welcome back to SundAI !

This week in AI was a dream on growth hormones, laced with crack cocaine and topped up with amphetamines, and it was faster, bigger, and weirder than ever. DeepMind decided to shine with their new Gemini Flash 2.0 playthingy and a mind-bending video generator, and OpenAI continued its “12 Days of Overhyped Features” with video input in ChatGPT. And Microsoft quietly snuck Phi-4 into the mix of models, with which it proved that smaller models can punch like heavyweights. And just when you thought things couldn’t get more unreal, them folks at Google announced their quantum computing chip, Willow, which makes a septillion years feel like a quick walk in the park.

So, plug in your VPN, or strap on your Mixnet (Hi Aleksey Malankin), and let’s get into this chaotic AI week 51.

Let’s rock ‘n roll.

TL;DR for the ones with attention deficit issues

DeepMind’s Gemini Flash 2.0 leveled up the multimodal game. It proved that smaller, faster models are the future of AI efficiency.
OpenAI added video features to ChatGPT Advanced Voice Mode. They are making sure that their chatbot can now do everything except wash your undies.
Microsoft’s Phi-4 showed off with its STEM-savvy reasoning. It outperformed some of the biggest players in AI. And the cool thang is that it did all of that while staying compact and efficient.
Google’s Willow quantum chip casually solved, in under five minutes, a problem that would keep a supercomputer busy for longer than the universe has existed. No biggie.
Cohere, Pika Labs, and Apple also threw their hats into the ring and treated us to some updates ranging from NLP models to generative emojis (yeey, Apple!).

Shorter than short: AI’s end-of-year sprint is progressing full speed, and it is impossible to keep up without a second (or third) can of Monster.

More after the commercial break:

Comment, or share the article; that will really help spread the word
Connect with me on LinkedIn
Subscribe to TechTonic Shifts to get your daily dose of tech
TechTonic Shifts has a new blog, full of rubbish you will like !

Lalalala, blablablaaaa hmm hhmm gave to me, two blablablabla, three blablablibla.....and a pear treeee?

I couldn’t find the name of this Christmas rhyme, so if anyone can help out, you win a custom-made DALL-E picture of me.

But what I do know about the holidays is that it is a time for giving, receiving, and... getting scammed.

Scammers have upped their game this year, and they are using AI to ram fake ads up your…, launch sketchy shopping sites, and send your mom or dad masses of texts that steal their credit card details. Banks are already reporting a spike in fraud, so maybe double-check that “50% off designer lingerie for men” ad before clicking.

The AI job market is through the roof

Oh, here’s a feel-good one. The AI job market is set to explode in 2025. Companies are apparently scrambling to hire folks who actually KNOW HOW to make these models work. So first they ditch all their engineers, and now they double back to rehire them. In the meantime, those engineers have reskilled and resupplied themselves, and are asking twice as much.

Machine learning, AI implementation, and transformation roles are in high demand, but finding the right qualified candidates is harder than getting access to Sora on launch day. Non-tech companies are jumping on the AI bandwagon too, so if you’ve got skills, 2025 will be your year.

Source

Gemini Flash 2.0 and Veo 2: DeepMind’s answer to OpenAI

DeepMind was not fooling around this week. Gemini Flash 2.0 just went on stage, and it’s kicking the butts of its predecessors quite hard. Flash 2.0 is a leaner cousin of Gemini 1.0, but with better Oompa Loompas and faster inference than its competitors’ oversized models. This new baby flushed benchmarks like it was faxing a turd to Putin, with scores like:

MMMU Image Understanding: 70.7% (up from 59.4% last year).
MMLU Pro: 76.4%.

Who needs subtlety if you can outscore the competition?

And just when you thought that DeepMind would take a breather, they launched their new V2-rocket, Veo 2, right into the stratosphere. Veo 2 is a text-to-video model that generates 4K video so real that it might trick your eyes into thinking that you’re living in a simulation (by the way, there’s more research on the simulation front as well, but that’s for another time). Veo 2 even simulates physics, which makes it perfect for anyone needing realistic videos of bullet trajectories or whatever else you concoct in your dreams.

Source

Oh, before I forget, the company also announced Deep Research. That’s a tool for researching complex topics within Gemini Advanced. This, for me, is a good reason to ditch unnecessary subscriptions and buy into this pearl of a jewel of a gem.

Google’s Trillium AI accelerator chip

Aaaaand Google also introduced the Trillium AI accelerator chip this week. It’s kinda like Willow, but less flashy and ridiculously efficient (translate: they have good Oompas). It has been assaulting benchmarks like an addict on Narcan, and it is making Google look like the Usain Bolt of AI hardware.

OpenAI keeps the hype alive with ChatGPT video features

OpenAI continued with its 12 Days of AI Christmas with a ChatGPT Advanced Voice Mode update that adds video input and screen sharing. Basically, your chatbot can now act like a FaceTime buddy, except that this one identifies objects in your background and judges your IKEA bookshelf. It is of course exclusive to Plus (or Premium, I forgot, excusez moi) and Pro users for now, but this feature is another notch in OpenAI’s plan to make ChatGPT your digital ummm everything.

Source

Oh, yeah, let’s not forget OpenAI o3

OpenAI decided it had had enough of its 12 Days of AI Christmas and, without much publicity, announced o3. That is the sequel to o1 (if you did not learn how to count in grade school), and o3 is apparently better at math, science, and step-by-step problem-solving. A bit like everyone else, besides me, I guess. It is designed to tackle complex tasks that will make Google Gemini sweat hard under its twin armpits.

Aaand this is the overview of what they ejected during their 12 days of Xmas (hope Elon doesn’t pick up on this pun, else he’ll steal Christmas as well, like the Grinch he is):

ChatGPT Pro and o1 release: Pay to play, people ! 200 bucks a month.
Sora video generation model: Turn your dreams into 1080p nightmares.
Canvas development tool: Perfect for overachievers and control freaks.
Apple Intelligence integration: Siri gets a brain transplant but still delivers crap.
Advanced Voice Mode, now with video & Santa mode: Ho-ho-horrifying chats.
Projects in ChatGPT: Finally, a way to organize your procrastination into neat little folders.
ChatGPT search: Real-time Googling, now with judgment from your chatbot.
Holiday treats for developers: API tweaks and "festive" debugging over eggnog.
1-800-CHATGPT: A hotline to talk to your AI therapist (no casting couch required).
Work with apps: ChatGPT is in your apps now. Resistance is futile.
Early access for safety testing: AKA guinea pig mode.
Finale: The fireworks of announcements through hours and hours of streaming dread.

And a dystopian future in a pear tree….

Microsoft’s welterweight belt model punches above its division

Microsoft’s Phi-4 is proof that size doesn’t always matter.

Uhhh…

Continuing… It has (just shy of) 14 billion parameters, and with that, this model is smoking bigger competitors like GPT-4 on math and science tasks, and it still has some room left for dessert. Phi-4 is coming soon to Hugging Face as well, so get ready to see what happens when you mix synthetic data with a dash of machine learning.

And yes, it was fully trained with synthehol! (Star Trek pun, ya dweeps)

Source

Apple’s Siri gets an umbilical to ChatGPT

Apple finally decided that Siri needed a walking cane and screwed a bit of ChatGPT integration into it. If you were bold enough to install the latest iOS update, AND you do NOT live in the EU or China, you now get Siri to dabble in generative AI. And oh yeah, Apple has finally joined the big league of genAI players by introducing Genmoji (custom AI emojis) and Image Playground. At least they’re showing up to the AI party: Apple Intelligence is late to the AI party and brought us… a new set of emojis.

Source

Google spits qubits in the face of researchers

Want a quantum dominatrix? Rent Willow for a couple of hours and she will run rings around you. Apparently 10 reptilian years in 5 minutes, although I think a gag ball is much cheaper.

Google’s quantum chip Willow is making every other computer on the planet look like a dried-up snail. This chip completed a task in 5 minutes that would take a supercomputer 10 septillion years. I even had to look it up, and that is apparently longer than the universe has existed. The implications are huge, but let’s be honest, the real question is: how soon until someone uses this for meme videos? Google tears a hole in the fabric of space-time and proves we live in a multiverse where everything still sucks.

Source
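
For the ones who also had to look it up: here is a quick back-of-the-envelope check of that "longer than the universe has existed" claim. It is a rough sketch and not Google's own math. The universe's age of roughly 13.8 billion years is my assumption (it is the commonly cited estimate and does not come from the announcement), and the function name `universe_lifetimes` is made up for this example.

```python
# Back-of-the-envelope check of the "10 septillion years" figure.
# Assumption (not from the article): the universe is ~13.8 billion years old.
SEPTILLION = 10**24                 # short-scale septillion
classical_years = 10 * SEPTILLION   # quoted supercomputer runtime, in years
UNIVERSE_AGE_YEARS = 13.8e9         # assumed, commonly cited estimate


def universe_lifetimes(years: float, universe_age: float = UNIVERSE_AGE_YEARS) -> float:
    """Return how many times the universe's age fits into the given span."""
    return years / universe_age


ratio = universe_lifetimes(classical_years)
print(f"{classical_years:.0e} years is roughly {ratio:.1e} universe lifetimes")
```

Under that assumption, the quoted runtime is hundreds of trillions of universe lifetimes, so "longer than the universe has existed" is putting it mildly.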

Grok is now free for all X abusers

Elon Musk just made Grok free for X’s non-premium users. Yeeeeey. I played with it when it was NOT free, and I could NOT see any value in it. Just another FOMO AI. Though Grok is now free, it is still suffocating the life out of you with its cap of 10 messages every two hours.

It has got its limitations, and this model is as mediocre as you can get, but with Grok going wide, Elon’s chatbot might finally have a shot at stealing a teensy weensy market share from ChatGPT. Whether it’s enough to compete with OpenAI, Google, and Microsoft remains to be seen. I think not. But hey, that’s why you have courts, aight… Welcome to the soap opera called "Open"-AI, starring Sam, Elon, and Mark, with heckling by Statler and Waldorf.

Source

SoftBank’s $100 billion investment in …. AI

Da fuck, was my first reaction when I read this in my RSS feed. You know, a $10 billion investment isn’t cool anymore, but blowing the bank with $100 billion is. SoftBank’s CEO Masayoshi Son announced a commitment to invest $100 billion in U.S. (alas) tech. It is for the most part AI-focused, and it spans about four years. It is said (not by me!) that this mega-investment will create 100,000 jobs, so if you’re in the States, it’s time to digitize that paper CV and start dreaming big. But don’t take my word for it!

Source

The hottest AI News you weren’t looking for

Google’s Gemini 2.0 Flash: Multimodal, multilingual, and built for agentic applications. Basically, this model wants to be the all-you-can-eat buffet of AI.
OpenAI’s Sora goes Remi: Know Remi - alone in the world? Look it up! OpenAIght created text-to-video in 1080p glory with Sora, its newest standalone product, watermarks included.
Cohere Command R7B: Cool name, and also the fastest, smallest member of Cohere’s lineup, which makes it perfect for enterprises that don’t want to break the bank on massive models.
OpenAI Projects: ChatGPT gets folders! Finally! Now you can organize…chats? and also files, if you don’t have a file system on your computer.
Pika Labs 2.0: AI video production just got a tad sharper, with new Scene Ingredients for better storytelling. It is said that this rivals Sora. I have tried it. Skip it.

Why should you even care about all these lame developments?

DeepMind’s Gemini Flash 2.0 is kind of a game-changer, not because it is breaking benchmarks (again), but because it’s doing so on smaller, faster models that cost less to run. Less energy and less water means more time for us to breathe clean air. And lower latency and higher efficiency also mean that these models are primed for real-time, agentic applications, which is the next hype of 2025. Yeeey ! AI assistants on your phone, browser, and wearables, as if I don’t have enough shit to worry about.

Oh, before I forget, OpenAI’s and Microsoft’s updates mean that they are putting a tad more emphasis on pesky things like accessibility and usability. The competition is warming up for a second round of agentic fun in 2025, and that will mean more innovation at breakneck speed.

Reads/vids to pretend you’re working, while secretly dozing off

The epic history of LLMs: A movie about how we got from simple RNNs to the ChatGPTs of today. As if we haven’t all seen a video about it the last few years.
Multimodal RAG applications: With this video, you learn how to retrieve both text and images using vector stores. One modality is for losers.
How to actually build useful AI products: Good question, and the best read out there. Forget integrating AI in toilet paper to revolutionize wiping your butt. This article lays out what makes AI products actually useful.
Run Gemini via OpenAI API: Yes, Google’s model works with OpenAI’s framework, and here’s the code to prove it.
AI Tooling for Software Engineers in 2024: It is actually quite a useful reality check on which tools are working, which ones are failing, and what’s just hype, all in the grand effort to replace developers and creatives with zeroes and ones.

Repositories, tools, copy *.*

MarkItDown: Convert files to Markdown like a pro. Why? Dunno.
HunyuanVideo: A framework for large-scale video generation.
DeepSeek-VL2: Vision-language models with MoE magic.
TEN Agent: Your new conversational AI buddy, integrating Gemini’s Live API.
Lovable: A generative AI development tool to help you go from Minimal Viable Product to Minimal Loveable Product. No coding skills required. Only English. Love it!

Research papers of the week

Phi-4 technical report
ReFT: Representation Finetuning for Language Models (here you go!)
Training Large Language Models to reason in a continuous impotent space (sorry, but didn’t think you would get this far anyway)
GenEx: Generating an explorable world
FlashAttention on a napkin

Quicky links

So, that’s it for this week. It was yet another wild ride on the AI rollercoaster. So I’ll be seeing you next week, if the bots haven’t taken over TTS by then!

Signing off from the trenches of AI, where Siri pretends to help, Willow runs the show, and supercomputers cry in a corner,

Marco

Well, that’s a wrap for today. Tomorrow, I’ll have a fresh episode of TechTonic Shifts for you. If you enjoy my writing and want to support my work, feel free to buy me a coffee.

Think a friend would enjoy this too? Share the newsletter and let them join the conversation. LinkedIn appreciates your likes by making my articles available to more readers.

To keep you doomscrolling
