AI News #69: Week Ending 01/24/2025 with Executive Summary, Top 63 Links, and Helpful Visuals

Web version: https://ethanbholland.com/2025/01/28/ai-news-69-week-ending-01-24-2025-with-executive-summary-top-63-links-and-helpful-visuals/

About This Week’s Covers

This week's cover represents a dramatic shift in power. For years, closed frontier models like GPT and Claude have comfortably basked in the spotlight. This week the world noticed that open-source models like DeepSeek are sneaking up from behind. New versions of Grok and Llama are coming. There are no moats. Open-source is relentless. Image created with Ideogram, upscaled with Magnific, and edited in Photoshop.

The rest of the covers were created using Claude 3.5 and the Ideogram API, with the theme 'Grim Reaper + category name.' Six of the better ones are below:

This Week’s Executive Summaries

There are three giant stories in this week’s edition. The first is the announcement of Stargate, a $500 billion project between OpenAI, Japanese investment firm SoftBank, Oracle, and the US government. The second is a pretty cool evolution of the GPT tool that allows ChatGPT to use computers. The third is an inexpensive yet devastatingly powerful model from China called DeepSeek.

Each one of these merits a completely separate newsletter, so I hope you will take your time and read these carefully!

First, I went back and gathered all the headlines I could remember from last year that connect with Stargate and DeepSeek. It’s pretty neat to look at these headlines with 20/20 hindsight. SoftBank announced back in February that it wanted to get into the chip business, and DeepSeek made its first big splash in September.

A timeline of headlines leading up to DeepSeek and Stargate: https://ethanbholland.com/2025/01/22/48-ai-headlines-leading-up-to-stargate/

1: Stargate

OpenAI Launches $500 Billion 'Stargate Project' to Build Massive AI Infrastructure

OpenAI and tech giants are joining forces in a historic $500 billion initiative to build new AI computing infrastructure across the United States. The Stargate Project, led by OpenAI and SoftBank, will immediately deploy $100 billion to construct AI computing campuses, starting in Texas. The project brings together tech companies including Microsoft, NVIDIA, Oracle, and Arm (a British semiconductor and software design firm), with SoftBank's Masayoshi Son as chairman. The project plans to create hundreds of thousands of American jobs, and aims to strengthen U.S. leadership in artificial intelligence while supporting national security interests. The initiative expands on OpenAI's existing partnerships, particularly its long-standing collaboration with NVIDIA dating back to 2016 and its ongoing work with Microsoft's Azure platform.

Openai | openai | gdb

Stargate Project will invest $500B over the next 4 years - that's ~0.4% of US GDP over that period.

For comparison, the inflation-adjusted dollars spent on other large undertakings:

  • Interstate Highway System: ~$650B
  • Apollo Program: ~$280B
  • Manhattan Project: ~$35B

tanayj
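The "~0.4% of US GDP" figure in the post above can be roughly checked with back-of-the-envelope arithmetic. The sketch below is only illustrative: the annual US GDP value (about $29 trillion, held flat for all four years) is an assumption, not a number from the post, and the function names are made up for this example.

```python
# Rough check of the "~0.4% of US GDP over 4 years" claim.
# ASSUMPTION (not from the post): annual US GDP of about $29 trillion,
# held constant across the period for simplicity.
ASSUMED_ANNUAL_US_GDP_B = 29_000  # billions of dollars

STARGATE_TOTAL_B = 500  # billions of dollars, from the announcement
YEARS = 4

# Inflation-adjusted comparison figures quoted in the post (billions of dollars).
COMPARISONS_B = {
    "Interstate Highway System": 650,
    "Apollo Program": 280,
    "Manhattan Project": 35,
}


def share_of_gdp(total_b, years, annual_gdp_b):
    """Return total spending as a percentage of cumulative GDP over the period."""
    return 100 * total_b / (years * annual_gdp_b)


def stargate_multiple(other_b):
    """Return how many times larger Stargate's budget is than another project's."""
    return STARGATE_TOTAL_B / other_b


if __name__ == "__main__":
    pct = share_of_gdp(STARGATE_TOTAL_B, YEARS, ASSUMED_ANNUAL_US_GDP_B)
    print(f"Stargate share of cumulative GDP: {pct:.2f}%")
    for name, cost_b in COMPARISONS_B.items():
        print(f"Stargate vs. {name}: {stargate_multiple(cost_b):.1f}x")
```

Under that GDP assumption the share lands a little above 0.4%, which is consistent with the post's rounding. A different GDP assumption, or a growing GDP, would shift the result slightly.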

Masayoshi Son: "Mr. President, last month I came to celebrate your winning and promised $100B. And you told me go for $200B. Now I came back with $500B. This is because as you say, this is the beginning of the Golden Age. We wouldn't have decided this unless you won."

AutismCapital

2: OpenAI Operator

Before we dive into OpenAI’s Operator announcement, I want to note that Anthropic introduced computer-use capabilities last fall. In November, they also released a standard, the Model Context Protocol, which defines a common way for AI agents to connect to external data sources and tools. Right now, we're seeing a low-key demo of multimodality (computer vision and mimicry) that demonstrates AI's ability to see and act. Long term, this will completely change interface design.

Here’s Anthropic’s Model Context Protocol: https://modelcontextprotocol.io/introduction

For a refresher, here’s a collection of headlines and demos from Anthropic’s November announcements: https://ethanbholland.com/2024/11/29/anthropic-ai-news-week-ending-11-29-2024/

As luck would have it, a new course on using Anthropic’s system came out this week:

https://x.com/DeepLearningAI/status/1882103472146862098

OpenAI Unveils Operator: AI Assistant That Browses the Web Like a Human

OpenAI has launched Operator, an AI agent that is basically a quick multimodality demonstration: it “sees” browser elements, clicks buttons, and mimics the way people browse the web. Operator can navigate web browsers to complete tasks (tenuously, under supervision) like ordering groceries, booking travel, and filling out forms. Powered by OpenAI's new Computer-Using Agent (CUA) model, which builds on GPT-4o's vision capabilities, Operator can see and interact with websites like a person: clicking, typing, and scrolling through web pages. It is initially available only to Pro users in the U.S. ($200/month), and OpenAI is partnering with major companies like DoorDash, Instacart, and Uber. While the system can handle many tasks independently, it's designed to hand control back to users when encountering sensitive actions like payments or login credentials.

Openai | SullyOmarr

My take on this is that Operator and other tools are softening the beaches for the public to start to grasp that AI can see what it’s doing. It’s no longer a chat bot. It’s a “see and hear and do” bot. Long term, there won’t be a need to use the web like people (see Anthropic’s Model Context Protocol).

It’s worth noting that NVIDIA is making humanoid robots their priority because the world is designed for human use cases (driving, sitting, standing, using hands). OpenAI’s computer use launch takes advantage of the web’s design for people to see and click on. I find that fascinating.

3: DeepSeek

The biggest story of the week was DeepSeek-R1. There was a clear trajectory for DeepSeek over the past few months, if you look back at the timeline of headlines leading up to both DeepSeek and Stargate: https://ethanbholland.com/2025/01/22/48-ai-headlines-leading-up-to-stargate/

Chinese AI Startup DeepSeek Challenges OpenAI with Powerful Open-Source Model

DeepSeek released an open-source AI model that matches the performance of OpenAI's latest systems at a fraction of the cost, marking a significant shift in AI accessibility (to put it mildly). The model, DeepSeek-R1, was developed using training methods that worked around U.S. export restrictions on advanced chips. DeepSeek reports that training its base model, DeepSeek-V3, cost about $5.6 million, compared with the hundreds of millions competitors are believed to spend.

Key developments that make this significant:

  • The model is fully open-source under an MIT license, allowing anyone to use, modify, or commercialize it
  • Pricing is dramatically lower than closed competitors: just $2.19 per million output tokens (compared to $20+ for GPT-4!!!)
  • Performance matches OpenAI's models on complex tasks like mathematics and coding
  • Uses a clever architecture that needs only 37B active parameters while having access to 671B total parameters
  • Trained largely through reinforcement learning (the R1-Zero variant used pure RL with no supervised fine-tuning), challenging conventional wisdom about AI training
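To put the pricing and parameter numbers in the list above side by side, here is a minimal arithmetic sketch. It uses only the figures quoted in the list; the "$20+" closed-model price is treated as a lower bound, and the function names and the 50-million-token workload are hypothetical choices for illustration.

```python
# Figures taken from the list above.
R1_OUTPUT_PRICE = 2.19       # USD per million output tokens
CLOSED_OUTPUT_PRICE = 20.0   # USD per million output tokens ("$20+", used as a lower bound)
TOTAL_PARAMS_B = 671         # total parameters, in billions
ACTIVE_PARAMS_B = 37         # parameters active per token, in billions


def output_cost(tokens, price_per_million):
    """Return the USD cost of generating `tokens` output tokens."""
    return tokens / 1_000_000 * price_per_million


def price_ratio(expensive, cheap):
    """Return how many times more expensive one per-token price is than another."""
    return expensive / cheap


def active_fraction(active_b, total_b):
    """Return the share of parameters used for each token in a mixture-of-experts model."""
    return active_b / total_b


if __name__ == "__main__":
    tokens = 50_000_000  # hypothetical monthly output volume
    print(f"R1 cost:     ${output_cost(tokens, R1_OUTPUT_PRICE):,.2f}")
    print(f"Closed cost: ${output_cost(tokens, CLOSED_OUTPUT_PRICE):,.2f} or more")
    print(f"Price ratio: at least {price_ratio(CLOSED_OUTPUT_PRICE, R1_OUTPUT_PRICE):.1f}x")
    print(f"Active parameters: {active_fraction(ACTIVE_PARAMS_B, TOTAL_PARAMS_B):.1%} of total")
```

Because the closed-model price is only a lower bound, the ratio printed here is a floor rather than an exact comparison. The active-parameter share shows why a mixture-of-experts model can be much cheaper to run than its total size suggests.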

More than anything, DeepSeek suggests that open-source AI trails proprietary models by only about six months, raising questions about the sustainability of high-priced, closed AI systems. “Companies charging premium prices for closed models may need to rethink their strategy,” is one subtle way to say it.

Deepseek_ai | ollama | DeepLearningAI

Top DeepSeek Reactions Worth Reading

DeepSeek's open frontier significance: "The release of DeepSeek-R1 demonstrates that, for better or worse, any attempt to restrict access to AI by governments is unlikely to work."

Link: https://x.com/emollick/status/1881405036926001580

Cipher text challenge: "Deepseek R1 thinks for around 75 seconds and successfully solves this cipher text problem from openai's o1 blog post."

Link: https://x.com/mrsiipa/status/1881330071874813963

"I asked #R1 to visually explain to me the Pythagorean theorem. This was done in one shot with no errors in less than 30 seconds. Wrap it up, its over: #DeepSeek #R1"

https://x.com/christiancooper/status/1881335734256492605

Raw chain of thought: "The raw chain of thought from DeepSeek is fascinating, really reads like a human thinking out loud."

Link: https://x.com/emollick/status/1881423029160575474

"No matter how much you fight it, I find that the visible chain-of-thought from DeepSeek makes it nearly impossible to avoid anthropomorphizing the thing. The visible first-person "thinking" makes you feel like you are reading a diary of a somewhat tortured soul who wants to help…"

https://x.com/emollick/status/1881904723026210985

Math task nailed: "DeepSeek R1 Distill Qwen 7B (in 4-bit) nailed the first hard math question I asked it. Thought for ~3200 tokens in about 35 seconds on M4 Max."

Link: https://x.com/awnihannun/status/1881386796266946743

The Rest of the Summaries

Goldman Sachs' New AI Helper Could Replace Human Tasks for 10,000 Bankers

Goldman Sachs is rolling out a new AI assistant to 10,000 employees, with plans to expand it company-wide in 2025. The tool, called GS AI assistant, helps with basic tasks like email writing and code translation, but aims to eventually function like a seasoned Goldman employee. Built using technology from OpenAI, Google, and Meta, the assistant represents a broader trend of major banks embracing AI: JPMorgan and Morgan Stanley have launched similar tools, reaching over 240,000 employees combined. While some experts predict AI could eliminate up to 200,000 banking jobs in the next few years, Goldman's CIO Marco Argenti emphasizes that human workers will remain crucial in training and directing these AI systems.

CNBC

AI Leaders Warn of Rapid Intelligence Advances, Point to Surprising Breakthroughs

Top AI executives and researchers are signaling that artificial intelligence may be advancing faster than expected. Anthropic's CEO suggests AI systems could match or exceed human capabilities as soon as 2027, while unverified leaked test results suggest an OpenAI model is solving complex problems years ahead of previous estimates. Industry experts emphasize that while these predictions aren't certain, the public and policymakers should take the possibility of rapid AI advancement seriously.

Anthropic CEO Says AI Could Surpass Human Intelligence by 2027

https://www.wsj.com/livecoverage/stock-market-today-dow-sp500-nasdaq-live-01-21-2025/card/anthropic-ceo-says-ai-could-surpass-human-intelligence-by-2027-9tka9tjLKLalkXX8IgKA

"This prediction (AGI within next couple years) is a common timeline for insiders. There are reasons to not believe them, but I think people are not taking the possibility seriously enough that they may be directionality correct."

https://x.com/emollick/status/1881779923289072060

"leaked benchmark: o3 pro solved problems we thought were 5 years away. sam's team is trying to figure out how it did it. something unprecedented is happening."

https://x.com/iruletheworldmo/status/1880760849259999363

AI Visuals and Charts: Week Ending 01/24/2025

These Robot Demonstrations Will Freak You Out

MUST SEE VIDEO: "Hottest on the Ice | #DEEPRobotics #Lynx Snow Parkour, Stream Crossing #robotdog #robotics #robots #ai #tech"

https://x.com/DeepRobotics_CN/status/1882022829727859113

"Let's reverse engineer this demo. You need 3 things: (1) robust hardware and motor designs that treat simulation as first-class citizen; (2) a human motion capture ("mocap") dataset, such as those for film and gaming characters; (3) massively parallel RL training in…"

https://x.com/DrJimFan/status/1879922307923411081

Google VEO Text-To-Video Examples

"How does veo 2 pull off three nervous women holding knives on the back of a giant caterpillar running through a deserted city looking determined as a tank rolls towards them? If I imagine a thing, I can generate something close. It isn't a replacement for film, it is a new thing…"

https://x.com/emollick/status/1881229882598011029

"The new ability of AI video creators to add real people and products to scenes with just an image is likely to increase the utility (& certainly misuse) of AI video. Here I made Shakespeare at a cafe and the Girl with the Pearl Earring piloting a mech (just as Vermeer intended)"

https://x.com/emollick/status/1882117645077745816

Other Visuals Worth Seeing This Week - Don’t quit now!

EMO2: End-Effector Guided Audio-Driven Avatar Video Generation https://humanaigc.github.io/emote-portrait-alive-2/

"This is creepy.. It’s an AI tool called GeoSpy that can geolocate photos based on features in the image"

https://x.com/Mr_AllenT/status/1881695789216612663

"I am training a Diffusion Feature Extractor with lpips outputs as targets now. It is still cooking, but I pulled an early version out and finetuned Flex.1-alpha for ~2k steps with it. It is really cleaning up the features, especially text. I am super excited about this."

https://x.com/ostrisai/status/1882447889882034629

"The most hilarious application of perception AI has to be creating a Nintendo Wii Tennis rendition of a tennis game to bypass needing streaming rights lol"

https://x.com/bilawalsidhu/status/1880110288114184407

Top 63 Links of The Week - Organized by Category

Agents and Copilots

Agents Webinar Jan 30 Kore.ai - AI for Process

https://info.kore.ai/ai-for-process-virtual-events-2025-tldr

"Most advanced Agentic Researcher by Google. It can draft a plan, search the web, analyze results, and create a well-researched report in under 2 minutes. It's a team of AI Agents that works like a human researcher."

https://x.com/unwind_ai_/status/1879004517955776652

"Introducing Perplexity Assistant. Assistant uses reasoning, search, and apps to help with daily tasks ranging from simple questions to multi-app actions. You can book dinner, find a forgotten song, call a ride, draft emails, set reminders, and more. Available on Play Store."

https://x.com/perplexity_ai/status/1882466239123255686

"What does Caterpillar construction equipment have to do with AI agents? More than you'd think! @mmitchell_ai explains how the team defined agent capabilities."

https://x.com/fdaudens/status/1880360472630702317

Ph.D.-level AI super-agent breakthrough expected very soon

https://www.axios.com/2025/01/19/ai-superagent-openai-meta

Anthropic

Introducing Citations on the Anthropic API \ Anthropic

https://www.anthropic.com/news/introducing-citations-api

"We've rolled out Citations in the Anthropic API. Citations allows Claude to ground its answers in user-provided information and provide precise references to the sentences and passages used in its responses. Here's how it works:"

https://x.com/alexalbert__/status/1882481265414377919

"Our first short course with @AnthropicAI! Building Towards Computer Use with Anthropic. This teaches you to build an LLM-based agent that uses a computer interface by generating mouse clicks and keystrokes. Computer Use is an important, emerging capability for LLMs that will let…"

https://x.com/AndrewYNg/status/1882125891821822398

Augmented and Virtual Reality (AR/VR)

"Physical AI's progress depends on the development of World Foundation Models (WFMs) – AI systems that simulate real-world environments from text, image, or video inputs. Just two weeks ago, @NVIDIA launched and open-sourced Cosmos WFMs platform. Here's how it works. The…"

https://x.com/TheTuringPost/status/1882579105448882440

"Introducing #NVIDIACosmos, the world foundation model platform built to advance physical #AI. Learn how, through integrations with @NVIDIAOmniverse, developers can create physics-based, geospatially accurate scenarios. Watch the #CES2025 demo."

https://x.com/nvidia/status/1880034245466259739

"NVIDIA’s Jensen Huang has declared “Physical AI” the next big revolution. What is Physical AI? Think robotics, AR glasses, planetary-scale 3D simulations, and beyond — an entirely new wave of tech that fuses digital intelligence with the real world. Let's break down NVIDIA’s…"

https://x.com/bilawalsidhu/status/1880303625290842511

Business and Enterprise

Mira Murati’s AI Startup Makes First Hires, Including Former OpenAI Executive | WIRED

https://www.wired.com/story/mira-murati-startup-hire-staff/

Chips, Hardware, and Infrastructure

"NVIDIA’s Jensen Huang has declared “Physical AI” the next big revolution. What is Physical AI? Think robotics, AR glasses, planetary-scale 3D simulations, and beyond — an entirely new wave of tech that fuses digital intelligence with the real world. Let's break down NVIDIA’s…"

https://x.com/bilawalsidhu/status/1880303625290842511

"NEW VIDEO: Unpacking NVIDIA’s vision for Physical AI — where robotics, AR/VR, and real-world data converge into a $100T opportunity. From preventing disasters to reimagining cities, here’s why it matters (Link in comment below)."

https://x.com/bilawalsidhu/status/1879921654874136708

Ethics/Legal/Security

"The thing about open models is that they can, as far as we know, always be jailbroken in a way that gets around their guardrails, as once they are in the wild there are lots of techniques. This applies equally to political censorship as it does to preventing harmful use of AI."

https://x.com/emollick/status/1881577479460274631

Sam Altman on X: "thank you to the external safety researchers who tested o3-mini. we have now finalized a version and are beginning the release process; planning to ship in ~a couple of weeks. also, we heard the feedback: will launch api and chatgpt at the same time! (it's very good.)" / X - https://x.com/sama/status/1880356297985638649

"Geoffrey Hinton warns about the dangers of releasing AI Model weights: releasing the weights of large AI models is dangerous, similar to making fissile material available for making bombs. once these models are released, bad actors can use them for harmful purposes but it's too…"

https://x.com/slow_developer/status/1880627011376312732

Google

"Most advanced Agentic Researcher by Google. It can draft a plan, search the web, analyze results, and create a well-researched report in under 2 minutes. It's a team of AI Agents that works like a human researcher."

https://x.com/unwind_ai_/status/1879004517955776652

"We are rolling out a new Gemini 2.0 Flash Thinking update: - Exp-01-21 variant in AI Studio and API for free - 1 million token context window - Native code execution support - Longer output token generation - Less frequent model contradictions Try it…"

https://x.com/OfficialLoganK/status/1881844578069999809

"The Graduate-Level Google-Proof Q&A test (GPQA) is a series of multiple-choice problems that internet access doesn't help. PhDs with access to the internet get 34% right on this test outside their specialty, 81% inside their specialty. I matched model release dates to scores. 1/"

https://x.com/emollick/status/1880041714683113754

"Breaking news from Text-to-Image Arena! @GoogleDeepMind’s Imagen 3 debuts at #1, surpassing Recraft-v3 with a remarkable +70-point lead! Congrats to the Google Imagen team for setting a new bar! Try the best text2image at LMArena and cast your vote! More analysis…"

https://x.com/lmarena_ai/status/1882164189739073990

Imagery

"Breaking news from Text-to-Image Arena! @GoogleDeepMind’s Imagen 3 debuts at #1, surpassing Recraft-v3 with a remarkable +70-point lead! Congrats to the Google Imagen team for setting a new bar! Try the best text2image at LMArena and cast your vote! More analysis…"

https://x.com/lmarena_ai/status/1882164189739073990

Locally Run Models

Why o3-mini had to be free: the coming DeepSeek R1, 2.0 Flash, and Sky-T1 Price War

https://www.latent.space/p/reasoning-price-war

Multimodality

"Alibaba presents: VideoLLaMA 3: Frontier Multimodal Foundation Models for Image and Video Understanding. Open-sources VideoLLaMA 3, the SotA open-source model on both image and video understanding benchmarks"

https://x.com/arankomatsuzaki/status/1882270342649126947

[2501.10098v1] landmarker: a Toolkit for Anatomical Landmark Localization in 2D/3D Images

https://arxiv.org/abs/2501.10098v1

OpenAI

Sam Altman on X: "thank you to the external safety researchers who tested o3-mini. we have now finalized a version and are beginning the release process; planning to ship in ~a couple of weeks. also, we heard the feedback: will launch api and chatgpt at the same time! (it's very good.)" / X - https://x.com/sama/status/1880356297985638649

Open Source/DeepSeek

DeepSeek on X: "DeepSeek-R1 is here! Performance on par with OpenAI-o1. Fully open-source model & technical report. MIT licensed: Distill & commercialize freely! Website & API are live now! Try DeepThink at https://t.co/v1TFy7LHNy today! 1/n https://t.co/7BlpWAPu6y" / X - https://x.com/deepseek_ai/status/1881318130334814301

DeepSeek R1's recipe: DeepSeek R1's recipe to replicate o1 and the future of reasoning LMs

Link: https://www.interconnects.ai/p/deepseek-r1-recipe-for-o1

DeepSeek combines RL with multi-stage training: "Reinforcement Learning is all you need! @deepseek_ai R1, an open model that rivals @OpenAI o1 and other models on complex reasoning tasks, just got released."

Link: https://x.com/_philschmid/status/1881420703721009192

PSA from Mark Lord: "It takes <2 minutes to set up R1 as a free+offline coding assistant. Big shoutout to @lmstudio and @continuedev!"

Link: https://x.com/priontific/status/1881668130470285379

License update: "License Update! DeepSeek-R1 is now MIT licensed for clear open access. Open for the community to leverage model weights & outputs."

Link: https://x.com/deepseek_ai/status/1881318138937233664

Training pipeline visualization: "Here's my attempt at visualizing the training pipeline for DeepSeek-R1(-Zero) and the distillation to smaller models."

Link: https://x.com/SirrahChan/status/1881488738473357753

DeepSeek on HuggingChat: "DeepSeek R1 has landed on HuggingChat!"

Link: https://x.com/fdaudens/status/1881737288066961567

Most researchers are shocked: "Most AI researchers I talk to have been a bit shocked by DeepSeek-R1 and its performance."

Link: https://x.com/AlexGDimakis/status/1881511481164079507

Epoch AI Article: How has DeepSeek improved the Transformer architecture?

Link: https://epoch.ai/gradient-updates/how-has-deepseek-improved-the-transformer-architecture

"No matter how much you fight it, I find that the visible chain-of-thought from DeepSeek makes it nearly impossible to avoid anthropomorphizing the thing. The visible first-person "thinking" makes you feel like you are reading a diary of a somewhat tortured soul who wants to help…"

https://x.com/emollick/status/1881904723026210985

"The release of DeepSeek-R1 demonstrates that, for better or worse, any attempt to restrict access to AI by governments is unlikely to work. You can get an open frontier model on a USB stick, and the methods outlined by DeepSeek suggest pathways forward for other open models, too." / X

https://x.com/emollick/status/1881405036926001580

"I asked #R1 to visually explain to me the Pythagorean theorem. This was done in one shot with no errors in less than 30 seconds. Wrap it up, its over: #DeepSeek #R1"

https://x.com/christiancooper/status/1881335734256492605

"DeepSeek is a side project"

https://x.com/hardmaru/status/1882698763988545808

"DeepSeek's first-generation reasoning models are achieving performance comparable to OpenAI's o1 across math, code, and reasoning tasks! Give it a try! 7B distilled: ollama run deepseek-r1:7b More distilled sizes are available."

https://x.com/ollama/status/1881427522002506009

"That a second paper dropped with tons of RL flywheel secrets and multimodal o1-style reasoning is not on my bingo card today. Kimi's (another startup) and DeepSeek's papers remarkably converged on similar findings: > No need for complex tree search like MCTS. Just linearize…"

https://x.com/DrJimFan/status/1881382618627019050

"DeepSeek-V3, the company's latest open LLM, surpasses Llama 3.1 405B and GPT-4o on key benchmarks, especially in coding and math tasks. Using a mixture-of-experts architecture with 671 billion parameters, only 37 billion are active at once, DeepSeek V3 was trained at a low cost" / X

https://x.com/DeepLearningAI/status/1880087643964199260

"Announcing @MistralAI new model: Codestral 25.01 - new SOTA coding model, #1 on LMSYS! - Lightweight, fast, and proficient in over 80 programming languages, - Optimized for low-latency, high-frequency usecases - 2x faster than the previous version - Supports tasks such as…"

https://x.com/sophiamyang/status/1878902888434479204

"Introducing Kimi k1.5 --- an o1-level multi-modal model - Sota short-CoT performance, outperforming GPT-4o and Claude Sonnet 3.5 on AIME, MATH-500, LiveCodeBench by a large margin (up to +550%) - Long-CoT performance matches o1 across multiple modalities (MathVista,…"

https://x.com/Kimi_ai_/status/1881332472748851259

Buzzy French AI startup Mistral isn't for sale and plans to IPO, its CEO says

https://finance.yahoo.com/news/buzzy-french-ai-startup-mistral-133915078.html

Perplexity

andy chung on X: "Today I’m excited to share that @read_cv is joining the team at @perplexity_ai in their mission to make the world's knowledge more accessible to everyone. This is incredibly bittersweet for us, as the start of this new chapter will mark the end of our time with @read_cv. It has https://t.co/6CUinOEGsi" / X - https://x.com/_andychung/status/1880332676013650006

Publishing

"This essay from John Micklethwait is one of the most thoughtful texts I've read recently about the future of journalism. Nuanced, grounded in real newsroom experience."

https://x.com/fdaudens/status/1879732059272331458

Robotics and Embodiment

Watch Nvidia’s Huang Sees AI Robots Boosting Manufacturing - Bloomberg

https://www.bloomberg.com/news/videos/2025-01-07/nvidia-s-huang-sees-ai-robots-boosting-manufacturing

"Physical AI's progress depends on the development of World Foundation Models (WFMs) – AI systems that simulate real-world environments from text, image, or video inputs. Just two weeks ago, @NVIDIA launched and open-sourced Cosmos WFMs platform. Here's how it works. The…"

https://x.com/TheTuringPost/status/1882579105448882440

"Introducing #NVIDIACosmos, the world foundation model platform built to advance physical #AI. Learn how, through integrations with @NVIDIAOmniverse, developers can create physics-based, geospatially accurate scenarios. Watch the #CES2025 demo."

https://x.com/nvidia/status/1880034245466259739

"NVIDIA’s Jensen Huang has declared “Physical AI” the next big revolution. What is Physical AI? Think robotics, AR glasses, planetary-scale 3D simulations, and beyond — an entirely new wave of tech that fuses digital intelligence with the real world. Let's break down NVIDIA’s…"

https://x.com/bilawalsidhu/status/1880303625290842511

"Let's reverse engineer this demo. You need 3 things: (1) robust hardware and motor designs that treat simulation as first-class citizen; (2) a human motion capture ("mocap") dataset, such as those for film and gaming characters; (3) massively parallel RL training in…"

https://x.com/DrJimFan/status/1879922307923411081

Google is building a ‘world modeling’ AI team for games and robots - The Verge

https://www.theverge.com/2025/1/7/24338053/google-deepmind-world-modeling-ai-team-gaming-robot-training

"Projects like OpenAI’s Operator are to the digital world as Humanoid robots are to the physical world. One general setting (monitor keyboard and mouse, or human body) that can in principle gradually perform arbitrarily general tasks, via an I/O interface originally designed for" / X

https://x.com/karpathy/status/1882544526033924438

Unitree on X: "Unitree G1 Bionic: Agile Upgrade. Unitree rolls out frequent updates nearly every month. This time, we present to you the smoothest walking and humanoid running in the world. We hope you like it. #Unitree #AGI #EmbodiedAI #AI #Humanoid #Bipedal #WorldModel https://t.co/uM0DWJG5Ii" / X - https://x.com/UnitreeRobotics/status/1879864345615814923

Science and Medicine

"Your brain's next 5 seconds, predicted by AI. Transformer predicts brain activity patterns 5 seconds into future using just 21 seconds of fMRI data. Achieves 0.997 correlation using modified time-series Transformer architecture. Original Problem: Predicting future…"

https://x.com/rohanpaul_ai/status/1880184389218496770

Video News

"Video Depth Anything is out! Real-time inference for arbitrarily long videos with temporal and spatial consistency. Built on the excellent Depth Anything v2 (for images), by "simply" replacing the head and adjusting the loss for temporal consistency. Videos from the project…"

https://x.com/pcuenq/status/1881978412270829728

Luma Ray2

https://lumalabs.ai/ray

"Luma Labs released Ray2, its next-gen AI for generating 10s videos with advanced motion quality and physics realism. Ray2 understands complex object interactions, including water physics. Now, the question is which lab will crack longer-length outputs."

https://x.com/adcock_brett/status/1881024718712480163
