Emerging AI: Roundup for January and February 2025

For our first AI roundup for 2025, we take a look at the impact of AI in the workplace, what the various models can do now, new systems for research and reasoning, new benchmarks, and developments in the global AI race from January and February.

In a time when Chinese company DeepSeek is making waves for its innovative approach to AI—and America’s major players are doubling down on investments—it may be the applications of AI, rather than big new models, that garner the most attention.

Because, even with the status quo, we’ve still got a lot of figuring out to do, as Josh Woodward, head of Google Labs, told Wired’s Steven Levy in February:

“If you just hit pause on everything, we probably have five to 10 years' worth of capabilities we could turn into new products.”

For one example, we open with how AI technology is helping people with motor neuron diseases communicate.

As profiled in the MIT Technology Review, startup ElevenLabs has found a way to channel their work on AI audio solutions for film, TV, and podcasts into helping people who can no longer speak use their own voices again.

In partnership with nonprofits like Bridging Voice, they’ve made the technology available for free to people with conditions like ALS. And while technology to generate an audio “voice” through typing has existed for years, these prior, voice-banking systems required reading out some 1,500 phrases (often no longer possible by the time the need is discovered) and then produced only a jerky, unnatural version of the voice.

This new solution needs just one to 30 minutes of pre-recorded audio and provides far more lifelike output. It’s already enabled people like Jules Rodriguez to speak again, and in his own voice (he’s even used it to deliver stand-up comedy).

And while not flawless (his wife jokes their arguments are very slow-paced), it’s a powerful example of a less-publicized way AI is transforming lives for the better.

AI in Jobs and Employment Trends

Looking at the workplace during this roundup period, we see AI implementation continuing, but unevenly: flourishing at some companies in certain areas (like customer service, data entry, development, design, and administrative tasks) while trailing in others.

This accompanies similarly erratic shifts in the job market, with companies like Salesforce and Meta cutting some roles on the basis of AI use even as others are added. Salesforce, for instance, is both hiring salespeople to sell AI and encouraging laid-off employees to transfer to new roles internally.

The World Economic Forum (WEF) Future of Jobs Report 2025, published in January, shows that 41% of global companies plan to reduce their workforces because of AI, while two-thirds are actively hiring for other positions, especially in areas like AI and big data, networks and cybersecurity, and overall technological literacy. And 70% of companies surveyed are actively hiring specialists to design or customize their own AI implementations.

Overall, the report expects a net 78 million jobs to be created by 2030 (with 170 million created in total as 92 million are eliminated).

This accompanies research (such as from Microsoft and LinkedIn) showing that twice as many US workers are using AI on the job as did in 2023, and that more than half (52%) believe AI skills will help them at work.

This lines up with numbers from McKinsey, which found that workers are increasingly adopting AI and are often readier for it than their leadership expects.

Workplace AI Readiness

Perhaps more surprisingly, research published in the Journal of Marketing and profiled in The Conversation in January found that the less people knew about AI, the more receptive they were to its adoption. This phenomenon, which the researchers termed “lower literacy–higher receptivity,” holds up across groups, settings, and countries.

In nations with lower AI literacy, for example, people are far more enthusiastic about and willing to embrace adoption. The same holds for undergraduate students, who prove more likely to use AI in assignments the less they know about it.

While this matches data from some other technological revolutions, it may also be reflected in studies measuring co-intelligence, which find that experts typically gain less from AI use than less-skilled workers do, and are also less willing to accept AI conclusions.

Next-Gen AI Models and Innovations

Combinations, implementations, and reasoning models are the fastest-growing space right now, but before diving into that, we look over current AI model capabilities at the start of the new year.

AI Model Capabilities

During this period, Google committed another billion dollars to OpenAI rival Anthropic (bringing its total investment to $3 billion), but it was OpenAI, Oracle, and SoftBank that made news for their Project Stargate AI ambitions, a four-year, $500 billion infrastructure project that also includes partnerships with the US government, Nvidia, Arm, and Microsoft.

Setting a Higher Bar for Benchmarks

Days after OpenAI first previewed its o1 reasoning model, the nonprofit Center for AI Safety and startup Scale AI began amassing new questions from experts in an attempt to level up current benchmarking systems.

With most of the commonly used tests now passed with ease by top-level AI systems (which may even have been trained on the tests’ source material), they worked to create a new benchmark to replace others like the MMLU.

It’s called Humanity’s Last Exam and is built from challenging, multi-modal questions across a wide variety of subjects.

The goal is to target irregularities in AI models, which can now solve PhD-level problems and diagnose complex illnesses while also failing at simple arithmetic or hallucinating wildly.

This exam has now been released, and you can see results here.

Here’s how some of the top AI models are doing so far:

  • OpenAI o3 mini (high, not released): 14% accuracy

  • OpenAI o1: 8.8%

  • DeepSeek R1: 8.6%

  • Google Gemini Thinking: 7.2%

  • Claude 3.5 Sonnet: 4.8%

  • Grok 2: 3.9%

  • OpenAI GPT-4o: 3.1%

These stand in striking contrast to other benchmarks like the GPQA or MMLU, where models like o1 are testing in the 90th percentile.

Where Bigger Is Not Better

Last year the gains from increasingly large AI models appeared to hit a plateau. And while more power and scale still improve capacity and performance, we’re no longer seeing dramatic leaps like the move from GPT-3 to GPT-4.

As profiled by MIT Technology Review in its 10 Breakthrough Technologies 2025, small AI language models pose an alternative.

By enabling AI systems to do more with less power, models like OpenAI’s GPT-4o mini, Anthropic’s Claude 3 Haiku, Google’s Gemini Nano, and Microsoft’s Phi are catching up fast (especially for more limited use cases), with far fewer parameters.

We’ve written in prior installments about how small models are being used to help steer and correct larger models for uses like error-checking and fine-tuning, and DeepSeek applied a similar idea to achieve its impressive performance at drastically reduced training cost.

Their Mixture-of-Experts (MoE) architecture employs smaller sub-models as experts in particular domains, with a gating (router) network deciding which experts handle each input. The traditional approach, a single massive neural network with all of its parameters active at once and trained on everything, consumes vastly more power.

Smaller models can reduce operational demands, training costs, and exposure, while still performing very well with a much smaller number of parameters in more limited areas of expertise.
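
To make the MoE idea concrete, here is a minimal, illustrative sketch in PyTorch (our own toy example, not DeepSeek’s implementation): a small gating network scores a set of expert sub-networks for each token, and only the top-scoring few are run, so most parameters stay inactive on any given input.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyMoE(nn.Module):
    """Toy Mixture-of-Experts layer: route each token to its top-k experts."""
    def __init__(self, d_model=64, n_experts=8, top_k=2):
        super().__init__()
        self.experts = nn.ModuleList([
            nn.Sequential(nn.Linear(d_model, 4 * d_model), nn.GELU(),
                          nn.Linear(4 * d_model, d_model))
            for _ in range(n_experts)
        ])
        self.gate = nn.Linear(d_model, n_experts)  # router: scores each expert per token
        self.top_k = top_k

    def forward(self, x):                           # x: (tokens, d_model)
        scores = self.gate(x)                       # (tokens, n_experts)
        weights, idx = scores.topk(self.top_k, dim=-1)
        weights = F.softmax(weights, dim=-1)        # normalize over the chosen experts
        out = torch.zeros_like(x)
        for slot in range(self.top_k):              # run only the selected experts
            for e, expert in enumerate(self.experts):
                mask = idx[:, slot] == e            # tokens routed to expert e in this slot
                if mask.any():
                    out[mask] += weights[mask, slot].unsqueeze(-1) * expert(x[mask])
        return out

x = torch.randn(16, 64)            # 16 tokens with 64-dim embeddings
print(TinyMoE()(x).shape)          # torch.Size([16, 64])
```

Only top_k of the n_experts run for each token, which is how an MoE model can carry a very large total parameter count while keeping per-token compute (and with it, training cost) much lower, the same basic trade-off DeepSeek exploits at far larger scale.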

Rolling It All Together

In the same way, the big AI companies are seeking new behavior and broader functionality with their own compound AI systems.

Research capabilities have rolled out in the form of OpenAI’s and Google’s paid Deep Research offerings, while Perplexity has released its own for free (the company claims it has managed an impressive 21.1% on Humanity’s Last Exam).

There are also systems aimed at autonomy and combined behaviors, like OpenAI's Operator, which performs web-based tasks on its own. Operator uses OpenAI's Computer-Using Agent (CUA) model along with GPT-4o's vision capabilities and advanced reasoning to navigate complex websites, automate repetitive tasks, and interact with web elements, enabling it to fill out forms, book travel, and create multi-dimensional content.

Other examples include Google DeepMind's Project Mariner, which leverages Gemini 2.0 to interact with more complex web interfaces and automate tasks within the browser, and Anthropic’s Claude 3.5 computer-use capability (from late 2024), which lets the model operate a computer much as a person does.

This fusion of various AI models and technologies aims for more practical application, at a time when reports point to OpenAI combining its o-series reasoning models with GPT-4-level chatbot capability in the search for more dynamic intelligence. Some believe these GPT-4.5-era advancements have led to the company’s more aggressive timeline toward AGI.

And when all else fails, there are combinations like the one offered by search service Pearl, which promises to give you a human being if AI responses fail.

Updates on International Competition

We’ve recently written about the AI Paris Summit 2025 and changes in regulation as international competition heats up, so we won’t repeat that here.

Nevertheless, the open-source models released by Chinese AI company DeepSeek stunned the world with their capability, achieved despite fewer, less advanced chips and a fraction of the usual training cost, all at a time when US AI investments are surging.

Along with the MoE technique (see above), DeepSeek’s innovations included clever mathematical shortcuts (such as using lower numerical precision where it didn’t noticeably reduce accuracy) and highly efficient code, allowing them to get more from less.
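
As an illustration of the lower-precision idea, here is a minimal sketch using PyTorch’s built-in mixed-precision tools on a toy model (assuming a CUDA GPU; this is not a reproduction of DeepSeek’s custom FP8 training pipeline). The forward pass runs in 16-bit floating point, while a gradient scaler protects small gradients from underflow, so accuracy is largely preserved at a fraction of the memory and compute.

```python
import torch
import torch.nn as nn

# Toy model and optimizer; real training would use a transformer and a data loader.
model = nn.Sequential(nn.Linear(512, 512), nn.ReLU(), nn.Linear(512, 10)).cuda()
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3)
scaler = torch.cuda.amp.GradScaler()   # rescales the loss so FP16 gradients don't underflow

for step in range(100):
    x = torch.randn(32, 512, device="cuda")            # fake batch of 32 examples
    y = torch.randint(0, 10, (32,), device="cuda")     # fake labels

    with torch.autocast(device_type="cuda", dtype=torch.float16):
        loss = nn.functional.cross_entropy(model(x), y)  # matmuls run in FP16 here

    optimizer.zero_grad()
    scaler.scale(loss).backward()   # backward pass on the scaled loss
    scaler.step(optimizer)          # unscales gradients, then applies the FP32 update
    scaler.update()                 # adjusts the scale factor for the next step
```

The same principle, pushed further to 8-bit formats and paired with hand-tuned kernels, is part of how a lab can report training costs far below those of comparable full-precision runs.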

Their open-source models have inspired both smaller companies and nations that may be trailing in the AI race, though questions about data security, propaganda, and government access to data have already led to restrictions on use in several countries (South Korea, Australia, Italy, and Taiwan), as well as in US states like New York and Texas. And bipartisan legislation has been introduced in Congress to restrict the use of DeepSeek products on all government devices.

American plans to restrict certain countries’ access to top AI chips and models began under the prior administration and appear to be continuing, even as companies like Apple fight slumping sales in China and tariff concerns that could impact manufacturing.

Apple announced a partnership with Chinese company Alibaba in this period in a bid to get Apple Intelligence released in the region, since Chinese regulations prevent the use of its OpenAI-assisted technology domestically. The hope is that this Apple-Alibaba AI partnership will aid slumping iPhone sales at a time when the company’s business is doing better in other areas.

Companies like Microsoft and OpenAI are also courting sizable AI investments from nations like the United Arab Emirates and Saudi Arabia, arguing that this funding could end up supporting development in China if US companies are closed off from it.

PTP and the AI Workforce Transformation in 2025

We covered changes in the 2025 workforce above, including a surge in demand for AI talent at organizations across industries that are implementing their own AI solutions.

When you need experienced hands to pair with your own in-house talent and subject matter experts, it can be hard to find the right people in such a fast-moving market.

Consider PTP for your AI talent needs. With over 27 years of experience in tech recruiting, and as an early adopter of AI, we have our own best-in-class ML/AI talent pipeline for whatever your projects require.

In Conclusion

One smaller story from mid-February that may have large consequences for AI was a court win by Thomson Reuters over the now-defunct Ross Intelligence.

In a suit that echoes some of the many copyright battles being waged over AI training sources, permission, and compensation, the court found that “fair use,” or using copyrighted materials for such activities as education, transformation, or research, didn’t hold in this case for AI training.

While it remains to be seen if this will carry over to other cases, it nevertheless could feed a growing concern about the availability of sufficient data to continue training increasingly hungry AI models.

This concludes our coverage of breaking AI news from January and February 2025!

Expect our next bi-monthly update around the end of April, and if you need to catch up on any of our 2024 roundups, take a look below:

References

2025: The Year of the AI App, Wired

AI Chatbots Are Ready to Talk to Customers. Sort of., The Wall Street Journal

Motor neuron diseases took their voices. AI is bringing them back., MIT Technology Review

OpenAI looks across US for sites to build its Trump-backed Stargate AI data centers, AP

When A.I. Passes This Test, Look Out, The New York Times

Accelerating scientific breakthroughs with an AI co-scientist, Google Research

Small language models: 10 Breakthrough Technologies 2025, MIT Technology Review

DeepSeek-VL2: Mixture-of-Experts Vision-Language Models for Advanced Multimodal Understanding, arXiv:2412.10302 [cs.CV]

Knowing less about AI makes people more open to having it in their lives – new research, The Conversation

Apple's Alibaba AI deal is unlikely to be a silver bullet for its China woes, Business Insider

Thomson Reuters scores early win in AI copyright battles in the US, AP Business


- Doug McCord

(Staff Writer)

Check out other articles from PTP on AI

Get the latest updates on recruiting trends, the job market, and IT, along with expert advice on hiring and job seeking, at The PTP Report.

