AI's Dawn of Reason

Singularity University

Futuremakers wanted.

Published: October 2, 2024

OpenAI pulled the veil back on its latest AI model this month. It had long been rumored the company was working on a secret initiative, first called Q* and then Project Strawberry internally, to improve AI's reasoning abilities. The new o1 release—which scraps the GPT naming scheme in a product reset—is said to deliver on that promise.

In a blog post, the company wrote that o1 makes strides in areas heavy in reasoning where previous models, including its own GPT-4, have struggled. This includes marked improvement on benchmarks—these are often human exams given to AI—measuring o1's ability to answer questions in math, science, and coding, some at a PhD level.

OpenAI achieved its breakthrough by combining reinforcement learning—an AI approach that’s yielded impressive results in game-playing—and chain-of-thought reasoning. The latter chops difficult problems into smaller, more manageable steps and follows them through to a solution. “Through reinforcement learning, o1 learns to hone its chain of thought and refine the strategies it uses,” OpenAI wrote in its blog post. “It learns to recognize and correct its mistakes. It learns to break down tricky steps into simpler ones. It learns to try a different approach when the current one isn’t working.”

The model ranked in the top 11 percent in competitive coding, scored well enough to qualify for the Math Olympiad, a math competition for high school students, and exceeded human PhDs in a benchmark measuring knowledge in advanced physics, biology, and chemistry. It also significantly outperformed GPT-4o in these areas. But notably, OpenAI wrote, it may not match its predecessor in tasks that are more strictly limited to language.

Exactly how much such benchmarks can tell us about AI’s abilities, beyond showing how models compare to each other and prior generations of AI, is hotly debated. Critics say they fall short in some areas, like the quality of the test itself or whether exact or similar questions, answers, and knowledge exist online and therefore in each model's training data. Further, if all models perform similarly on existing benchmarks, we'll need new ones. Fortunately, there are already efforts afoot to make harder, more illustrative AI tests.

Still, the ability to perform multi-step reasoning has long been a goal in the industry, and o1 appears to be progress. Google DeepMind is also going after AI that can reason. DeepMind’s AlphaGeometry mashed together a large language model and a symbolic model—a more traditional, hard-coded approach—to match top high schoolers at geometry. DeepMind’s CEO, Demis Hassabis, has also said they’re looking to use reinforcement learning, which is their “bread and butter,” to improve future models.

Crucially, o1 proves AI can progress without resorting to scaling, in which developers improve models by making them bigger. That said, this month also showed scaling will continue, as players moved to secure cash and energy.

OpenAI’s release of o1 alongside its advanced voice mode coincided with reports the company is raising new funds from investors at an eye-opening $150 billion valuation, nearly twice the company’s valuation around this time last year. Anthropic is also said to be in the midst of its own funding round with a potential valuation of $40 billion.

While both companies are bringing in revenue and generative AI’s user base is growing fast—OpenAI’s has doubled in the last year—it’s not enough to keep up with operations and the ballooning costs of training next-generation AI models. In addition to new funding rounds, a coalition including Microsoft, BlackRock, Global Infrastructure Partners, and MGX announced efforts to raise an astonishing $100 billion to build out AI infrastructure—$30 billion in private equity capital and the rest via debt financing.

Assuming investment continues at this pace, a report from Epoch AI explored whether scaling is technically even feasible. Will we be able to find enough of the primary inputs—power, chips, and data—to maintain AI scaling at the current rate? The report found that, yes, it is technically possible to scale models by at least 10,000x over OpenAI’s GPT-4 through 2030. The biggest sticking point: Powering the coming quantum leap in data centers.

It’s no surprise, then, that Sam Altman reportedly pitched the US government on plans to build several five-gigawatt data centers to be located around the country. Five gigawatts, Bloomberg writes, is like five nuclear plants powering three million homes. Speaking of which, Microsoft has also announced plans to reopen Pennsylvania’s Three Mile Island nuclear plant.

There’s clearly financial will to forge ahead for the time being. The level of investment reflects the size of the opportunity big tech believes is possible. As cash plowed into AI runs into the hundreds of billions, leaders think the return could be in the trillions.

Future investment will depend on how long scaling bears fruit. The next generation of models must show clear advances over this generation. It’s also possible new breakthroughs outside scaling, like o1, will show we can do more with less. For now, though, the path is laid out. Tech is going after big AI.

More News From the Future

Meta’s Orion AR glasses: A face computer you might actually wear.

AR Star. At its annual developer conference Meta showed off its new Orion AR glasses. The glasses include a display with a 70-degree field of view—one of the largest in the industry—and a wristband controller. They also pack sensors for eye- and hand-tracking, cameras, speakers, microphones, WiFi, and AI capabilities. All this is embedded in what looks like a beefy pair of black-rimmed glasses. They don’t quite match a standard pair, but they’re much closer than prior iterations of the technology. This does mean computing, power, and connectivity have to be offloaded to a phone-sized puck, however. Still, reviewers have been impressed. The device looks to be a big step toward true AR glasses.

Bigger ≠ better. Meta’s founder and CEO Mark Zuckerberg believes face computers will be the next great computer interface. This is why he renamed the company Meta, pumped up the metaverse during the pandemic, and has spent some $50 billion in the space in recent years. But stubborn technical challenges have held that vision back: Cost, quality, and form factor. Orion is particularly significant for its progress on the last of these. No one, not even Apple, has credibly moved beyond the “loaf of bread strapped to your face” phase. Orion’s glasses suggest future AR devices roughly the size, shape, and style of a pair of sunglasses might include a quality AR display, audio, mics, cameras, and AI connectivity for an all-in-one face device you might actually want to wear.

Proto-product. Let’s get to the caveats. Orion is not a rough prototype, but it’s not a consumer-ready device either. Remember that bit about cost? The glasses cost as much as $10,000 to produce. That’s nearly three times the Apple Vision Pro’s retail price, which is already way too expensive. The company may have to compromise on quality and capability to get the price down. When an Orion-inspired consumer product eventually launches, it will necessarily be different. In the meantime, Meta sees its AR glasses on a continuum. The Ray-Ban models already on sale boast cameras and AI. The next iteration might add a limited display suitable for text. The long game is something like Orion, or better. Meanwhile, others, including Apple and Snap, which released its own AR glasses this month, will try to beat Meta to the punch.

Startups are attracting billions of dollars to go after fusion power.

A startling trend is underway in the pursuit of fusion power: Once the sole province of multi-billion-dollar government research projects, the technology is now being pursued by startups. More powerful magnets, greater computing power, AI-controlled systems, better simulations, and the achievement of scientific breakeven at the US National Ignition Facility are all reasons investors have pumped over $7 billion into private fusion projects, according to TechCrunch. Five startups (Commonwealth Fusion Systems, General Fusion, Helion, TAE, and Zap Energy) have each raised at least $300 million.

Now operating 100,000 robotaxi rides a week, Waymo eyes expansion.

The number of weekly rides Waymo’s self-driving fleet completes has increased by a factor of 10 in the last year—and it's doubled since May alone. This is an impressive stat after years of overly optimistic predictions in the space. Now, the SF Chronicle reports, Waymo wants to expand further in the nearly 8-million-strong Bay Area. The company has approval to begin operations in 22 cities along the peninsula south of San Francisco. It also has plans to service the airport and move into San Jose and the East Bay. All this will be connected by highway travel, which Waymo began testing with employees in August. The company must still maintain the delicate balance between safety and growth—any misstep will be costly—but at least for now, they’ve managed to prove out and scale robotaxis far more than anyone else.

New tech based on fracking could grow geothermal power 100x.

Although our planet’s interior contains incredible amounts of heat, geothermal power is still niche, confined to areas where hot, permeable rock and water combine naturally. Enhanced geothermal systems aim to make the power source viable nearly anywhere in the world. The idea is to use technology developed for fracking to drill into hot rock, fracture it, pump in water, and then recover the heated water to turn turbines. It’s expensive, but companies are working the problem, and already, costs are coming down. The potential is huge: In just the US, EGS could theoretically expand geothermal power resources 100x, from 40 GW with conventional geothermal to 5,500 GW.

Upcoming Events

Apply for our November Executive Program

View the expert lineup for our November 10-14 Executive Program. Seats are filling quickly, so explore the program and start your application today.

Ready to apply? Futureproof yourself.

Thanks for reading. We hope you enjoyed this month's updates and found something to inspire you on your exponential journey. Join our global community of over 200,000 futuremakers: sign up to unlock early access to the Singularity Monthly newsletter, discover the most impactful technology breakthroughs, and get a glimpse of tomorrow, today.

See you next month!

The Singularity Team

Hrijul Dey