OpenAI's o1 Outperforms Other LLMs By "Stopping To Think," & More


Want more research from the ARK Team? Have feedback on our publications? Click here to help inform our content creation.


1. OpenAI's o1 Outperforms Other LLMs By "Stopping To Think"

By: Frank Downing

The output of existing large language models (LLMs) is likely a “gut” response from a disembodied symbolic system that blurts out the first content that seems to answer the question. Unfortunately, whether human or LLM, the first answer to a question might not be the most enlightened. A thoroughly reasoned answer can take both thought and time.

OpenAI's latest model, pithily named “o1,” seeks to address that problem with a new approach to "reasoning" that involves methods similar to chain-of-thought prompting, the process of forcing a model to decompose a problem and then work through it step by step.
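As a rough illustration of chain-of-thought prompting, the sketch below contrasts a direct prompt with one that asks a model to decompose the problem first. It only builds and parses text and calls no model. The function names and the prompt wording are hypothetical, chosen for this example. It does not describe how o1 works internally, which OpenAI has not disclosed in detail.

```python
def build_direct_prompt(question: str) -> str:
    """Ask for the answer only: the "gut response" style of prompting."""
    return f"Question: {question}\nAnswer:"


def build_chain_of_thought_prompt(question: str) -> str:
    """Ask the model to decompose the problem and reason step by step
    before committing to a final answer (hypothetical wording)."""
    return (
        f"Question: {question}\n"
        "First, break the problem into smaller steps.\n"
        "Then work through each step in order, showing your reasoning.\n"
        "Finally, state the result on a new line starting with 'Answer:'.\n"
        "Reasoning:"
    )


def extract_final_answer(model_output: str) -> str:
    """Return the text after the last 'Answer:' marker, or the whole
    output (stripped) if the marker is missing."""
    _, found, tail = model_output.rpartition("Answer:")
    return tail.strip() if found else model_output.strip()
```

With prompting of this kind the user supplies the step-by-step instruction, whereas o1 is described as doing comparable work on its own before it responds.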

Given a prompt, o1 will pause anywhere from a few seconds to a few minutes to explore and iterate on possible paths to a “thoughtful” response. OpenAI's benchmark results already demonstrate that o1’s new reasoning capability improves performance significantly relative to prior models, particularly in coding, math, and scientific tasks.

Importantly, the quality of o1’s answers tends to improve with the time it spends thinking during inference, as shown on the right below. So-called "test-time" scaling resembles the typical scaling laws of model training, under which model performance increases predictably as training compute increases.

Note: The x-axes above are indicative of compute-time spent training (left chart) and running (right chart) the model, on a log scale. OpenAI did not provide absolute units, possibly to prevent the accidental release of proprietary information. Regardless of the specific time measurement, the charts are meant to illustrate that longer inference/training time has resulted in consistently better performance. Source: OpenAI 2024.[1] For informational purposes only and should not be considered investment advice or a recommendation to buy, sell, or hold any particular security. Past performance is not indicative of future results.
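One way to read charts with a log-scale x-axis is as a log-linear relationship: each tenfold increase in compute adds a roughly constant number of accuracy points. The sketch below fits such a line by ordinary least squares. The data points are made up for illustration and are not OpenAI's results, and the function names are hypothetical.

```python
import math


def fit_log_linear(compute, accuracy):
    """Least-squares fit of accuracy ~ a + b * log10(compute).

    Returns (a, b), where b is the accuracy gained per 10x compute.
    """
    xs = [math.log10(c) for c in compute]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(accuracy) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, accuracy))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


def predict(a, b, compute):
    """Extrapolate the fitted line to a new compute budget."""
    return a + b * math.log10(compute)


# Illustrative, made-up points: relative compute vs. accuracy (%).
compute = [1, 10, 100, 1000]
accuracy = [20.0, 35.0, 50.0, 65.0]

a, b = fit_log_linear(compute, accuracy)
print(f"accuracy gained per 10x compute: {b:.1f} points")
```

A straight line on a log axis implies diminishing returns per unit of compute, and because accuracy cannot exceed 100%, such a trend cannot hold indefinitely.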

In addition to its intriguing performance, o1 could have important business model ramifications. OpenAI’s new approach should incentivize heavy use of its inference application programming interfaces (APIs), with performance improvements depending more on COGS (cost of goods sold)—an operating expense—than the capital spending associated with R&D (research and development).

User interactions with o1's reasoning capabilities, in turn, could create a valuable dataset of thinking patterns to help OpenAI improve all its models continuously. Citing both competitive and safety concerns, the company is hiding the text generated during the reasoning process from the end user. Clearly, OpenAI’s management team recognizes the high value of that data.


2. Waymo And Uber Are Expanding Their Partnership Into Austin And Atlanta

By: Daniel Maguire, ACA

Last week, Waymo and Uber announced that they will expand their partnership to Austin and Atlanta in early 2025.[2] Unlike their previous, non-exclusive partnership in Phoenix, Waymo vehicles in Austin and Atlanta will be available exclusively on the Uber app.[3] Notably, Atlanta is the fifth city in which Waymo will operate, marking rapid expansion since testing began there in April.[4] Uber also will provide fleet management services, including vehicle cleaning, repair, and other depot operations, services it had not offered previously.

In addition to other self-driving partnerships and investments,[5] Uber’s increased commitment to its partnership with Waymo signals its renewed ambition to participate in an electric-robotaxi future, a marked shift in strategy since the sale of its self-driving unit to Aurora Innovation in 2020.[6]

We monitor the robotaxi opportunity continuously and look forward to future announcements, particularly those during Tesla’s robotaxi launch event in October.


3. Will olivia, Tempus AI’s New App, Empower Patients In Their Healthcare Journeys?

By: Nemo Marjanovic, PhD

Last week, Tempus AI launched[7] its new app, olivia, to address patients’ inability to manage and leverage their own health data. The current healthcare system is fragmented, especially for patients with chronic conditions who must juggle medical records from multiple providers, specialties, and systems. The fragmentation not only complicates patient care but also obstructs critical insights that could improve health outcomes. Tempus AI’s olivia cuts through the complexity by aggregating health records, tracking symptoms, and preparing patients for doctor appointments.

Will addressing the bottlenecks that burden patients pose challenges to the current healthcare system? That question lies at the heart of olivia’s approach to transforming patient care. The disruptive force within olivia is generative AI. By leveraging AI’s capabilities, olivia moves beyond simple data aggregation by analyzing the data and providing personalized insights that invite users to ask real-time questions about their health. Allowing patients to understand their health holistically, without the need for deep medical expertise, could change the paradigm in healthcare.

We look forward to monitoring whether olivia’s cutting-edge AI solution will redefine how patients interact with the healthcare system. Will this technology overcome the systemic resistance and compel patients to take ownership of their healthcare data?



[1] OpenAI. 2024. “Learning to Reason with LLMs.”

[2] Waymo. 2024. “Waymo and Uber Expand Partnership to Bring Autonomous Ride-Hailing to Austin and Atlanta.” Waymo Waypoint Blog.

[3] Waymo. 2023. “The Waymo Driver: Now Available on Uber in Phoenix.” Waymo Waypoint Blog.

[4] Waymo. 2024. “Winter in Buffalo, Spring in DC, and Summer in Atlanta…” X.

[5] Cruise. 2024. “Uber and Cruise to Deploy Autonomous Vehicles on the Uber Platform.” See also Waymo. 2024. “Winter in Buffalo, Spring in DC, and Summer in Atlanta…” X. See also Carey, N. 2024. “Uber Invests in UK Startup Wayve to Speed Up Self-Driving Projects.” Reuters, via Yahoo! Finance.

[6] Uber Investor. 2020. “Aurora is acquiring Uber’s self-driving unit, Advanced Technologies Group, accelerating development of the Aurora Driver.”

[7] Tempus AI. 2024. “Tempus Launches Beta Version of olivia, its AI-enabled Personal Health Concierge App for Patients.”
