Summary of what we learned during the AMA hour with the OpenAI o1 team on 2024-09-13
Model Names and Reasoning Paradigm
- OpenAI o1 is named to represent a new level of AI capability; the counter is reset to 1
- "Preview" indicates it's an early version of the full model
- "Mini" means it's a smaller version of the o1 model, optimized for speed
- The "o" stands for OpenAI
- o1 is not a "system"; it's a model trained to generate long chains of thought before returning a final answer
- The icon of o1 is metaphorically an alien of extraordinary ability
Size and Performance of o1 Models
- o1-mini is much smaller and faster than o1-preview, and is therefore planned to be offered to free users in the future
- o1-preview is an early checkpoint of the o1 model, neither bigger nor smaller
- o1-mini performs better in STEM tasks, but has limited world knowledge
- o1-mini excels at some tasks, especially in code-related tasks, compared to o1-preview
- Input tokens for o1 are calculated the same way as GPT-4o, using the same tokenizer
- o1-mini can explore more thought chains compared to o1-preview
Input Token Context and Model Capabilities
- Larger input contexts are coming soon for o1 models
- o1 models can handle longer, more open-ended tasks with less need for chunking input compared to GPT-4o
- o1 can generate long chains of thought before providing an answer, unlike previous models
- There is no current way to pause inference during CoT to add more context, but this is being explored for future models
Tools, Functionality, and Upcoming Features
- o1-preview doesn't use tools yet, but support for function calling, code interpreter, and browsing is planned
- Tool support, structured outputs, and system prompts will be added in future updates
- Users might eventually get control over thinking time and token limits in future versions
- Plans are underway to enable streaming, and exposing reasoning progress in the API is being considered
- Multimodal capabilities are built into o1, aiming for state-of-the-art performance in tasks like MMMU
CoT (Chain of Thought) Reasoning
- o1 generates hidden chains of thought during reasoning
- There are no plans to reveal raw CoT tokens to API users or in ChatGPT
- CoT tokens are summarized, but there is no guarantee of faithfulness to the actual reasoning
- Instructions in prompts can influence how the model thinks about a problem
- Reinforcement learning (RL) is used to improve CoT in o1, and GPT-4o cannot match its CoT performance through prompting alone
- The thinking stage appears slower because the thought process is being summarized as it goes, even though generating the final answer is typically faster
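As a small illustration of the point above about prompt instructions influencing how the model thinks: the chain of thought itself stays hidden, but reasoning-style hints in the prompt can still steer it. A minimal sketch (the helper and the hint text are hypothetical, not from the AMA):

```python
def with_reasoning_hint(task: str, hint: str) -> str:
    """Prepend a reasoning-style instruction to a user task.

    The model's chain of thought stays hidden; the hint only
    influences *how* it thinks before returning an answer.
    """
    return f"{hint}\n\nTask: {task}"


prompt = with_reasoning_hint(
    "Is 2**61 - 1 a Mersenne prime?",
    "Before answering, enumerate the edge cases and verify each step.",
)
```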
API and Usage Limits
- o1-mini has a weekly rate limit of 50 prompts for ChatGPT Plus users
- All prompts count the same in ChatGPT
- More tiers of API access and higher rate limits will be rolled out over time
- Prompt caching in the API is a popular request, but no timeline is available yet
Pricing, Fine-tuning, and Scaling
- Pricing of o1 models is expected to follow the trend of price reductions every 1-2 years
- Batch API pricing will be supported once rate limits increase
- Fine-tuning is on the roadmap, but no timeline is available yet
- Scaling up o1 is bottlenecked by research and engineering talent
- New scaling paradigms for inference compute could bring significant gains in future generations of models
- Inverse scaling isn't significant yet, but personal writing prompts show o1-preview performing only slightly better than GPT-4o (or even slightly worse)
Model Development and Research Insights
- o1 was trained using reinforcement learning to achieve reasoning performance
- The model demonstrates creative thinking and strong performance in lateral tasks like poetry
- o1's philosophical reasoning and ability to generalize, such as deciphering ciphers, are impressive
- o1 was used by researchers to create a GitHub bot that pings the right CODEOWNERS for review
- In internal tests, o1 quizzed itself on difficult problems to gauge its capabilities
- Broad world domain knowledge is being added and will improve with future versions
- Fresher training data for o1-mini is planned for future iterations (the current knowledge cutoff is October 2023)
Prompting Techniques and Best Practices
- o1 benefits from prompting styles that provide edge cases or reasoning styles
- o1 models are more receptive to reasoning cues in prompts compared to earlier models
- Providing relevant context in retrieval-augmented generation (RAG) improves performance; irrelevant chunks may worsen reasoning
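The RAG advice above can be sketched as a simple pre-filter: score each retrieved chunk against the question and include only the relevant ones, since irrelevant chunks may actively hurt the reasoning. A toy keyword-overlap scorer (a real system would use embeddings; all names and the threshold here are illustrative):

```python
def score(chunk: str, question: str) -> float:
    """Fraction of question words that also appear in the chunk (toy relevance)."""
    q_words = set(question.lower().split())
    c_words = set(chunk.lower().split())
    return len(q_words & c_words) / len(q_words) if q_words else 0.0


def build_context(chunks: list[str], question: str, threshold: float = 0.3) -> str:
    """Keep only chunks scoring above the threshold, then join them."""
    relevant = [c for c in chunks if score(c, question) >= threshold]
    return "\n---\n".join(relevant)


chunks = [
    "o1 models generate a hidden chain of thought before answering.",
    "The cafeteria menu changes every Tuesday.",
]
context = build_context(chunks, "how does o1 generate its chain of thought")
# Only the first chunk passes the relevance filter.
```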
General Feedback and Future Enhancements
- Rate limits are low for o1-preview due to early-stage testing but will be increased
- Improvements in latency and inference times are actively being worked on
Remarkable Model Capabilities
- o1 can think through philosophical questions like "What is life?"
- Researchers found o1 impressive in its ability to handle complex tasks and generalize from limited instruction
- o1's creative reasoning abilities, such as quizzing itself to gauge its capabilities, showcase its high-level problem-solving