OpenAI’s o1 model recently scored 120 on an IQ test. What’s the first thought on your mind? (The average human IQ is about 100, and its nearest competitor, Claude, scored 90!) If your immediate reaction is awe or concern, read on.
Updates from Proximity Works
-
OpenAI has recently released an o1 prompting guide. It focuses on keeping prompts simple, avoiding chain-of-thought instructions, and using delimiters. Here’s the full guide for you:
-
Interesting blog post by Greptile's CEO! At Ubicloud, we run both quantitative and qualitative tests to understand how AI models perform. Since AI models can overfit to benchmark data, they can shine in AI benchmarks but fail in qualitative tests. For example, our qualitative tests left us disappointed with Alibaba's QwQ-32B-Preview. Daksh Gupta evaluated OpenAI o1 and DeepSeek-R1 on real-world PR data, which is his company's specialty. The results are stunning. I'd love to save you the click, but I don't want Daksh to hate me. https://lnkd.in/ejX2PsiV
New blog post! OpenAI o1 vs. DeepSeek R1: which one can catch more bugs in a pull request? We gave both models the same prompt and the same diff and asked them to find issues in a series of buggy pull requests. One model caught nearly every known bug in the PRs. The other caught almost *none*. Full post on our website!
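An evaluation like the one described above is typically scored by comparing each model's reported issues against a hand-labeled set of known bugs per PR. The post doesn't publish its scoring code, so the helper, bug labels, and exact-match rule below are invented for illustration:

```python
# Hedged sketch of scoring a bug-catching eval: fraction of
# hand-labeled known bugs that a model's review mentioned.
# All names and data here are illustrative, not from the post.

def catch_rate(known_bugs: set[str], reported: set[str]) -> float:
    """Fraction of known bugs that appear in the model's report."""
    if not known_bugs:
        return 0.0
    return len(known_bugs & reported) / len(known_bugs)

known = {"off-by-one in pagination", "unclosed file handle"}
model_a = {"off-by-one in pagination", "unclosed file handle", "style nit"}
model_b = {"style nit"}

print(catch_rate(known, model_a))  # 1.0 (caught every known bug)
print(catch_rate(known, model_b))  # 0.0 (caught none)
```

A real harness would need fuzzier matching (models rarely phrase a bug exactly as the label does), but the per-PR recall metric is the same.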
-
The communication around OpenAI’s latest model release has been abysmal. This was supposed to be the complete version of their o1 inference-scaling reasoning model, their path to a general-intelligence platform. All I’m seeing are vague references to benchmark scores and anecdotal stories that range from “this model feels dumber than previous releases” to “this feels like AGI.” My view remains unchanged: multi-turn reasoning models, as they stand, are too slow, too expensive, and produce answers that are too long and still need thorough auditing. Can they get better in the future? Sure, but that remains in the future. I need to evaluate the options available today, and right now that means o1 is pretty useless for the vast majority of people.
-
OpenAI launches o3-mini, its latest reasoning model, which it says is largely on par with o1 and o1-mini in capability but runs faster and costs less (Kyle Wiggers/TechCrunch)
-
OpenAI released a new o1 prompting guide. Use these tips to get the most accurate results from OpenAI o1:
- Keep it simple.
- Be very clear in your instructions.
- Don't use fancy chain-of-thought prompts.
- Only give it relevant information so it doesn't get distracted.
Source: Superhuman
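The tips above can be sketched as a tiny prompt-building helper: a short, direct instruction, the relevant context fenced in delimiters, and no chain-of-thought scaffolding. The `build_prompt` function and the example text are illustrative, not part of any OpenAI SDK or the guide itself:

```python
# Sketch of a prompt that follows the o1 guide: simple instruction,
# clear delimiters around the input, no chain-of-thought phrasing.
# build_prompt is an illustrative helper, not an OpenAI API.

def build_prompt(instruction: str, context: str) -> str:
    """Keep the instruction short and direct, and wrap only the
    relevant context in triple-quote delimiters."""
    return (
        f"{instruction}\n\n"
        f'"""\n{context}\n"""'
    )

prompt = build_prompt(
    "Summarize the change in one sentence.",
    "Refactored the retry loop to use exponential backoff.",
)
print(prompt)
```

The resulting string would then be sent as a single user message; note there is no “think step by step” suffix, per the guide’s advice.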
-
LawDroid is incorporating OpenAI's latest o1 model into our Copilot and Builder products for complex reasoning tasks! By the way, I think OpenAI is using OpenAI to answer its email replies. What do you think?
-
Important limitations of OpenAI's new o1 model:
- No multi-modality: limited to text only, with no image or file analysis.
- Slow: it takes a minute, often several, to respond.
- No internet browsing, so no access to external knowledge.
- Knowledge cutoff date of October 2023.
- Rate-limited to 20 API calls per minute; the limit is on calls, not tokens as with other models.
-