OpenAI has o1

AIM Events

Hosting the World’s Most Impactful AI Conferences & Events. For Brand collaborations write to [email protected]

发布日期: 2024年9月13日

Today, OpenAI introduced its new o1-preview series of AI models, which can solve complex problems in areas such as science, coding, and maths. The models are now available in ChatGPT and the API, as part of an early preview, with regular updates and improvements expected.

“Extremely proud of the team; this was a monumental effort across the entire company. Hope you enjoy it!” posted OpenAI chief Sam Altman on X. He even wrapped up his inside joke with AI insider Jimmy Apples by saying, “No more patience, Jimmy,” to which Apples replied, “It feels good, Sam. Really good.”

“We trained a model and it is good in some things,” said OpenAI’s Jerry Tworek. To this, Altman said, “But would you rather be a little sad most of the time and super happy occasionally, or a little happy all of the time and very sad once in a while?,” subtly hinting at OpenAI’s o1 reasoning capabilities to solve complex human emotions.?

The o1 series models are trained to spend more time thinking before responding, refining their reasoning process and improving problem-solving capabilities. In initial tests, the next update of the reasoning model performed on par with PhD students on physics, chemistry, and biology tasks, achieving notable success in maths and coding competitions. In a qualifying exam for the International Mathematics Olympiad, the model scored 83%, compared to GPT-4o’s 13%.

Despite its advanced reasoning abilities, the o1-preview model lacks some of the practical features found in GPT-4o, such as web browsing and file uploading. However, OpenAI emphasises the model’s potential for tackling complex tasks, particularly in fields requiring multi-step workflows.

As part of the release, OpenAI has implemented a new safety training approach that allows the models to follow safety rules better. In jailbreaking tests, o1-preview outperformed GPT-4o, scoring 84 out of 100, compared to GPT-4o’s 22. OpenAI has also bolstered its safety efforts by partnering with AI safety institutes in the US and UK.

Alongside o1-preview, OpenAI has released a smaller, cost-effective model called o1-mini, designed specifically for developers who need advanced coding capabilities without broad world knowledge. o1-mini is 80% cheaper than o1-preview.

Starting today, ChatGPT Plus and Team users can manually select o1-preview and o1-mini from the model picker, with rate limits of 30 messages for o1-preview and 50 for o1-mini. API users in the highest usage tier can also begin prototyping, although some features like function calling and streaming are not available yet.

OpenAI plans to expand access to o1-mini for ChatGPT free users and will continue adding new features to the o1 series, including browsing and file uploads.

NVIDIA’s Jim Fan lauded OpenAI o1 for its focus on inference-time scaling rather than model size. He emphasised that large models are not necessary for reasoning, as reasoning can be separated from knowledge using a “small reasoning core” and tools like code verifiers.

“You don’t need a huge model to perform reasoning... a small ‘reasoning core’ that knows how to call tools like browser and code verifier can factor out reasoning from knowledge,” he added.?

Devin’s creator, Cognition Labs, worked closely with OpenAI over the past few weeks to evaluate OpenAI o1’s reasoning capabilities with Devin. They found that the new models represented a significant improvement for agentic systems that dealt with code.

A few days earlier, in a cryptic post, Altman had hinted that the company was working on a project internally known as Project Strawberry, also referred to as Q*.?

Data Science Dojo 7 个月前

Startups and Developers are Rushing to use OpenAI's…

Michael Spencer 2 年前

TimeGPT-1 Foundation Model For Time Series; Merge…

Danny Butvinik 8 个月前

“I love summer in the garden,” wrote Altman on X, posting the image of a terracotta pot containing a strawberry plant with lush green leaves and small, ripening strawberries.

Project Strawberry was said to significantly enhance the reasoning capabilities of OpenAI’s AI models. It is pretty clear that the o1-preview is exclusively Strawberry.?

Meanwhile, OpenAI is in talks to raise up to $7 billion, potentially valuing the company at $150 billion, with investment interest from UAE’s MGX, Microsoft, NVIDIA, and Apple.?

Can o1 Save GitHub Copilot?

Ever since Cursor and Claude hit the market, developers have been slowly moving away from GitHub Copilot. According to sources, Microsoft has plans to upgrade its capabilities on the VS Code IDE, which would help it compete with Cursor. But what about GitHub Copilot?

GitHub CEO Thomas Dohmke is optimistic. He posted on X a video of GitHub Copilot in VS Code running with OpenAI’s o1 model, which he calls “flat out badass”. The new model has been integrated into GitHub Copilot and is making AI pair programming a lot smarter.

Meanwhile, developers have also started implementing o1 within Cursor Composer and have already started creating apps. Cursor being a fork of VS Code, enables much more flexibility when it comes to integrating LLMs within it, making it ideal for several developers.?

The competition now seems to be head-on between Cursor and GitHub Copilot as both can now run on o1, which according to developers, is currently performing better than Claude in certain use cases. Enjoy the full story here.

AMD Tries to Break NVIDIA’s CUDA Ecosystem with UDNA

AMD has announced a significant shift in its GPU architecture strategy with the introduction of UDNA (Unified Data and Neural Architecture). This new architecture aims to merge AMD’s existing RDNA (for gaming) and CDNA (for data centres) architectures into a single, unified platform.

However, users allege that AMD has been partial in providing support, and is more inclined to providing better support to CDNA. RDNA requires per-generation optimisation. Due to this reason, AMD has to put a lot more effort into RDNA users. Read on.?

AI Bytes?

Google has introduced DataGemma, a new open model that integrates LLMs with real-world data from its Data Commons repository, using retrieval-augmented methods like RIG and RAG to reduce AI hallucinations and improve the accuracy of generative AI outputs in research and decision-making contexts.
Baidu has rebranded its ERNIE Bot as Wenxiaoyan, bringing advanced AI-driven search capabilities into its chatbot, allowing users to search for music, maps, articles, and more, while integrating features like personalised content scheduling, multimedia search, and expert advice, making it a popular choice among young users with over ten million monthly active users.
AWS has selected seven Indian startups—Converse, House of Models, Neural Garage, Orbo.ai, Phot.ai, Unscript AI, and Zocket—for its Global Generative AI Accelerator program, offering up to $1 million in credits, mentorship, and technical support to scale their AI innovations.?

OpenAI has o1

AIM Events

Hosting the World’s Most Impactful AI Conferences & Events. For Brand collaborations write to [email protected]

领英推荐

Sector 6

6,347 位关注者

更多精彩文章

社区洞察

其他会员也浏览了

H2OGPT Open-source Project; LLMs as Debugger; GPT-5 What can be Expected; New 1Bn LLM by Microsoft; In Growth Zone: Creative Teams; and More

AI-Powered Autocomplete Lets you Code in Natural Language

Copy of OpenAI update: Strawberry is live, the subscription fees, and the hunt for cash

How to Master OpenAI: A Comprehensive Guide OpenAI is a leading force in the field of artificial intelligence, with its models and tools transforming

The Software Industry's "Kodak Moment" - When Code Writes Itself

Building Apps with OpenAI's Products

OpenAI's o1 Model: Einstein in a Box - A Breakthrough in AI Reasoning

??Top ML Papers of the Week

OpenAI – The AI That Can be Life-changing

OpenAI o1 Is Out: Embracing Inference-Time Scaling and the Future of AI Reasoning

领英推荐

Sector 6

6,347 位关注者

Can ChatGPT Really Think?

2024年9月14日

Generating Novel Research Ideas Using LLMs

2024年9月12日

If the ‘Apple Glow’ Event Were an Email...

2024年9月11日

This is a Story About Self-Reflection, Not AI Fraud

2024年9月10日

Replit Agents: Cursor Who?

2024年9月7日

Indian IT’s Got 99 Problems, and GenAI is All of Them

2024年9月6日

State vs State: Who’s Slaying the AI Game in India?

2024年9月5日

Is this the End of IDEs?

2024年9月4日

The End of the App Store?

2024年9月3日

Solo Hustlers Flipping Billions ??

2024年8月30日