登录查看更多内容

GPT-4o mini from OpenAI: The Compact AI Powerhouse Outperforming Competitors

Denis Kondratev

Stream CTO of HR Tech, MTS

发布日期: 2024年7月18日

OpenAI has just released GPT-4o mini, their most cost-efficient compact AI model. And guess what? This little beast is beating GPT-4 in MMLU tests and on the LMSYS leaderboard. And it costs peanuts – just 15 cents per million input tokens and 60 cents per million output tokens. That's an order of magnitude cheaper than previous state-of-the-art models and over 60% cheaper than GPT-3.5 Turbo.

GPT-4o mini outperforms GPT-3.5 Turbo and other compact models in academic tests of textual intelligence and multimodal reasoning. In text-visual reasoning tasks, GPT-4o mini scores 82% on MMLU, compared to a mere 77.9% for Gemini Flash and 73.8% for Claude Haiku.

This little powerhouse also crushes previous compact models in math and coding. In the MGSM test of mathematical reasoning, GPT-4o mini scored 87%, while Gemini Flash and Claude Haiku only managed 75.5% and 71.7%, respectively. In the HumanEval coding performance test, GPT-4o mini scored 87.2%, versus 71.5% for Gemini Flash and 75.9% for Claude Haiku.

Plus, GPT-4o mini demonstrates excellent performance in the MMMU test of multimodal reasoning – 59.4% compared to 56.1% for Gemini Flash and 50.2% for Claude Haiku. And it significantly outperforms GPT-3.5 Turbo in tasks like extracting structured data from receipts or generating high-quality email responses considering the conversation history.

GPT-4o mini has the same built-in safety measures as GPT-4o, rigorously tested both automatically and manually. The GPT-4o mini API is the first model to use OpenAI's instruction hierarchy method, protecting against jailbreaks, prompt injections, and system prompt extractions. This makes the model's responses more reliable and safer for use in applications at scale.

GPT-4o mini is now available as a text and vision model in the Assistants API, Chat Completions API, and Batch API. Fine-tuning will be added soon.

ChatGPT Free, Plus, and Team users can access GPT-4o mini today instead of GPT-3.5. Enterprise customers will also get access next week – OpenAI wants to make AI benefits accessible to everyone.

So, if you need a cheap, powerful, and safe AI, take a close look at GPT-4o mini. Competitors are nervously watching from the sidelines.

Перепиши этот текст на английский язык для публикации в LinkedIn и придумай подходящий заголовок

Компания OpenAI представила GPT-4o mini - свою самую экономичную компактную модель искусственного интеллекта. GPT-4o mini набирает 82% баллов в тесте MMLU и в настоящее время превосходит GPT-4 на доске лидеров LMSYS. Цена модели составляет 15 центов за миллион токенов на входе и 60 центов за миллион токенов на выходе, что на порядок дешевле, чем у предыдущих передовых моделей, и более чем на 60% дешевле, чем у GPT-3.5 Turbo.

GPT-4o mini превосходит GPT-3.5 Turbo и другие компактные модели по академическим показателям как текстового интеллекта, так и мультимодальных рассуждений. На задачах рассуждения, связанных с текстом и изображениями, GPT-4o mini набирает 82,0% баллов по тесту MMLU по сравнению с 77,9% у Gemini Flash и 73,8% у Claude Haiku.

GPT-4o mini также превосходит предыдущие компактные модели на рынке в задачах математических рассуждений и программирования. В тесте MGSM, измеряющем математические рассуждения, GPT-4o mini набрал 87,0% по сравнению с 75,5% у Gemini Flash и 71,7% у Claude Haiku. В тесте HumanEval, измеряющем производительность кодирования, GPT-4o mini набрал 87,2% по сравнению с 71,5% у Gemini Flash и 75,9% у Claude Haiku.

领英推荐

Mysterious GPT is Back...

Steve Nouri 6 个月前

? OpenAI’s battle of o1 vs GPT-4

Lex Sokolin 2 个月前

ODSC's AI Weekly Recap: Week of May 10th

Open Data Science Conference (ODSC) 6 个月前

Кроме того, GPT-4o mini демонстрирует высокую производительность в тесте мультимодальных рассуждений MMMU, набирая 59,4% по сравнению с 56,1% у Gemini Flash и 50,2% у Claude Haiku. GPT-4o mini работает значительно лучше, чем GPT-3.5 Turbo, для таких задач, как извлечение структурированных данных из файлов квитанций или создание высококачественных ответов на электронные письма при предоставлении истории переписки.

GPT-4o mini имеет те же встроенные меры безопасности, что и GPT-4o, которые были тщательно оценены с помощью автоматических и ручных проверок. API GPT-4o mini является первой моделью, в которой применяется метод иерархии инструкций OpenAI, помогающий улучшить способность модели противостоять взломам, инъекциям подсказок и извлечению системных подсказок. Это делает ответы модели более надежными и помогает сделать ее более безопасной для использования в приложениях в масштабе.

GPT-4o mini теперь доступен как текстовая и визуальная модель в API Assistants, API Chat Completions и Batch API. В ближайшие дни планируется внедрить тонкую настройку для GPT-4o mini.

В ChatGPT пользователи Free, Plus и Team смогут получить доступ к GPT-4o mini уже сегодня вместо GPT-3.5. Корпоративные пользователи также получат доступ, начиная со следующей недели, в соответствии с миссией OpenAI по обеспечению доступности преимуществ ИИ для всех.

Title: OpenAI Introduces GPT-4o mini: The Cost-Effective and High-Performing AI Model

OpenAI has unveiled GPT-4o mini, their most cost-efficient compact artificial intelligence model. GPT-4o mini achieves an impressive 82% score on the MMLU test and currently outperforms GPT-4 on the LMSYS leaderboard. The model is priced at just 15 cents per million input tokens and 60 cents per million output tokens, making it an order of magnitude more affordable than previous state-of-the-art models and over 60% cheaper than GPT-3.5 Turbo.

GPT-4o mini outshines GPT-3.5 Turbo and other compact models in academic benchmarks for both textual intelligence and multimodal reasoning. In text and image-related reasoning tasks, GPT-4o mini scores 82.0% on the MMLU test, compared to 77.9% for Gemini Flash and 73.8% for Claude Haiku.

Moreover, GPT-4o mini surpasses previous compact models in the market in mathematical reasoning and programming tasks. In the MGSM test, which measures mathematical reasoning, GPT-4o mini scored 87.0% compared to 75.5% for Gemini Flash and 71.7% for Claude Haiku. In the HumanEval test, which assesses coding performance, GPT-4o mini achieved 87.2% compared to 71.5% for Gemini Flash and 75.9% for Claude Haiku.

Additionally, GPT-4o mini demonstrates high performance in the MMMU multimodal reasoning test, scoring 59.4% compared to 56.1% for Gemini Flash and 50.2% for Claude Haiku. GPT-4o mini significantly outperforms GPT-3.5 Turbo in tasks such as extracting structured data from receipt files or generating high-quality email responses when provided with conversation history.

GPT-4o mini incorporates the same built-in safety measures as GPT-4o, which have been thoroughly evaluated through automatic and manual checks. The GPT-4o mini API is the first model to employ OpenAI's instruction hierarchy method, enhancing the model's ability to resist jailbreaks, prompt injections, and system prompt extractions. This makes the model's responses more reliable and safer for use in applications at scale.

GPT-4o mini is now available as a text and visual model in the Assistants API, Chat Completions API, and Batch API. Fine-tuning for GPT-4o mini is planned to be implemented in the coming days.

In ChatGPT, Free, Plus, and Team users will be able to access GPT-4o mini starting today, replacing GPT-3.5. Enterprise users will also gain access starting next week, in line with OpenAI's mission to make the benefits of AI accessible to all.

GPT-4o mini from OpenAI: The Compact AI Powerhouse Outperforming Competitors

Denis Kondratev

Stream CTO of HR Tech, MTS

领英推荐

更多精彩文章

社区洞察

其他会员也浏览了

Groq’s Faster LLaVA, All Hands AI’s $5M Boost, OpenAI’s 1M Users, and YouTube’s Deepfake Detection

Latest In Web3, AI & Emerging Tech

Latest In Web3, AI & Emerging Tech

Unveiling GPT-4o Mini: OpenAI's Game-Changer in AI Technology

o1-preview: OpenAI's New AI Model that can Think & Reason ??

OpenAI's bold moves redefining the future of innovation

The AI Force Awakens

OpenAI Introduces GPT-4o: Everything You need To Know

Bloomberg GPT / GitHub Copilot X / AI Index Report 2023

Exploring OpenAI’s Latest Models: GPT-4, Turbo, o1-Series, and More

领英推荐

Introducing ChatGPT search

2024年10月31日

Llama 3.2: The AI Revolution Goes Local — Privacy, Speed, and Multimodality on Your Device

2024年9月30日

社区洞察

其他会员也浏览了

Groq’s Faster LLaVA, All Hands AI’s $5M Boost, OpenAI’s 1M Users, and YouTube’s Deepfake Detection

Latest In Web3, AI & Emerging Tech

Latest In Web3, AI & Emerging Tech

Unveiling GPT-4o Mini: OpenAI's Game-Changer in AI Technology

o1-preview: OpenAI's New AI Model that can Think & Reason ??

OpenAI's bold moves redefining the future of innovation

The AI Force Awakens

OpenAI Introduces GPT-4o: Everything You need To Know

Bloomberg GPT / GitHub Copilot X / AI Index Report 2023

Exploring OpenAI’s Latest Models: GPT-4, Turbo, o1-Series, and More