DeepSeek-Coder-V2: Breaking the Barrier of Closed-Source Models in Code Intelligence
Today's paper introduces DeepSeek-Coder-V2, an open-source Mixture-of-Experts (MoE) code language model that achieves performance comparable to GPT4-Turbo in code-specific tasks. The model is further pre-trained from DeepSeek-V2 with an additional 6 trillion tokens, significantly enhancing its coding and mathematical reasoning capabilities while maintaining strong general language performance. DeepSeek-Coder-V2 expands support to 338 programming languages and extends the context length to 128K tokens.
Method Overview
DeepSeek-Coder-V2 is built upon the foundation of DeepSeek-V2 and undergoes additional pre-training with a carefully curated dataset. The pre-training corpus consists of 60% source code (1,170B tokens), 10% math content (221B tokens), and 30% natural language data. This diverse dataset exposes the model to a wide range of programming languages and mathematical concepts.
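To make the mixture concrete, here is a minimal sketch of how a training loop could sample documents according to these ratios. The 60/10/30 split and approximate token counts come from the paper; the source names, sampling loop, and function are illustrative assumptions, not the authors' actual data pipeline.

```python
import random

# Approximate corpus mixture reported in the paper (by token share).
# The source labels and sampling loop are illustrative, not the authors' pipeline.
MIXTURE = {
    "source_code": 0.60,       # ~1,170B tokens across 338 programming languages
    "math": 0.10,              # ~221B tokens of math-related content
    "natural_language": 0.30,  # general natural language data
}

def sample_source(mixture: dict) -> str:
    """Pick which corpus to draw the next training document from."""
    names, weights = zip(*mixture.items())
    return random.choices(names, weights=weights, k=1)[0]

if __name__ == "__main__":
    counts = {name: 0 for name in MIXTURE}
    for _ in range(10_000):
        counts[sample_source(MIXTURE)] += 1
    print(counts)  # counts land roughly in a 60/10/30 ratio
```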
The training process involves two main phases: pre-training and alignment. During pre-training, the model is exposed to the multi-source corpus, allowing it to learn patterns and structures from code, math, and general language data. This phase significantly enhances the model's coding and mathematical reasoning abilities.
In the alignment phase, the model undergoes fine-tuning using an instruction dataset that includes code, math, and general instruction data. This is followed by reinforcement learning using the Group Relative Policy Optimization (GRPO) algorithm. The GRPO process aligns the model's behavior with human preferences, particularly in coding tasks. Preference data is collected using compiler feedback and test cases, and a reward model guides the policy model's training.
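The core idea of GRPO is to score a group of sampled completions for the same prompt and use each completion's reward relative to its group as the advantage, avoiding a separate value network. The sketch below illustrates that group-relative advantage and a PPO-style clipped loss under simple assumptions (scalar rewards per completion, e.g. from a reward model trained on compiler/test-case feedback); the KL regularization used in practice is omitted, and tensor shapes are illustrative rather than taken from the paper.

```python
import torch

def group_relative_advantages(rewards: torch.Tensor, eps: float = 1e-6) -> torch.Tensor:
    """rewards: (num_prompts, group_size) scalar rewards for sampled completions.
    Returns advantages of the same shape: each completion's reward normalized
    against the other completions sampled for the same prompt."""
    mean = rewards.mean(dim=1, keepdim=True)
    std = rewards.std(dim=1, keepdim=True)
    return (rewards - mean) / (std + eps)

def grpo_policy_loss(logp_new, logp_old, advantages, clip_eps: float = 0.2):
    """PPO-style clipped objective, but driven by group-relative advantages
    instead of a learned value function (the key simplification in GRPO).
    KL penalty against a reference policy is left out for brevity."""
    ratio = torch.exp(logp_new - logp_old)
    unclipped = ratio * advantages
    clipped = torch.clamp(ratio, 1 - clip_eps, 1 + clip_eps) * advantages
    return -torch.min(unclipped, clipped).mean()
```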
To support code completion, the model also incorporates a Fill-In-the-Middle (FIM) training objective. This allows DeepSeek-Coder-V2 to effectively complete a missing span of code given the surrounding context, as illustrated below.
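A minimal sketch of FIM data construction: a document is split into prefix, middle, and suffix, then rearranged so the model learns to predict the middle from its surroundings. The sentinel token names here are placeholders; the actual special tokens are defined by the model's tokenizer and may differ.

```python
import random

# Placeholder sentinel tokens; the real special tokens come from the tokenizer.
FIM_PREFIX, FIM_SUFFIX, FIM_MIDDLE = "<fim_prefix>", "<fim_suffix>", "<fim_middle>"

def make_fim_example(code: str, rng: random.Random) -> str:
    """Split a document into prefix/middle/suffix and emit it in
    prefix-suffix-middle order, so the model is trained to fill the hole."""
    a, b = sorted(rng.sample(range(len(code)), 2))
    prefix, middle, suffix = code[:a], code[a:b], code[b:]
    return f"{FIM_PREFIX}{prefix}{FIM_SUFFIX}{suffix}{FIM_MIDDLE}{middle}"

if __name__ == "__main__":
    rng = random.Random(0)
    snippet = "def add(x, y):\n    return x + y\n"
    print(make_fim_example(snippet, rng))
```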
The model's architecture is based on a Mixture-of-Experts (MoE) design, which allows for efficient scaling and specialized knowledge across different domains. The context length is extended to 128K tokens, enabling the model to handle more complex and extensive coding tasks.
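For intuition, here is a toy top-k routed MoE feed-forward layer. It only illustrates the general idea of routing each token to a few expert MLPs and mixing their outputs; the actual DeepSeek-V2 architecture adds refinements such as shared experts, load-balancing objectives, and efficient batched dispatch, and the dimensions and expert counts below are arbitrary assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TopKMoE(nn.Module):
    """Toy Mixture-of-Experts layer: a router picks top-k experts per token
    and combines their outputs with the router's weights."""

    def __init__(self, d_model: int, d_ff: int, num_experts: int = 8, top_k: int = 2):
        super().__init__()
        self.top_k = top_k
        self.router = nn.Linear(d_model, num_experts, bias=False)
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model))
            for _ in range(num_experts)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (num_tokens, d_model)
        scores = F.softmax(self.router(x), dim=-1)
        weights, indices = scores.topk(self.top_k, dim=-1)  # (num_tokens, top_k)
        out = torch.zeros_like(x)
        for k in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = indices[:, k] == e          # tokens routed to expert e at slot k
                if mask.any():
                    out[mask] += weights[mask, k : k + 1] * expert(x[mask])
        return out

if __name__ == "__main__":
    layer = TopKMoE(d_model=64, d_ff=256)
    tokens = torch.randn(10, 64)
    print(layer(tokens).shape)  # torch.Size([10, 64])
```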
Results
DeepSeek-Coder-V2 demonstrates impressive performance across code and math benchmarks such as HumanEval, MBPP+, LiveCodeBench, MATH, and GSM8K, matching or exceeding GPT4-Turbo on several of them.
The model also maintains comparable performance to DeepSeek-V2 in general language tasks, showcasing its versatility.
Conclusion
DeepSeek-Coder-V2 represents a significant advancement in open-source code language models, achieving performance comparable to or exceeding closed-source models like GPT4-Turbo in various code and math benchmarks. By combining a diverse pre-training corpus, extended context length, and advanced alignment techniques, the model demonstrates strong capabilities in coding, mathematical reasoning, and general language understanding. This work narrows the gap between open-source and closed-source models in the field of code generation. For more information please consult the full paper.
Congrats to the authors for their work!
Zhu, Qihao, et al. "DeepSeek-Coder-V2: Breaking the Barrier of Closed-Source Models in Code Intelligence." arXiv preprint arXiv:2406.11931 (2024).