
Code Generation LLMs and how businesses are using them to improve productivity

Nikhil Goel

AI | Machine Learning | AI SAAS B2C Platform Leader

Published: March 16, 2025

Today's article is on code generation LLMs and how businesses are using them to improve productivity.

In this article, I will cover 1) what code generation LLMs are, 2) why these LLMs are needed, and 3) how businesses are using them to improve productivity, including some use cases where a code generation LLM can be applied.

So, let's dive in.

What are code generation LLMs?

Code generation LLMs are LLMs that generate programming language code. A code generation LLM can check code syntax, handle Fill-In-Middle (FIM) completion, improve or optimize existing code written by programmers, and answer questions about code.

But how can code generation LLMs do all of the above?

The answer lies in the way LLMs are trained on code.

Fundamentally, LLMs trained on code fall into two types:

Reasoning
Non-Reasoning

DeepSeek R1 is an example of a reasoning LLM used for code. Non-reasoning LLMs for code include Qwen2.5-Coder, StarCoder, Llama 3 fine-tuned on code, and Mistral's Codestral, to name a few.

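Code generation LLMs are causal LLMs: they generate code one token at a time, with each token conditioned only on the tokens that come before it.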
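To make "causal" concrete, here is a minimal sketch of the lower-triangular attention mask that causal models rely on. The function name causal_mask and the toy sequence length are assumptions made for illustration; real models build the same pattern as tensors inside their attention layers.

```python
def causal_mask(seq_len):
    """Build a lower-triangular mask: position i may look at positions 0..i only."""
    return [[1 if j <= i else 0 for j in range(seq_len)] for i in range(seq_len)]


# Toy sequence of 4 tokens, e.g. ["def", "add", "(", "a"]
mask = causal_mask(4)
for row in mask:
    print(row)
```

Each row corresponds to one token position. A 1 means that position is visible, a 0 means it is hidden, so no token can see the tokens that come after it.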

Let's cover what data these LLMs are trained on and a standard pipeline for training LLMs on code.

LLMs are trained on code using the following data/tokens:

GitHub code snippets across roughly 90-100 programming languages, such as Python, C++ and SQL
GitHub comments and documents
Web-scraped tokens relevant to code
Text

Code generation LLMs typically have very long context windows.

The pre-training objectives for such code generation LLMs are:

Next Token Prediction
Fill-In-Middle (FIM)
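The two objectives above can be sketched as data transformations. This is a minimal illustration, not any particular model's recipe: the sentinel strings and the helper names next_token_pairs and make_fim_example are placeholders assumed for this example, and real models define their own special tokens and split strategies.

```python
# Placeholder sentinel strings (assumption); each real model defines its own.
FIM_PREFIX = "<fim_prefix>"
FIM_SUFFIX = "<fim_suffix>"
FIM_MIDDLE = "<fim_middle>"


def next_token_pairs(tokens):
    """Next token prediction: pair every prefix with the token that follows it."""
    return [(tokens[:i], tokens[i]) for i in range(1, len(tokens))]


def make_fim_example(code, start, end):
    """Fill-In-Middle: cut code[start:end] out and ask the model to produce it.

    The prompt is laid out as prefix, then suffix, then a marker after which
    the model is expected to generate the missing middle.
    """
    prefix, middle, suffix = code[:start], code[start:end], code[end:]
    prompt = f"{FIM_PREFIX}{prefix}{FIM_SUFFIX}{suffix}{FIM_MIDDLE}"
    return prompt, middle


code = "def add(a, b):\n    return a + b\n"
start = code.index("return")
end = code.index("\n", start)

prompt, target = make_fim_example(code, start, end)
print(repr(prompt))
print(repr(target))
print(next_token_pairs(["def", "add", "("]))
```

Because the middle is moved to the end of the sequence, the model can still be trained left to right while learning to use the code on both sides of the gap.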

The training pipeline for a code generation LLM looks like this:

[Figure: New LLM Pre-training and Post-training Paradigms]

One code generation LLM differs from another in two major aspects:

Data
Context window - the longer the better for code generation
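A rough sketch of why the context window matters for code: the more tokens the model accepts, the more of a codebase fits into a single prompt. The whitespace-based token count below is a crude stand-in for a real tokenizer, and the file names, contents and budgets are made up for illustration.

```python
def rough_token_count(text):
    """Crude stand-in for a tokenizer (assumption): count whitespace-separated chunks."""
    return len(text.split())


def pack_context(files, budget):
    """Add files in order until the next one would exceed the token budget."""
    included, used = [], 0
    for name, source in files:
        cost = rough_token_count(source)
        if used + cost > budget:
            break
        included.append(name)
        used += cost
    return included, used


# Hypothetical repository contents.
files = [
    ("utils.py", "def add(a, b): return a + b"),
    ("models.py", "class User: pass"),
    ("main.py", "from utils import add\nprint(add(1, 2))"),
]

print(pack_context(files, budget=10))   # small window: only some files fit
print(pack_context(files, budget=100))  # larger window: every file fits
```

With the smaller budget the last file is left out, so the model would have to complete code without seeing it.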

Having covered what code generation LLMs are, let's move to the second part of the article: why do we need code generation LLMs?

Code generation LLMs mainly help with:

Code optimization for any programming language. This is otherwise a very time-consuming task.
Fill-In-Middle - filling in the missing code between the start and end of a piece of code
Code Generation from scratch
Unit Test Code Generation
Code Syntax Enforcement
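As one illustration of the last item, generated code can be passed through a syntax check before it is accepted. The sketch below uses Python's standard ast module on two hard-coded strings that stand in for model output; the helper name check_python_syntax is an assumption, and a real pipeline would plug in whatever the model actually returns.

```python
import ast


def check_python_syntax(source):
    """Return (True, None) if the source parses, else (False, a short error message)."""
    try:
        ast.parse(source)
    except SyntaxError as exc:
        return False, f"line {exc.lineno}: {exc.msg}"
    return True, None


# Hypothetical model outputs: the second one is missing a colon.
good = "def add(a, b):\n    return a + b\n"
bad = "def add(a, b)\n    return a + b\n"

print(check_python_syntax(good))
print(check_python_syntax(bad))
```

A check like this only confirms that the code parses. It says nothing about whether the code is correct, which is where generated unit tests come in.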

Let's now move to the final part of the article: how are businesses using code generation LLMs to increase productivity?

Think from the perspective of engineering the code, the cost of developing it, and the efficiency of the code generated. This is where code generation LLMs help reduce time to market for feature releases and give businesses a competitive edge at low cost.

Think of enterprise platform companies, B2C platforms and D2C platforms, and of how much time, money and effort their engineering teams spend developing code. You have your answer.

One important point though: current code generation LLMs are good and getting better, but there is still a long way to go.

In summary, my assumption is that within a year from now, code generation LLMs will generate code that is equivalent in quality to human-written code.

Thanks, all. Hope you had a good read.

Disclaimer: The opinions and views expressed above are the author's own and have no bearing on or affiliation with the author's current employer or any past employers.

Credit:

https://blog.fabrichq.ai/large-language-models-for-code-generation-f95f93fe7de4

https://magazine.sebastianraschka.com/p/new-llm-pre-training-and-post-training

Image Credit:

https://blog.fabrichq.ai/large-language-models-for-code-generation-f95f93fe7de4

https://magazine.sebastianraschka.com/p/new-llm-pre-training-and-post-training
