Breaking Down LoRA vs. DoRA: Which Fine-Tuning Technique Reigns Supreme?


If you’re working with large language models (LLMs), you’ve likely heard of LoRA (Low-Rank Adaptation) for efficient fine-tuning. But have you met its evolved counterpart, DoRA (Weight-Decomposed Low-Rank Adaptation)? Let’s dive into why DoRA is making waves!

The Core Difference: Weight Decomposition

LoRA adjusts a model’s weight matrix (W) by adding a low-rank update ΔW = BA. DoRA takes this further by decomposing the weight into a magnitude component (m) and a directional component, applying the low-rank update to the direction and then renormalizing:

W’ = m × (W + BA) / ||W + BA||_c

Here ||·||_c is the column-wise norm, so m holds one learned magnitude per column. By separating magnitude and direction, DoRA enables independent, precise control over how much (magnitude) and where (direction) the model adapts during fine-tuning.
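To make the decomposition concrete, here is a minimal NumPy sketch of the update above. This is illustrative only (matrix names W, B, A and the initialization of m from the pretrained weight are assumptions, not reference code):

```python
import numpy as np

rng = np.random.default_rng(0)

d_out, d_in, r = 6, 4, 2              # layer shape and low-rank dimension
W = rng.normal(size=(d_out, d_in))    # frozen pretrained weight

# LoRA-style low-rank update: delta_W = B @ A, rank r
B = rng.normal(size=(d_out, r)) * 0.01
A = rng.normal(size=(r, d_in)) * 0.01

# DoRA: separate magnitude and direction
V = W + B @ A                           # directional component after the update
col_norms = np.linalg.norm(V, axis=0)   # ||V||_c: one norm per column
m = np.linalg.norm(W, axis=0)           # magnitude, initialized from W (trainable)

W_adapted = m * (V / col_norms)         # W' = m * (W + BA) / ||W + BA||_c
```

Because each column is normalized before being rescaled, the column norms of `W_adapted` equal `m` exactly: magnitude and direction can now move independently during training.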




Why Does This Matter?

1. Fine-Tuning Control: DoRA’s decomposition allows nuanced updates. Adjusting magnitude and direction independently leads to better convergence and higher performance, especially on smaller datasets.

2. Performance Gains: Studies show DoRA often outperforms LoRA in accuracy and stability, with minimal computational overhead.

3. Inference Efficiency: Like LoRA, DoRA adds almost zero latency during inference. Win-win!
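The zero-latency claim holds because, after training, all of the DoRA factors can be folded into one dense matrix. A small sketch of that merge (variable names are hypothetical):

```python
import numpy as np

rng = np.random.default_rng(1)
d_out, d_in, r = 8, 5, 2
W = rng.normal(size=(d_out, d_in))        # frozen pretrained weight
B = rng.normal(size=(d_out, r)) * 0.01    # trained low-rank factors
A = rng.normal(size=(r, d_in)) * 0.01
V = W + B @ A
m = np.linalg.norm(W, axis=0)             # trained magnitude (init value shown)

# Fold magnitude, direction, and the low-rank update into one matrix.
W_merged = m * V / np.linalg.norm(V, axis=0)

x = rng.normal(size=(d_in,))
y_merged = W_merged @ x                   # one matmul: same cost as the base model

# Equivalent unmerged computation (column scaling applied to the input):
y_adapter = V @ ((m / np.linalg.norm(V, axis=0)) * x)
print(np.allclose(y_merged, y_adapter))   # → True
```

Since `W_merged` has the same shape as `W`, a deployed model can swap it in and serve requests with a single matmul per layer, which is why there is no extra inference latency.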

Key Takeaways:

  • LoRA: Simple and effective, but entangles magnitude and direction in a single update.
  • DoRA: Adds decomposition for sharper control → better results with similar efficiency.
  • Formula Matters: W’ = m × (W + BA) / ||W + BA||_c unlocks smarter adaptation.


Personally, I haven’t implemented DoRA yet; I’m still reading the research paper. It involves some complex mathematics (admittedly, I can’t fully follow all of the core math), but I got the overall idea.


