Noob looking for guidance on LLM selection

Today, I came across a common question on Reddit and shared my response. I'm posting it here to reach a broader audience, as I believe many beginners have similar questions.

Question on r/LLMDev subreddit:

Hello, I'm making a project where every user has 10k input tokens and 400 output tokens worth of interaction at least 200 times a month. The project is for general use (like general knowledge questions or generating mathematical questions). Basically, it won't be much related to programming, so I know Claude isn't the best option.

[Read on Reddit]

My response:

To approach this systematically, we can evaluate the options across three key dimensions:

Dimension 1: Model Complexity

For your use case (handling general knowledge queries and generating mathematical questions), domain-specific expertise isn't required. Any general-purpose LLM in the 7B-13B parameter range should suffice; hosted models such as GPT-4 (by OpenAI), or similar alternatives from providers such as Cohere, Anthropic (Claude), or Mistral, would also work. In general, larger models (e.g., 27B or 70B) tend to produce higher-quality results but at a higher cost. The question you should be asking is whether you REALLY need the best-performing model (e.g., 70B). Let's dig a bit deeper by thinking about quality.

Dimension 2: Quality

Quality depends on your project's specific needs. If precise and nuanced answers are essential, GPT-4 or Claude might be the better choice, but they cost more. If you can tolerate slightly less sophistication, models like Llama 3 (no offense to Llama fans :-) ) or other open-source models such as Falcon provide good performance at a lower cost, especially when hosted locally or accessed through cost-efficient APIs.

The bottom line is that while smaller models (7B-13B) are cost-effective, larger models tend to produce higher-quality, more nuanced outputs. It's a good idea to experiment with smaller models first to determine whether they meet your quality requirements; they offer lower costs and better latency, making them a practical starting point.

Dimension 3: Cost

Cost plays a pivotal role in API/LLM selection.

Let's estimate the cost of using different LLMs available as a service, based on your requirements (a rough calculation sketch follows the notes below):

Input tokens = 10k

Output tokens = 400

Number of calls = 200

NOTE:

  • Do your own price calculation; I can't vouch for the accuracy of the website I used to generate this comparative pricing.
  • Multiply the cost by the number of users.
  • Don't forget to factor in development/QA costs.
  • Self-hosted LLMs require infrastructure, which, by the way, is not cheap :-)
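
To make the arithmetic concrete, here is a minimal Python sketch of the per-user estimate. The per-million-token prices in it are made-up placeholders for illustration only, not real quotes; substitute the current numbers from each provider's price sheet.

```python
# Rough monthly cost per user, using the numbers from the question.
# The per-million-token prices below are PLACEHOLDERS for illustration only.
INPUT_TOKENS_PER_CALL = 10_000
OUTPUT_TOKENS_PER_CALL = 400
CALLS_PER_MONTH = 200

# {model label: (input price, output price)} in USD per 1M tokens -- illustrative values
ILLUSTRATIVE_PRICES = {
    "large-flagship-model": (5.00, 15.00),
    "mid-size-model": (0.50, 1.50),
    "small-open-model-api": (0.10, 0.30),
}

def monthly_cost_per_user(input_price: float, output_price: float) -> float:
    """Cost in USD of one user's monthly traffic, given prices per 1M tokens."""
    input_cost = INPUT_TOKENS_PER_CALL * CALLS_PER_MONTH / 1_000_000 * input_price
    output_cost = OUTPUT_TOKENS_PER_CALL * CALLS_PER_MONTH / 1_000_000 * output_price
    return input_cost + output_cost

for name, (in_price, out_price) in ILLUSTRATIVE_PRICES.items():
    print(f"{name}: ~${monthly_cost_per_user(in_price, out_price):.2f} per user per month")
```

Multiply the per-user figure by your expected number of users (and add development/QA and any hosting overhead) to get a realistic monthly bill.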


Cost comparison for commercial LLM options

In your particular scenario, the cost is relatively low, so I'd recommend going for the best option.

That said, keep in mind that costs for real-world applications can be significantly higher. To give a complete picture, here are some suggestions for cost optimization.


Cost-Optimization Tips

Here are some strategies to reduce costs without compromising too much on quality:

  1. Fine-tune smaller models: Train a smaller model to specialize in your specific queries.
  2. Hybrid approach: Use larger models only for complex queries while leveraging smaller ones for routine tasks (see the sketch after this list).
  3. Context optimization: Use vector databases (e.g., Pinecone) with LangChain to minimize input token usage by feeding only relevant data to the model.
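
To illustrate the hybrid approach (tip 2), here is a minimal Python sketch that routes each question to either a cheap or a premium model. It assumes an OpenAI-compatible client; the model names and the length-based routing heuristic are placeholders, not recommendations.

```python
# Hybrid routing sketch: cheap model for routine questions, premium model for
# complex ones. Model names and the heuristic are placeholders -- adjust to
# whatever providers and rules fit your project.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

CHEAP_MODEL = "small-model-placeholder"
PREMIUM_MODEL = "large-model-placeholder"

def looks_complex(question: str) -> bool:
    # Naive heuristic for illustration: long or proof-style questions go to the
    # bigger model. In practice you might use keyword rules or a small classifier.
    return len(question) > 500 or "prove" in question.lower()

def ask(question: str) -> str:
    model = PREMIUM_MODEL if looks_complex(question) else CHEAP_MODEL
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": question}],
        max_tokens=400,  # matches the ~400 output tokens in the question
    )
    return response.choices[0].message.content

print(ask("Generate three algebra word problems for grade 8."))
```

The same pattern extends to tip 3: before calling the model, retrieve only the relevant chunks from a vector store and send those instead of the full 10k-token context.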

My 2 Cents

If you’re new to the world of LLMs, making these decisions can be daunting. A structured course on LLMs can help you navigate these options more effectively and avoid common pitfalls. If you’re interested, check out my course designed specifically for beginners—it provides actionable guidance and helps you get up to speed quickly.

https://youtu.be/Tl9bxfR-2hk

Govind Moghekar

Founder at SuperAI Labs | Turning AI into Impactful Solutions | Honored to Receive the Prestigious Eleven Labs Grant | Forbes India DGEMS 2024: Select 200 Nominee

3 months ago

Insightful

Arvind Dayal

Hands-on Technology Leader : ML, AI, Serverless, Cloud, Big Data, SaaS

3 months ago

In our recent AI project, we went through a similar exercise and reached the same conclusion. It really boils down to the classic trade-off of cost, quality, and speed, where increasing complexity tends to slow response times and often drives up costs. The key challenge is finding the sweet spot where all these factors align.

Ultimately, assessing the ROI of the solution you're building becomes crucial. You need to determine whether the benefits in quality or speed are worth the additional investment. In our case, we discovered that by optimizing our prompts and refining our processes, we were able to significantly improve speed while keeping costs in check. This allowed us to maintain a reasonable level of quality without compromising too much on performance or budget.

It's a balancing act: enhance the solution where it matters most while being mindful of the overall impact on efficiency and cost. The ability to adjust and iterate, especially with AI, allows for a more flexible approach that maximizes value without sacrificing the end-user experience.
