A Guide: Choosing The Perfect Language Model For Your Use Case
The landscape of Large Language Models (LLMs) is changing rapidly, with new models emerging constantly, but which model is the right fit for our specific task?
This guide will walk you through choosing an LLM for your specific needs, using clear criteria and public resources.
TL;DR
LLM Usage Types
Among the five usage types, we will focus on the two approaches most relevant to app development: Scope 1 and Scope 3.
Key Factors to Consider
Once we have defined the task scope, here are 3 key factors to consider when choosing a model.
1. Task Performance
The key areas to consider for successful task completion are:
Technical benchmarks allow us to quantify task performance.
The major benchmark metrics by task type are listed below, followed by a minimal scoring sketch for coding benchmarks:
< Coding Tasks >
< Chatbot Assistance >
< Reasoning Tasks >
< Question Answering and Language Understanding >
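To make the scoring concrete, here is a minimal sketch of how a coding benchmark such as HumanEval is scored: generate a completion for a prompt, run the task's unit tests against it, and report the pass rate. The `generate_completion` stub and the sample task are hypothetical placeholders for illustration, not part of any specific benchmark harness.

```python
# Minimal pass@1-style scoring sketch for a coding benchmark (e.g. HumanEval).
# `generate_completion` is a hypothetical stand-in for a call to the model under test.

def generate_completion(prompt: str) -> str:
    # In practice this would call the LLM; here we return a stub solution.
    return "    return a + b\n"

def passes_tests(candidate_code: str, test_code: str) -> bool:
    """Run the candidate solution together with its unit tests."""
    namespace: dict = {}
    try:
        exec(candidate_code + "\n" + test_code, namespace)
        return True
    except Exception:
        return False

# One benchmark task: a function signature (the prompt) plus hidden unit tests.
tasks = [
    {
        "prompt": "def add(a, b):\n",
        "tests": "assert add(1, 2) == 3\nassert add(-1, 1) == 0\n",
    },
]

passed = sum(
    passes_tests(task["prompt"] + generate_completion(task["prompt"]), task["tests"])
    for task in tasks
)
print(f"pass@1 = {passed / len(tasks):.2f}")
```

Real benchmark suites run many such tasks in sandboxed processes and report aggregate pass rates, but the scoring logic follows this pattern.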
2. Computational Efficiency
Assess whether the model can run efficiently within our environment's hardware capabilities to avoid performance bottlenecks; a rough memory estimate is sketched below.
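As a quick feasibility check, a common rule of thumb is that inference memory scales with parameter count times bytes per parameter, plus overhead for activations and the KV cache. The sketch below applies that heuristic; the 20% overhead factor is an assumption, not a measured value.

```python
# Back-of-the-envelope GPU memory estimate for LLM inference.
# Assumption: memory ~= parameters * bytes-per-parameter, plus ~20% overhead
# for activations and the KV cache (a rough heuristic, not a measurement).

BYTES_PER_PARAM = {"fp32": 4, "fp16": 2, "int8": 1, "int4": 0.5}

def estimate_memory_gb(num_params_billion: float, precision: str = "fp16",
                       overhead: float = 0.2) -> float:
    weights_gb = num_params_billion * BYTES_PER_PARAM[precision]
    return weights_gb * (1 + overhead)

for model, size_b in [("7B", 7), ("13B", 13), ("70B", 70)]:
    print(f"{model} @ fp16 ~ {estimate_memory_gb(size_b):.1f} GB, "
          f"int4 ~ {estimate_memory_gb(size_b, 'int4'):.1f} GB")
```

This kind of estimate quickly tells us whether a model fits on a single GPU, needs quantization, or requires multi-GPU serving.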
3. Commercial Terms
Choose a model whose commercial terms fit our long-term business goals and technical evolution.
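Licensing information is usually published in a model's card metadata on the Hugging Face Hub. A minimal sketch, assuming the `huggingface_hub` package and that the model declares a `license:` tag (always read the full license text as well):

```python
# Check a model's published license tag on the Hugging Face Hub before adopting it.
# Assumption: the model card declares a license tag; verify the full terms manually.
from huggingface_hub import model_info

info = model_info("codellama/CodeLlama-13b-Python-hf")
license_tags = [tag for tag in info.tags if tag.startswith("license:")]
print(license_tags or "No license tag found - check the model card manually.")
```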
Leveraging Public Resources
Evaluating LLMs from scratch can be a time-intensive process. Fortunately, there are valuable public resources available to streamline your selection process:
Leaderboard & Metrics Comparison
Commercial Terms
Comparison of commercial terms across providers (credit: Philipp Schmid).
Broader Search Based on Technical Specs and Task Objectives
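When leaderboards do not cover the exact task, the Hugging Face Hub can be searched directly by task, keyword, and popularity. A minimal sketch, assuming the `huggingface_hub` package (argument names can vary slightly between library versions):

```python
# Search the Hugging Face Hub for candidate models by task and popularity.
# Note: parameter names may differ slightly across huggingface_hub versions.
from huggingface_hub import list_models

candidates = list_models(
    filter="text-generation",  # task objective
    search="code",             # keyword matching the use case
    sort="downloads",          # proxy for community adoption
    limit=5,
)
for model in candidates:
    print(model.id)
```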
Example: Build an Auto-coding App
Let's explore how to choose the right LLM for building apps, using auto-coding applications as a practical example.
Step 1. Define task and performance metrics
Step 2. Choose a model using public resources
Refer to the Hugging Face leaderboard, focusing on Python coding tasks, and choose CodeLlama 13B Python.
Step 3. Deployment & Result
Use an inference API to deploy an app:
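As a minimal sketch of that step, assuming a Hugging Face API token in the `HF_TOKEN` environment variable and the hosted `codellama/CodeLlama-13b-Python-hf` checkpoint (swap in whichever model you selected):

```python
# Call CodeLlama via the Hugging Face Inference API and print the generated code.
# Assumes HF_TOKEN is set in the environment and the model is served by the API.
import os
import requests

API_URL = "https://api-inference.huggingface.co/models/codellama/CodeLlama-13b-Python-hf"
headers = {"Authorization": f"Bearer {os.environ['HF_TOKEN']}"}

prompt = 'def fibonacci(n):\n    """Return the n-th Fibonacci number."""\n'
payload = {"inputs": prompt, "parameters": {"max_new_tokens": 128, "temperature": 0.2}}

response = requests.post(API_URL, headers=headers, json=payload, timeout=60)
response.raise_for_status()
print(response.json()[0]["generated_text"])
```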
Result:
Conclusion
Future of LLMs - Niche Domination or Universal Powerhouse?
This guideline enables us to choose an LLM that excels at our target tasks while aligning with our resource limitations and commercial requirements. By leveraging public resources, we can focus on our actual needs rather than a model's general popularity.
At the same time, the rapid advancement of AI technology points towards dominance by a smaller number of highly competitive models - either universally or within specific domains. To navigate this dynamic LLM landscape, we need to:
In the next article, we will explore LLM architecture (encoders and decoders) and deploy a chatbot.
Reference:
Hugging Face CodeLlama 13b Python (Model Card)