Small but Mighty AI
77% of enterprise AI usage relies on small models, those with 13B parameters or fewer.
Databricks published this finding in its annual State of Data + AI report. Among other interesting findings, the survey indicated that large models, those with 100 billion parameters or more, now represent about 15% of implementations.
In August, we asked enterprise buyers, "What Has Your GPU Done for You Today?" They expressed concern about the ROI of some of the larger models, particularly in production applications.
Pricing from a popular inference provider shows prices increasing geometrically as a function of a model's parameter count.¹
But there are other reasons aside from cost to use smaller models.
First, their performance has improved markedly, with some of the smaller models nearing their big brothers’ success. The delta in cost means smaller models can be run several times to verify an answer, like an AI Mechanical Turk.
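One way to picture running a small model several times to verify is a simple majority vote over repeated answers. The sketch below is illustrative only: `ask_model` is a hypothetical stand-in for whatever inference call you use, and the canned answers in the demo are made up, not real model output.

```python
from collections import Counter
from typing import Callable, Tuple


def majority_vote(
    ask_model: Callable[[str], str],  # hypothetical: any function mapping a prompt to an answer
    prompt: str,
    runs: int = 5,
) -> Tuple[str, float]:
    """Ask the same question `runs` times and return the most common
    answer along with the share of runs that agreed with it."""
    if runs < 1:
        raise ValueError("runs must be at least 1")
    # Normalize lightly so "Paris" and "paris " count as the same answer.
    answers = [ask_model(prompt).strip().lower() for _ in range(runs)]
    winner, count = Counter(answers).most_common(1)[0]
    return winner, count / runs


if __name__ == "__main__":
    # Stub model with canned answers, standing in for a real small model.
    canned = iter(["Paris", "paris ", "Lyon", "Paris", "Paris"])
    answer, agreement = majority_vote(lambda _prompt: next(canned), "Capital of France?")
    print(answer, agreement)
```

A low agreement share is a cheap signal that the question may deserve a second look or a larger model.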
Second, the latencies of smaller models are half those of the medium-sized models & 70% less than those of the mega models.
Higher latency makes for an inferior user experience. Users don’t like to wait.
Smaller models represent a significant innovation for enterprises, which can take advantage of similar performance at two orders of magnitude less expense & half the latency.
No wonder builders view them as small but mighty.
Note: I’ve abstracted away the additional dimension of mixture of experts models to make the point clearer.
There are different ways of measuring latency, whether it’s time to first token or inter-token latency.
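To make the two latency measures concrete, here is a minimal sketch that computes both from the arrival times of streamed tokens. The function name, the dictionary keys, and the sample timestamps are assumptions for illustration; real providers may define or report these metrics differently.

```python
from typing import Dict, Sequence


def latency_metrics(request_sent: float, token_times: Sequence[float]) -> Dict[str, float]:
    """Compute latency measures from timestamps in seconds.

    request_sent: when the request was issued.
    token_times:  arrival time of each streamed token, in order.
    """
    if not token_times:
        raise ValueError("need at least one token timestamp")
    # Time to first token: how long the user stares at a blank screen.
    ttft = token_times[0] - request_sent
    # Inter-token latency: the gap between consecutive tokens.
    gaps = [later - earlier for earlier, later in zip(token_times, token_times[1:])]
    mean_gap = sum(gaps) / len(gaps) if gaps else 0.0
    return {
        "time_to_first_token": ttft,
        "mean_inter_token_latency": mean_gap,
        "total_latency": token_times[-1] - request_sent,
    }


if __name__ == "__main__":
    # Made-up timestamps: request at t=0, four tokens arriving afterwards.
    print(latency_metrics(0.0, [0.5, 0.6, 0.7, 0.9]))
```

Time to first token matters most for perceived responsiveness, while inter-token latency governs how quickly a long answer finishes streaming.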
Fractional CFO and advisor for early stage companies, real estate developers, and investors. Former investment banker - M&A/Financing. Home Builder. Passionate about housing development. Silicon Valley native. Mexican.
1 week ago: Hey Tomasz - Big fan of your work. A small correction. The Databricks report stated: "across both Llama and Mistral users, 77% choose models that are 13B parameters or smaller." So this included Llama 7B/13B. Your newsletter stated: "77% of enterprise AI usage are using models that are small models, less than 13b parameters." This would imply the 77% are not using 13b, which they are.
AI Consulting Leader @ IBM | Startup Advisor | Dad
1 week ago: Interesting data to substantiate the trend of identifying the right-sized model for the job. My one piece of feedback is that most AI architectural design patterns I am seeing COMBINE specialized small language models with large language models. So whereas this article indicates a decision to use a small vs. a large model, it's often a question of the proportion of inference, not either/or.
Keeping the ducks in a row
1 week ago: Interesting. Compute costs are a challenge for LLMs - and a big obstacle for profitability.