There's been a lot of discussion about whether DeepSeek's $5.5M number is correct. Their math is very simple and checks out: the DeepSeek-V3 technical report counts roughly 2.788M H800 GPU-hours at an assumed rental rate of $2 per GPU-hour, which comes to about $5.576M (the arithmetic is sketched below). Many have pointed out that this number doesn't include the cost of purchasing GPUs, salaries, R&D, etc. But this was all very explicitly stated in the paper -- the $5.5M number covered the cost of compute for the final training run alone.

However, that's still a huge deal! Many have been arguing that frontier models will soon cost hundreds of millions to billions of dollars in training costs alone. DeepSeek's ability to do it far more efficiently demonstrates that this is patently false. All of that R&D will be commoditized into easy-to-use solutions for training models (this is our explicit goal at DatologyAI -- make it so that you don't need to be an expert in order to train a model on your own data with the best possible data curation). This means that in a few years, an enterprise that wants to develop its own incredibly powerful, specialized small model for whatever use case its business requires will be able to do so end-to-end for, at most, a few million dollars in marginal cost.

Jevons Paradox has become surprisingly popular over the last week, and that's because it applies perfectly here. If training costs hundreds of millions to billions, very few entrants can work on it. But in a world where training costs a few hundred thousand to a few million dollars, the landscape changes massively. This will be especially important as inference costs become the main driver of cost in model development and deployment. In the enterprise, the winners will be small, specialized models that don't have the general ability of frontier models, but which can perform the single task the business needs with five-nines reliability, and which can be deployed for a fraction of the cost because they have far fewer parameters than general models (a back-of-envelope comparison follows below).
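For the curious, here is the training-cost arithmetic reproduced in a few lines of Python. The GPU-hour breakdown and the $2/GPU-hour rental rate are taken directly from the DeepSeek-V3 technical report; the rate is the paper's own assumption, not a market quote.

```python
# Reproducing DeepSeek-V3's stated training-cost arithmetic.
# GPU-hour figures and the $2/hour rental rate come from the
# DeepSeek-V3 technical report; they cover the final training
# run only (no hardware purchases, salaries, or prior R&D).

H800_RENTAL_RATE = 2.00  # USD per GPU-hour, as assumed in the paper

gpu_hours = {
    "pre-training": 2_664_000,
    "context extension": 119_000,
    "post-training": 5_000,
}

total_hours = sum(gpu_hours.values())        # 2,788,000 GPU-hours
total_cost = total_hours * H800_RENTAL_RATE  # $5,576,000

for stage, hours in gpu_hours.items():
    print(f"{stage:>18}: {hours:>10,} GPU-hours")
print(f"{'total':>18}: {total_hours:>10,} GPU-hours -> ${total_cost:,.0f}")
```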
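To make the inference-cost point concrete, here's a rough sketch using the standard approximation that a dense transformer's forward pass costs about 2 FLOPs per parameter per generated token. The parameter counts are hypothetical, chosen purely to illustrate the scale of the gap; real deployments also depend on hardware utilization, batching, and architecture (e.g., MoE models activate only a fraction of their parameters).

```python
# Back-of-envelope inference comparison: a dense transformer's forward
# pass costs roughly 2 * N FLOPs per generated token (N = parameter
# count), so per-token compute scales roughly linearly with model size.
# Parameter counts below are hypothetical, for illustration only.

FLOPS_PER_PARAM_PER_TOKEN = 2  # standard dense-transformer approximation

models = {
    "small specialized model": 3e9,  # hypothetical 3B parameters
    "large general model": 600e9,    # hypothetical 600B parameters
}

flops = {name: n * FLOPS_PER_PARAM_PER_TOKEN for name, n in models.items()}
ratio = flops["large general model"] / flops["small specialized model"]

for name, f in flops.items():
    print(f"{name}: ~{f:.1e} FLOPs per token")
print(f"rough per-token compute ratio: {ratio:.0f}x")
```

Under these assumptions the small model is about 200x cheaper per token, which is the whole economic argument for task-specific models once inference dominates total cost.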