Do Trillions Of Parameters Help In LLM Effectiveness?
Venkat Ramakrishnan
Chief Quality Officer | Software Testing Technologist | Keynote Speaker | Corporate Storyteller
"The more, the merrier" - A great saying to reflect on while organizing a party. Does the same apply for the number of parameters in a large language model (LLM) in increasing its effectiveness?
The Number Of Parameters Game
The more parameters there are, the more connections can be made between them, the goal being more meaningful associations that lead to the right answer. When I see LLM releases that market accuracy based on the number of parameters, I tend to become skeptical. 'Our LLM has a trillion parameters!' a release note will say, without mentioning whether that trillion actually makes the LLM's answers more accurate.
These parameters operate on data. If the underlying data is inaccurate, the output will be wrong irrespective of the number of parameters. It all depends on what we train the neural network with. If we train the LLM on, say, the Internet, chances are that the LLM's answers will be not only incorrect but also inconsistent (meaning if I ask the same question tomorrow, I will get a different answer!).
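To make the inconsistency point concrete, here is a minimal sketch of a repeat-the-question check. The `ask_llm` function is a hypothetical stand-in for whatever client or model call you actually use; the stub below merely simulates an inconsistent model:

```python
import random

def ask_llm(prompt: str) -> str:
    # Hypothetical stand-in for a real LLM client call; this stub just
    # simulates a model that does not always return the same answer.
    return random.choice(["42 Nm", "40 Nm", "42 Nm"])

def consistency_check(prompt: str, runs: int = 5) -> bool:
    # Ask the same question several times; pass only if every answer matches.
    answers = {ask_llm(prompt).strip().lower() for _ in range(runs)}
    return len(answers) == 1

print(consistency_check("What is the torque spec for the engine mount bolt?"))
```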
Whereas if I train the LLM on a focused domain, say car manufacturing, where I have control over what data I feed it, there is a much higher possibility of getting the right answer. Here, increasing the number of parameters helps, because the associations between the parameters develop better as we add more of them.
But only to a certain extent! There is a threshold above which increasing the number of parameters does not lead to more accurate results, irrespective of how correct your data is! Beyond that point, more parameters only lead to overfitting: your model will not work on future data that is not already represented in the training data!
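A toy illustration of that threshold, using polynomial degree as a stand-in for parameter count (the data, noise level, and degrees are made up purely for illustration):

```python
# Fit polynomials of increasing degree to noisy samples of a sine curve and
# watch test error climb past a point even as training error keeps falling.
import numpy as np

rng = np.random.default_rng(0)
x_train = np.linspace(0, 1, 20)
y_train = np.sin(2 * np.pi * x_train) + rng.normal(0, 0.2, x_train.size)
x_test = np.linspace(0, 1, 100)
y_test = np.sin(2 * np.pi * x_test)

for degree in (1, 3, 9, 15):                        # "more parameters"
    coeffs = np.polyfit(x_train, y_train, degree)
    train_mse = np.mean((np.polyval(coeffs, x_train) - y_train) ** 2)
    test_mse = np.mean((np.polyval(coeffs, x_test) - y_test) ** 2)
    print(f"degree={degree:2d}  train MSE={train_mse:.4f}  test MSE={test_mse:.4f}")

# Typically, training MSE keeps shrinking while test MSE bottoms out around
# degree 3 and then climbs again: past the threshold, parameters stop helping.
```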
Software Testing and Quality Angle
What does this mean for a Software Testing or Quality person testing an LLM trained on curated data (not the entire Internet)?
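One sketch of where such a tester might start: maintain a curated question/answer "golden set" for the domain and measure how often the LLM gets it right. The questions, expected answers, and the `ask_llm` call below are hypothetical placeholders:

```python
# Hypothetical golden set: curated domain questions with known answers.
GOLDEN_SET = [
    ("What is the standard bolt torque for the engine mount?", "42 nm"),
    ("Which steel grade is used for the chassis rails?", "s355"),
]

def accuracy_on_golden_set(ask_llm) -> float:
    """Fraction of curated questions whose expected answer appears in the
    model's reply (a deliberately simple matching rule for this sketch)."""
    hits = sum(
        expected in ask_llm(question).lower()
        for question, expected in GOLDEN_SET
    )
    return hits / len(GOLDEN_SET)

# Usage: accuracy_on_golden_set(ask_llm), reusing an ask_llm like the earlier stub.
```

Re-running the same golden set across model versions, parameter counts, and sampling settings gives the tester a stable yardstick instead of ad-hoc spot checks.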
Conclusion
Blindly increasing the number of parameters does not lead to LLM effectiveness and quality. We need to be conscious of data quality, the task we are focused on, performance criteria, and cost effectiveness. An optimal balance is achieved by thoroughly testing accuracy against the number of parameters.
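As a rough sketch of what that balance could look like in practice: pick the smallest model whose measured accuracy sits within a tolerance of the best. The accuracy figures below are made up for illustration, not real benchmarks:

```python
# Hypothetical measurements: parameter count -> accuracy on a held-out
# domain test set.
RESULTS = {1e9: 0.810, 7e9: 0.905, 70e9: 0.910, 1e12: 0.912}

def pick_model(results: dict, tolerance: float = 0.01) -> float:
    # Smallest model whose accuracy is within `tolerance` of the best score.
    best = max(results.values())
    return min(size for size, acc in results.items() if acc >= best - tolerance)

print(f"Chosen size: {pick_model(RESULTS):.0e} parameters")
# With these made-up numbers the 7B model wins: the trillion-parameter model
# buys under one extra point of accuracy at roughly 140x the size.
```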