Overpromise or overfitting?

Being on the front line of many innovation conversations, I often run into an impasse with decision-makers who believe in our capabilities but ultimately choose a larger company for its greater headcount and perceived reliability.

Indeed, any large solution provider with thousands, if not hundreds of thousands, of employees will necessarily have a more diverse pool of experts to draw from. And a more established brand, in theory, should be better positioned to weather market volatility and swings in resource availability. Perhaps the crowning point, the frosting on top, is the promise of deep experience in solving the same kinds of problems the client wants to solve.

So, is there no advantage in engaging a smaller group of innovators? My epiphanies came from the opportunity to work with both large language models (LLMs) and small language models (SLMs). It is not always obvious that we can extrapolate what we learn from one discipline to another. Once in a while, though, patterns emerge across different modalities to reveal a more profound commonality. To that end, I want to explore deeper truths about what makes innovation successful through lessons I learned from data science.

For most of us, LLMs require no introduction. The popularity of ChatGPT, Claude, and others like them cannot be overstated. The prevailing trend suggests that more data leads to a better model.

Whenever a paradigm shift occurs, we tend to fall into the "if the only tool you have is a hammer, every problem looks like a nail" mode. More recently, we've encountered several scenarios where an LLM does not yield the optimal solution. While these anti-patterns are only now being observed in generative AI applications, I've been challenged by the same mentality in innovation management for more than 20 years.

More complexity doesn't mean better

If the number of parameters in a model is significantly larger than the number of training examples, the model can easily memorize the training data instead of learning general patterns. The result is a model that performs well in controlled environments but falters when faced with real-world diversity and unpredictability. This is called overfitting, and it is more frequently observed in LLMs than in SLMs.
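To make the analogy concrete, here is a minimal sketch of overfitting, assuming scikit-learn and NumPy are available; it illustrates the principle on a toy regression, not any production language model. A 15th-degree polynomial has more parameters than the ten training points and memorizes them almost perfectly, yet it fails on unseen data, while a far simpler cubic generalizes better.

```python
# Minimal overfitting sketch (assumes scikit-learn and NumPy).
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

rng = np.random.default_rng(0)
X_train = rng.uniform(0, 1, size=(10, 1))   # only 10 training examples
y_train = np.sin(2 * np.pi * X_train).ravel() + rng.normal(0, 0.1, 10)
X_test = rng.uniform(0, 1, size=(200, 1))   # unseen, "real-world" data
y_test = np.sin(2 * np.pi * X_test).ravel() + rng.normal(0, 0.1, 200)

for degree in (3, 15):                      # few parameters vs. too many
    model = make_pipeline(PolynomialFeatures(degree), LinearRegression())
    model.fit(X_train, y_train)
    train_mse = mean_squared_error(y_train, model.predict(X_train))
    test_mse = mean_squared_error(y_test, model.predict(X_test))
    print(f"degree={degree:2d}  train MSE={train_mse:.4f}  test MSE={test_mse:.4f}")
```

The high-degree fit scores nearly perfectly on the data it has seen and badly on the data it hasn't, which is exactly the "performs well in controlled environments, falters in the real world" pattern described above.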

When we view vendor engagement through a similar lens, it's easy to see that large vendors often come with multiple layers of management and technical protocols that may not be relevant to the project at hand. In our experience, vendors beholden to a narrow technology stack often require unnecessary workarounds to achieve simple objectives.

Further, complexity is often celebrated to justify the scope of work: instead of shedding light on the solution's true efficacy, it is worn like a badge of honor. Unfortunately, bloated requirements that are never validated can ultimately fail to capture users' true needs.

Dollars are not the only cost that can run high

Compared to an SLM, implementing an LLM in a solution may require many times the energy cost due to increased computational requirements. To put things into perspective, developing a model like GPT-4 from scratch involves an expenditure of tens of millions of dollars, whereas a small language model ranges from roughly $1.6 million to $2 million.
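As a rough illustration, the back-of-the-envelope math takes only a few lines of Python. The dollar figures below are assumptions drawn from the ranges above and vary widely in practice by model, hardware, and vendor.

```python
# Back-of-the-envelope cost ratio (figures are loose assumptions, not quotes).
llm_training_cost = 50_000_000  # "tens of millions" for a GPT-4-class model
slm_training_cost = 1_800_000   # midpoint of the ~$1.6M-$2M range for an SLM

ratio = llm_training_cost / slm_training_cost
print(f"Training an LLM from scratch costs roughly {ratio:.0f}x a small model.")
```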

It's important to note that LLMs can often offer superior performance and versatility, but the choice between models should be based on the task's specific requirements, the available resources, and the tradeoff between cost and performance.

What is the tradeoff of using a larger vendor on a specialized project? A larger solution provider tends to see small projects as a way to diversify its market, and the priority for the engagement is often to provide low-risk training that upskills its own staff. Of course, a project that is small to a large vendor is not necessarily small to the client. Without alignment on value generation, the dollar investment will not correlate with the value delivered. With innovations being developed at a breakneck pace, this type of misstep can carry an irreversible opportunity cost.

Diminished Interpretability

Smaller models require less time to train and allow for more iterations and adjustments to the training process. In contrast, because of their vast size, architecture, and intricate interactions within their layers, LLMs can create a "black box" effect that makes it challenging to understand how they make decisions.
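Interpretability is easy to demonstrate at the small end of the spectrum. The sketch below, assuming scikit-learn and its bundled breast-cancer dataset, trains a simple logistic regression and reads off the weight it assigns each input feature; there is no analogous way to enumerate the reasoning of a billion-parameter transformer.

```python
# Why small models are auditable: one inspectable coefficient per feature.
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

data = load_breast_cancer()
model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
model.fit(data.data, data.target)

# Rank features by the weight the model actually assigns them.
coefs = model.named_steps["logisticregression"].coef_[0]
ranked = sorted(zip(data.feature_names, coefs),
                key=lambda pair: abs(pair[1]), reverse=True)
for name, weight in ranked[:5]:
    print(f"{name:25s} {weight:+.2f}")
```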

Unsurprisingly, the same "black box" effect is often the death knell for enterprise projects. Every innovation requires iterations. This process is critical not only in understanding the true value proposition but also in educating and onboarding new users.

While large vendors have a greater pool of resources to draw from, they also need to follow standardized procedures and pre-established solutions to streamline operations. This standardization imposes a rigidity that can hinder the ability to tailor solutions to an organization's unique needs and drastically inhibit a project's ability to iterate.

Whether intentional or not, this inflexibility compounds the challenges of an enterprise environment that may already be slow to implement changes and respond to feedback because of its size and bureaucratic processes.

To avoid these pitfalls, organizations can adopt several mitigation strategies drawn directly from data science:

  1. Assess Specific Needs: Just as it's crucial to understand the data requirements for training an effective generative AI model, organizations should thoroughly assess their needs and project requirements before choosing a vendor. This ensures that the selected solution aligns closely with their goals.
  2. Pilot Projects and Prototyping: Cross-validation is a method that tests a model on different subsets of the data to ensure it performs well on untested data. Before committing to a large initiative, consider running pilot projects or developing prototypes that let organizations validate solutions without a significant upfront investment (see the sketch after this list).
  3. Hybrid Approaches: Ensemble learning combines multiple models to leverage their complementary strengths. Similarly, a hybrid approach that combines the institutional knowledge of internal resources with outside specialized experts can create a more robust and tailored solution (also sketched below).
  4. Regular Reviews and Adjustments: Continuously review the performance of the chosen vendor and the solution being developed. Be prepared to adjust based on feedback and changing needs, like fine-tuning a model to improve its generalization.
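Points 2 and 3 have direct, minimal counterparts in scikit-learn, sketched below on an illustrative dataset: cross-validation "pilots" a model on held-out folds before any full commitment, and an ensemble blends two different learners much as a hybrid team blends internal knowledge with outside specialists.

```python
# Points 2 and 3 in miniature (assumes scikit-learn; dataset is illustrative).
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)

# Point 2: pilot the solution on held-out subsets before committing.
pilot = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
print("pilot accuracy per fold: ", cross_val_score(pilot, X, y, cv=5))

# Point 3: a hybrid (ensemble) of two complementary learners.
hybrid = VotingClassifier(
    estimators=[
        ("linear", make_pipeline(StandardScaler(),
                                 LogisticRegression(max_iter=1000))),
        ("forest", RandomForestClassifier(n_estimators=100, random_state=0)),
    ],
    voting="soft",
)
print("hybrid accuracy per fold:", cross_val_score(hybrid, X, y, cv=5))
```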

The introduction of generative AI is a great equalizer. As solution providers, big or small, we find ourselves in a new frontier where expertise and usage patterns are still being rapidly invented. In machine learning, selecting a simpler, more specialized algorithm can sometimes outperform a complex one on a specific task.

The goal of this writing isn't to draw ire from my peers. There are clearly strong use cases for both LLMs and large vendors. However, I invite anyone interested in AI to explore options with expertise tailored to your particular project. After all, you may find a more efficient and effective solution in a trimmer fit.
