The Emerging LLM Value Chain for Enterprise use (part 2)

This is a continuation from part 1.


TL;DR

  • Domain adaptation is the piece that is already emerging; it is generic enough that it could be provided by 3rd parties, and will be dominated by a few players, not unlike the LLM providers themselves
  • Use case fine-tuning has yet to properly emerge, but will likely again be provided by specialist 3rd parties, with each addressable domain producing its own ecosystem of use-case fine-tuned model providers
  • OR everything above domain adaptation will be a service offering that enterprises use to instantiate their own bespoke versions of open general models for their needs, based on their own data and documents
  • Ultimately, for the power of LLMs to be unleashed into the enterprise, there will have to be a way for them to rapidly and cheaply become useful reflections of the consuming enterprise, but the science to get there is only just emerging


Hand-wavy, sure, but this is what I see

I'm going to try and speed-run this a little bit, for a couple of reasons:

  1. Whilst all the big news and hype surrounds that bottom layer, generic large generalised models (ChatGPT, Falcon, LLaMA 2, etc.) aren't super useful out of the box for most enterprise use cases
  2. As we go up the layers, the specificity goes up, but the capital costs go down
  3. Whilst domain adaptation has some strong science emerging day by day (Memorizing Transformers, contrastive learning techniques like CLIP or SimCLR, and, most excitingly, the Focused Transformer), this is where today's research boundary sits
  4. Therefore the layers of the cake above this are more speculative, but driven by an understanding of the real-world challenges of applying LLMs in an enterprise context to solve actual business problems

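To give a flavour of the techniques named in point 3, here is a toy NumPy sketch of the contrastive (InfoNCE-style) objective that CLIP and SimCLR build on: embeddings of matched pairs are pulled together, while everything else in the batch acts as a negative. This is an illustrative sketch under simplifying assumptions, not the actual training code of any of those models.

```python
import numpy as np

def info_nce_loss(a, b, temperature=0.07):
    """Symmetric InfoNCE loss over two batches of paired embeddings.

    a, b: (N, D) arrays where a[i] and b[i] form a positive pair
    (e.g. an image and its caption in CLIP). Every other pairing
    in the batch serves as a negative.
    """
    # L2-normalise so the dot product becomes cosine similarity
    a = a / np.linalg.norm(a, axis=1, keepdims=True)
    b = b / np.linalg.norm(b, axis=1, keepdims=True)
    logits = a @ b.T / temperature      # (N, N) similarity matrix
    labels = np.arange(len(a))          # positives sit on the diagonal

    def xent(l):
        # numerically stable cross-entropy with diagonal targets
        l = l - l.max(axis=1, keepdims=True)
        log_probs = l - np.log(np.exp(l).sum(axis=1, keepdims=True))
        return -log_probs[labels, labels].mean()

    # average the a->b and b->a directions
    return (xent(logits) + xent(logits.T)) / 2

rng = np.random.default_rng(0)
anchors = rng.normal(size=(8, 32))
aligned = anchors + 0.01 * rng.normal(size=(8, 32))  # near-identical pairs
unrelated = rng.normal(size=(8, 32))                 # random pairs
loss_aligned = info_nce_loss(anchors, aligned)       # near zero
loss_unrelated = info_nce_loss(anchors, unrelated)   # much higher
```

The point for domain adaptation is that the same machinery can pull, say, a domain document and its summary (or a policy wording and a claim) into a shared embedding space without labelled Q&A data.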

Domain Adaptation

Something was built with some inputs X, but you want to use it for purpose Y. How do you do it?

This will be the next big leap in LLM use.

The common practice of fine-tuning the model is not only resource-intensive and complex to manage, but it also does not always clearly indicate how to incorporate new knowledge. For example, fine-tuning on a text such as “Alice in Wonderland” does not equip the model to answer questions about the story itself, but rather it trains the model to predict the next token or complete masked sentences.
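That point is easy to see if you look at what a causal LM's training data actually is. A toy sketch (a whitespace split standing in for a real tokeniser): every example is "given this prefix, predict the next token" — nothing in the objective mentions questions or answers.

```python
# Toy illustration of what causal-LM fine-tuning optimises.
text = "Alice was beginning to get very tired of sitting by her sister"
tokens = text.split()  # stand-in for a real tokeniser

def next_token_examples(tokens, context_len=4):
    """Yield (context, target) pairs exactly as a causal LM sees them."""
    for i in range(context_len, len(tokens)):
        yield tokens[i - context_len:i], tokens[i]

pairs = list(next_token_examples(tokens))
# first pair: (['Alice', 'was', 'beginning', 'to'], 'get')
```

Fine-tuning on the novel just makes the model better at continuing its prose; nothing in that loop teaches it to answer "who did Alice sit beside?".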

So the somewhat hackneyed aphorism about LLMs is that they are like a really smart graduate. They seem to know a lot about a whole bunch of things, and their depth on any particular topic is better than the average Joe's, but not amazing.

There is a reason that new grads are given a few years in junior roles: to start to learn the domain expertise needed to apply the kind of general aptitude their degree is meant to signal. Actual usefulness is often a combination of general aptitude and enough domain specificity to appreciate, understand, and then solve problems specific to their employer.

So if the big, million/billion/trillion-parameter LLMs have ingested the whole internet and are one of these smart grads, how do you give the LLM a master's in Finance? Or get them a CII diploma in the London Market?

This is the purpose of domain adaptation, and it's where an awful lot of the cutting-edge research is focused right now.

The next piece of the value chain to emerge, therefore, will be the domain adaptation element.

There are probably a couple of different forms that this will take:

  1. Classic value chain colonisation, where the LLM makers start churning out 'flavours' that have been "fine-tuned" or "pre-trained" for specific industry domains
  2. Domain adaptation as a rent-seeking layer, à la Stability AI, where someone else puts the work in to make "GPT-X for Y" and the adaptation is a service you pay a small clip for, in exchange for not having to do it yourself
  3. Domain adaptation as a service, where models are adapted by a service provider who automates this step for whatever combination of model and domain you need


I think option 1 is dead for the same reason that commercial models like OpenAI's are dead. However, they will present a challenge to entrepreneurs in the other layers, as they will likely be bundled, free, by the cloud providers, which means any competitive product would have to compete with 'free' and 'already covered by an MSA', which are bloody tough points of competition.


I think option 2 will be the major transitory state in a world where everything is new, complexity is high, and the rate of change is rapid. There is clearly value in the service, someone has to do it, and the market at this layer is probably sufficient for a couple of players to emerge to serve the most easily addressable domains.


I think option 3 is the longer-term answer, as we become more familiar with the strengths and weaknesses of different generative AI models and approaches, as the science of domain adaptation starts to gravitate towards a couple of key concepts, and spinning up an LLM for a given task becomes as easy as spinning up an e2-standard or an m3.large.

You will pick the base model, the domain, and the resulting domain adapted model will be spat out and ready for inferencing.
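What might that feel like as an API? A purely hypothetical sketch below — every name here is invented for illustration and doesn't correspond to any real product or service:

```python
# Hypothetical "domain adaptation as a service" interface.
# All names (AdaptationJob, model/domain identifiers) are made up.
from dataclasses import dataclass

@dataclass
class AdaptationJob:
    base_model: str      # an open generalised model you picked
    domain: str          # the target domain corpus/recipe
    status: str = "queued"

    def run(self) -> str:
        """Pretend to adapt the model and return an endpoint name."""
        self.status = "complete"
        return f"{self.base_model}-{self.domain}-adapted"

job = AdaptationJob(base_model="llama2-70b", domain="london-market-insurance")
endpoint = job.run()
# endpoint -> "llama2-70b-london-market-insurance-adapted"
```

The point is the shape of the interaction, not the implementation: two inputs (base model, domain), one output (a domain-adapted model ready for inferencing), as routine as provisioning a VM.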


Use case fine-tuning


...will have to wait, I need to go to bed.


#llm #generativeai #ai

