
The emerging LLM Value Chain for Enterprise use (part 1)

(What's already here and what has yet to emerge)

Part 2 of an N-part brain dump on LLMs; Part 1 here


TL;DR

Base Models: Classic capital-intensive structures emerging

Verticalised Models: Players emerging, still lots to play for

Domain Adaptation: Open playing field, science emerging


Generalised Base Model Creators


Model creation is very, very expensive, at least if you want to place well on the Hugging Face leaderboard. For example, the previous leading light of the open LLM world, Falcon 40B, took around 2,700 petaFLOP-days (roughly 75% of GPT-3's compute) to train. That ain't cheap!
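To make "expensive" a bit more concrete, here's a back-of-envelope sketch in Python that converts that petaFLOP-day figure into GPU-hours and a rough cloud bill. The GPU throughput, utilisation, and hourly price below are assumptions I've picked for illustration, not figures from the Falcon team.

```python
# Rough, illustrative cost estimate for ~2,700 petaFLOP-days of training.
# All constants below are assumptions for illustration, not published figures.

PETAFLOP_DAYS = 2700
FLOPS_PER_PETAFLOP_DAY = 1e15 * 86_400   # 1 PFLOP/s sustained for a day

A100_PEAK_FLOPS = 312e12                 # assumed bf16 peak for one A100
UTILISATION = 0.40                       # assumed real-world utilisation
PRICE_PER_GPU_HOUR = 2.00                # assumed cloud price in USD

total_flops = PETAFLOP_DAYS * FLOPS_PER_PETAFLOP_DAY
effective_flops_per_gpu = A100_PEAK_FLOPS * UTILISATION

gpu_seconds = total_flops / effective_flops_per_gpu
gpu_hours = gpu_seconds / 3600
cost_usd = gpu_hours * PRICE_PER_GPU_HOUR

print(f"~{gpu_hours:,.0f} GPU-hours, roughly ${cost_usd:,.0f} in compute alone")
# On these assumptions: ~500k GPU-hours and a seven-figure compute bill,
# before counting failed runs, data pipelines, evaluation, or people.
```

On those (deliberately generous) assumptions you still land north of a million dollars of raw compute for a single training run, which is exactly the kind of capital commitment that shapes the market structure discussed below.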

The 2 big inputs to LLM creation are training data and compute time. Access to both is what limits market entrants.

What we see in other value chains is that capital-intensive tasks drive a market structure with a small number of players able to commit the capital (in this case compute power rather than straight cash). It is therefore likely that the market will split between a few big generalised models offered essentially for free and a world of specialised/verticalised LLMs that can actually be monetised.


Today

Closed: OpenAI/Microsoft (GPT-3/4 etc), Google (Bard, Minerva etc), Meta (old LLaMA), StabilityAI, Anthropic (Claude), DeepMind

Open: Meta (LLaMA 2), EleutherAI (GPT-J/Neo etc), Technology Innovation Institute (Falcon), Hugging Face (BLOOM)

Closed

The cloud providers have an incentive to create and market their own flavours of LLM as a means of strengthening their ecosystems and increasing the forces of lock-in; frankly, they also have easy access to the scale of compute resources required. So expect AWS, GCP, Microsoft, etc. to continue to build and improve their models ad infinitum.

Their major adoption roadblock is trust and licensing. I know of so many firms who have access to a "safe, secure, no data share" Azure instance of OpenAI but still won't touch it with a 10-foot pole. I've heard whispers that maybe the actual licence wording on data being shared back to Microsoft isn't quite as tight as the PR makes it out to be. Either way, people aren't rushing to use it.

Regardless, expect the big cloud providers to pump out these models for a while.

Open

Meta has made an interesting play in moving from closed to open: LLaMA 1 was a proprietary model without a commercially usable licence, but Meta then put the much-improved LLaMA 2 into the market with a very permissive licence. A company the size of Meta of course has the compute resources to apply to this task; the more interesting question is why they opened it.

My best guess is that they want to own the ecosystem, having realised that monetising (generalised) LLMs is probably not going to happen on the scale that MS/OpenAI presumes.

The FOSS community has been fighting the good fight in trying to crowdsource both the data and the compute needed to keep truly open LLM options on the table, with non-profits like EleutherAI and TII churning out fairly large and sophisticated models that have kept good pace with the commercial offerings.

However, real-world experience using them shows that they lack some of the polish that (until LLaMA 2) only the proprietary models had. Because they had to use freely available corpuses (corpora?) of data, they lacked the "chat"-like qualities of ChatGPT, Claude, Bard and the like. Chat-like behaviour needs millions of examples of question-and-answer pairs, and banks of data like that don't exist freely on the internet (yet).
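For a sense of what those question-and-answer pairs look like in practice, here's a minimal sketch of instruction-tuning data in the JSON Lines style many open fine-tuning projects use. The field names and example content are illustrative assumptions; every model and toolkit defines its own schema.

```python
import json

# A minimal sketch of "chat" fine-tuning data: prompt/response pairs.
# Field names and contents are illustrative; schemas vary by project.
examples = [
    {
        "prompt": "Summarise the key risks mentioned in this 10-K excerpt: ...",
        "response": "The filing highlights three main risks: ...",
    },
    {
        "prompt": "Explain reinsurance treaties to a new graduate in two paragraphs.",
        "response": "A reinsurance treaty is an agreement in which ...",
    },
]

# Write as JSON Lines, a common interchange format for fine-tuning datasets.
with open("chat_pairs.jsonl", "w") as f:
    for ex in examples:
        f.write(json.dumps(ex) + "\n")

# Chat-quality behaviour typically needs hundreds of thousands to millions
# of pairs like these, plus human preference data for further tuning.
```

The hard part isn't the file format, it's sourcing millions of high-quality pairs, which is exactly the data the commercial labs have paid humans to produce and the open projects mostly haven't had.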

The (likely) Future

Closed: A number of big cloud proprietary models (maybe some of them go the LLaMA 2 route and open up to drive adoption and ecosystem)

If these closed models don't move to an open licence, their use in the enterprise will be forever limited: no one is building business-critical or money-making processes on models that someone else owns, operates, and learns from. Firms will only build LLMs into their businesses where the data stays within their own cloud estate.

When people describe a deal as "strategic" it means it doesn't make sense on the numbers alone, and I can't see a way the Microsoft/OpenAI deal makes sense in terms of capturing enterprise spend on LLMs.


Open: LLaMA 2 (and its successors?) plus one FOSS solution where a community hub like Hugging Face manages to aggregate the demand and co-ordinate the creation.


In-between: StabilityAI's value proposition is taking open models and making them more enterprise-friendly, charging a small rent for that service in exchange. I can see this being an attractive middle ground, mixing the benefits of taking an open model and keeping the data in house with the benefits of enterprise stability. Taking a small slice of enterprise LLM spend everywhere is a smart move.


Verticalised / Specialist LLM Creators

This is probably the type of LLM that actually has the capacity to be properly monetised, as compared to the big generalised chat-like LLMs.

Verticalised LLMs face similar challenges to the above in terms of compute, but the real edge is in access to the type of data that makes the LLM a specialist.

If for years we've been told data is the new oil, this is one of the instances where market leaders or market aggregators have the best shot at winning the race. However, the race isn't to build some kind of AGI wizard-like omnibrain, but to bring the human-computer interaction layer that LLMs provide to much more specialised subfields.

I'm going to make a distinction here between domain adaptation and truly specialised models, because these are fundamentally different tasks.


Examples Today: Bloomberg's BloombergGPT, Bud's Jas (a finance-focused LLM), Google's Minerva (focused on science and maths), Harvey's lawyer AI, etc.

If the big open models are going to destroy whole layers of knowledge work and remove the need for some of the jobs and services of today, it's the specialist models that will have a more surgical impact on high-tier knowledge work.

When I talk about personal leverage and organisations moving from pyramids to obelisks, these are the types of models that make that shift most likely, most impactful, and most destructive.

If you think of a typical business team, with its loosely triangular structure, it works that way so that the person with the experience/knowledge/judgement makes the decisions and sets the direction, and then others in the team carry that out through their actions.

In a world where a Bloomberg LLM can turn the request "What were the major headlines from X's earnings call today and how does their outlook compare to their peers?" into a fairly decent report in a few seconds, why the hell would you hire anyone other than a few senior, experienced people who can ask the right questions?

If you have Jas, why the hell do you need to pay for an IFA?

If you have Harvey, do you need a bunch of paralegals or junior associates?

The same pattern repeats.

Because LLMs are multiplicative leverage, the more you know, the more you can get out of them, and the less valuable junior or less experienced people become.

If the previous career defence against the rise of outsourcing, automation, and AI was to up-skill, specialise, and get some deep technical knowledge, I think that has a limited lifespan going forward.

The (likely) Future

Market Leaders, Market Aggregators, or Market Operators who have access to sufficient data to build and populate LLMs for specific use cases will win and be able to monetise their very specific role in a way the generalists will not.

The big barriers to adoption will be around oversight, auditability, regulation and responsible AI.


I was going to keep going and talk about domain adaptation, and move up the value chain from the models, but I'm supposed to be on holiday and this has already taken up too much of my morning. Jerad Leigh, this is why it's N parts!




Comment from Jerad Leigh (Supercede: The Reinsurance Platform):

This is becoming a compelling saga! Key pieces that jump out to me:

1. Importance of trust -- Microsoft has a huge lead here and has done well in its positioning to sell to enterprises in whatever guise that might take. Meta might still have a shot at winning the SME/SMB group, but the erosion of trust in the Meta brand over the past five years will be very damaging in the LLM race.

2. If we move from pyramids to obelisks (which I agree is likely to happen and an analogy I'll shamelessly steal), what is the impact on the future of the workforce? Will we create a world with limited potential for younger generations to learn 'the right questions to ask', ultimately resulting in a destructive knowledge gap when the experts retire?
