GPT-3: a shot fired in the ML / AI arms race…
In an otherwise fairly uneventful year so far, I'm guessing most of you would have heard or read about the significant development announced by the OpenAI team in late May and made available through an API later.
By any measure, GPT-3 is a significant milestone. However, every ML capability or model advancement announced by big tech brings forth a chain of thoughts around the increasing (and rapidly accelerating) gap between FAAMG / Big tech and the rest of the Fortune 500 enterprises in their ability to build & deploy ML at scale.
A few examples where an average consumer (and Fortune 500 exec) experiences advanced ML applications 100's of times a day:
- Face / Object recognition (mobile)
- Sentence Completion (search / email)
- Conversation (voice assistants)
- Shopping / what to watch recommendations (ecommerce / social media / entertainment)
- …
Add to that the high-octane marketing around the potentially wonderful benefits of ML (or AI, depending on who you're talking to) & it's not difficult to imagine how this heightens the expectations business leaders have from ML / AI. Most understand maturity curves and capability evolution, but their expectations around bespoke, custom-built ML solutions to transform their functions or companies are still rising faster than Tesla stock.
A standard Fortune 500 enterprise ML use case will typically go through all the considerations of cost-benefit analysis, labeled data availability, compute infrastructure, data scientists' time, production systems deployment, change management, etc. If the stars are aligned and the model performs well in production, voila, we've solved one use case!
Contrast that with what is now arguably the most powerful language model in the universe, from OpenAI. The GPT-3 NLP model has a measly 175Bn parameters! It can perform (a.k.a. score on) a variety of related tasks (as in multiple use cases, many of which it was never explicitly trained for, e.g., answering SAT-style questions, generating tweets, translating languages, converting simple English sentences to legalese, …). The authors note that in the few-shot setting, on many of these tasks, the GPT-3 model even beats a fine-tuned model (i.e., one which has been custom-built for a specific use case / family of use cases and fed a lot of training data). The performance in the zero-shot or one-shot setting is also decent, if not spectacular (when compared to a human). It can even (gulp!) perform creative functions (across languages) like writing poetry, as well as routine tasks like summarizing articles, writing new tweets or drafting thoughtful business memos.
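To make "few-shot" concrete: instead of training anything, you simply show the model a couple of worked examples in the prompt and let it complete the next one. Here's a minimal sketch against the 2020-era `openai` Python client; the prompt, examples, and parameter values are my illustrations, not taken from the paper.

```python
# Minimal sketch of few-shot prompting against GPT-3 via the
# openai Python client (API surface as of 2020; illustrative only).
import openai

openai.api_key = "YOUR_API_KEY"  # placeholder

# Two worked examples ("shots") followed by the task we actually want
# the model to perform -- no fine-tuning or training data involved.
prompt = (
    "Plain English: We can end this agreement whenever we want.\n"
    "Legalese: Either party may terminate this agreement at any time, "
    "for any reason, upon written notice.\n\n"
    "Plain English: You must keep this information secret.\n"
    "Legalese: The receiving party shall hold all such information in "
    "strict confidence.\n\n"
    "Plain English: If you break something, you pay for it.\n"
    "Legalese:"
)

response = openai.Completion.create(
    engine="davinci",   # the original GPT-3 base model
    prompt=prompt,
    max_tokens=60,
    temperature=0.3,
    stop="\n",          # stop at the end of the generated line
)
print(response.choices[0].text.strip())
```

The same pattern (swap the examples, keep the model) covers tweet generation, translation, summarization and so on, which is exactly why one pre-trained model can stand in for many bespoke enterprise use cases.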
The paper is a fun (even for non-PhDs) & somewhat scary read.
Sam Altman (ex-YC President & co-founder of OpenAI) has predicted that NLP will be one of the strongest areas of AI R&D in the next decade. So what sorcery GPT-4 will bring is beyond imagination.
The breadth and depth of applications being handled by enterprise ML programs are not even comparable to Big Tech's.
It's not hard to imagine the reasons:
- Culture (Digital first vs looking to transform digitally)
- Talent
- Executive leadership
- R&D investment
Let's look at #4 in one more level of detail (very coarse assumptions here, but precision is not the point; see the arithmetic sketch after this list):
- Very roughly speaking, each of the FAAMG companies invests anywhere from $5 to $15Bn annually in R&D (article). Broadly, adjusting for growth, etc., it adds up to $60Bn annually. Say half of it goes to ML ~ $30Bn
- All VC investment in US-based companies was roughly $120Bn in 2019 (article). Say 25% of it is spent on ML R&D (early stage, Silicon Valley skew, etc.) ~ $30Bn
- Average investment in ML by a Fortune 500 company is, say, $50Mn per year (again, we can dive as deep as we want on this topic, but taking very high-level assumptions for discussion's sake) ~ $25Bn across all 500
- Exclude all corporate venture, M&A investments, etc.
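As promised, here's the back-of-envelope arithmetic behind the list above, as a runnable sketch; every figure is one of the rough assumptions already stated, not a researched number.

```python
# Back-of-envelope arithmetic behind the list above (all figures are
# the stated rough assumptions, not researched numbers).
faamg_annual_rd = 60e9    # ~$60Bn combined FAAMG R&D per year
faamg_ml_share = 0.5      # assume half goes to ML

vc_us_2019 = 120e9        # ~$120Bn US VC investment in 2019
vc_ml_share = 0.25        # assume a quarter funds ML R&D

f500_companies = 500
f500_ml_per_co = 50e6     # assume ~$50Mn ML spend per company

print(f"FAAMG ML spend:       ${faamg_annual_rd * faamg_ml_share / 1e9:.0f}Bn")
print(f"VC-funded ML spend:   ${vc_us_2019 * vc_ml_share / 1e9:.0f}Bn")
print(f"Fortune 500 ML spend: ${f500_companies * f500_ml_per_co / 1e9:.0f}Bn")
# -> $30Bn, $30Bn, $25Bn: all three buckets land in the same ballpark.
```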
So, the total annual spend on ML by a handful of companies is of the same order of magnitude as the aggregate annual spend by all Fortune 500 cos, or by all VC-funded cos in a year!
This trend has persisted for the last few years (and accelerated in '20 if anything) and, most believe, is a fundamental driver of the increasing % of economic value being captured by big tech.
While Big tech deserves the credit for most advancements in this domain, the widening canyon creates a risk that many enterprise orgs will slip into a valley of disappointment. That usually results in data science teams being restructured into newer operating models, talent disengagement, re-prioritization of the data science pipeline and, ultimately, loss of enthusiasm in the leadership (which is the biggest blow). That steady state persists until internal changes (e.g., new leadership) or external stimuli (e.g., an acquisition, a consulting firm's study or a competitor's action) disrupt the thinking and ML programs are re-energized at scale; then the cycle repeats itself.
Another trend which I see accelerating is Big tech offering MLaaS (e.g., Amazon Comprehend Medical in the healthcare industry). They are acquiring the last remaining key ingredients of training ML successfully (i.e., industry data sans privacy issues, and domain knowledge). Can a traditional enterprise machine learning model beat an off-the-shelf model trained on 100x more data and refreshed 100 times more frequently? I'd guess very few enterprises (the largest ones in each industry) will have models which remain useful (or economical to build and maintain) once Big Tech verticalizes its ML horsepower.
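To make the MLaaS point concrete, here's a minimal sketch of consuming one such off-the-shelf vertical model via AWS's boto3 client; it assumes AWS credentials and region are already configured, and the clinical note is made up for illustration.

```python
# Minimal sketch of consuming an off-the-shelf MLaaS model
# (Amazon Comprehend Medical via boto3). Assumes AWS credentials
# and region are configured; the sample note is illustrative.
import boto3

client = boto3.client("comprehendmedical")

note = "Patient reports chest pain; prescribed 81mg aspirin daily."

# Extract medical entities (conditions, medications, dosages, ...)
# from free text -- no training data, labeling, or model
# maintenance on the enterprise's side.
result = client.detect_entities_v2(Text=note)

for entity in result["Entities"]:
    print(entity["Category"], "->", entity["Text"])
```

Ten lines of integration code versus a multi-quarter bespoke model build is precisely the economics question posed above.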
This post is an acknowledgement (not an alarm, or kudos for anyone) of the gap, which continues to widen at an accelerating pace. What implications this will have on a variety of aspects, e.g., the path ML practitioners should take in their careers, is itself a very broad & interesting topic for a separate discussion.