#31: It's Raining LLMs... and soon LVMs too!
Deepak Seth
Actionable and Objective Insights - Data, Analytics and Artificial Intelligence
Generative AI: Back to Basics
Brace yourselves: the AI overlords aren't just coming, they're already here writing our headlines! Just stumbled upon this fascinating read about generative AI, the tech behind ChatGPT. It's like discovering your friendly neighborhood robot has secretly been a superhero all this time. From predicting the next word in your email (and occasionally embarrassing us) to designing the next big thing in protein structures (no, not a new protein bar flavor), AI is on a roll! But hey, let's not forget the potential oops moments: biases and ethical dilemmas. Are we ready for this AI-powered rollercoaster?
This recent article in the MIT Technology Review revisits the concept of "generative AI," a term that has gained prominence in recent years, particularly with the advent of technologies like OpenAI's ChatGPT. Generative AI differs from traditional machine learning models, which are designed to make predictions based on data. Instead, generative AI focuses on creating new data.
Generative AI systems learn to produce new objects that resemble the data they were trained on. Early examples include Markov chains, which can handle tasks like next-word prediction but fall short of generating plausible long-form text. Modern generative AI models are far more complex and capable, thanks to advances in computational power and algorithms.
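To make that contrast concrete, here is a minimal sketch (mine, not from the article) of a bigram Markov-chain next-word predictor; the toy corpus is invented for illustration:

```python
import random
from collections import defaultdict, Counter

# Toy corpus, invented for illustration.
corpus = (
    "generative ai models learn to produce data similar to the data "
    "they are trained on and generative ai models can predict the next word"
).split()

# A bigram Markov chain: the next word depends only on the current word.
transitions = defaultdict(Counter)
for current_word, next_word in zip(corpus, corpus[1:]):
    transitions[current_word][next_word] += 1

def predict_next(word):
    """Sample a successor in proportion to observed bigram counts."""
    counts = transitions[word]
    if not counts:          # dead end: this word was never seen with a successor
        return None
    words, weights = zip(*counts.items())
    return random.choices(words, weights=weights)[0]

# Generate a short continuation. Note how quickly it loses global coherence,
# which is exactly the limitation of Markov chains relative to modern LLMs.
word, output = "generative", ["generative"]
for _ in range(10):
    word = predict_next(word)
    if word is None:
        break
    output.append(word)
print(" ".join(output))
```

A transformer-based LLM, by contrast, conditions on the entire preceding context rather than a single word, which is what makes its output plausible over long spans.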
Key developments in generative AI along the way include generative adversarial networks (GANs), diffusion models, and the transformer architecture that underpins today's LLMs.
Generative AI has a wide range of applications. It can generate synthetic data for training other systems, design novel protein structures, and more. However, it may not be ideal for all data types, particularly structured data like spreadsheets.
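As a quick, hedged illustration of the synthetic-data use case, here is a sketch that asks an LLM for labeled training examples through the OpenAI Python client; the model name, prompt, and expected output format are my assumptions, not from the article:

```python
# Sketch: generating synthetic labeled data with an LLM, assuming the
# OpenAI Python client (openai>=1.0) and an OPENAI_API_KEY in the environment.
import json
from openai import OpenAI

client = OpenAI()

prompt = (
    "Generate 5 short customer-support emails as JSON: a list of objects "
    'with fields "text" and "label", where label is "billing" or "technical".'
)

response = client.chat.completions.create(
    model="gpt-4",  # illustrative model choice
    messages=[{"role": "user", "content": prompt}],
)

# A real pipeline would validate the model's output before trusting it;
# here we assume it returned well-formed JSON.
examples = json.loads(response.choices[0].message.content)
for ex in examples:
    print(ex["label"], "->", ex["text"][:60])
```

The synthetic examples can then be used to train a smaller, cheaper classifier.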
Despite its potential, generative AI raises concerns such as worker displacement, inherent biases, and the risk of amplifying hate speech or false statements. On the positive side, it could empower artists and change the economics in various disciplines. Future directions for generative AI include its use in fabrication and developing more intelligent AI agents, drawing on similarities between AI models and human cognitive processes.
Fast Forward to Today: The Unprecedented Rise of Mini-Giant LLMs
The current moment in the tech industry is likened to a new gold rush, centered on Large Language Models (LLMs).
Open-source code plays the role of the shovels in this gold rush: small, agile startups are using it to develop LLMs rapidly, challenging the dominance of industry giants.
A prime example is Mistral, a 22-person company founded just six months ago. Mistral has quickly raised USD 415 million at a USD 2 billion valuation. Remarkably, it launched its first LLM within weeks of formation and claims to outperform Llama 2 on most benchmarks.
Similarly, X.AI, an Elon Musk startup founded in July, has made significant strides with a small team and modest funding (under USD 40 million, with an additional USD 1 billion being raised). In less than six months, X.AI launched Grok, which surpasses GPT-3.5 on many benchmarks.
Other notable instances include Anthropic and Baidu: Anthropic, with fewer than 50 team members, developed Claude 2, and Baidu's Ernie was created by a team of fewer than 100.
These cases show that building next-generation LLMs no longer requires vast resources, an Ivy League PhD, or access to a major tech company's infrastructure. Small, efficient teams are developing competitive LLMs in a matter of weeks, rivaling models that took their predecessors years and substantial investment to build.
This scenario sends a new message to large corporations worldwide: it may be more advantageous to develop an in-house LLM than to purchase an existing one. Building a proprietary LLM can lead to significant savings, greater control, and prestige. Additionally, it could unlock new AI capabilities.
This stands in contrast to the situation just a few months ago:
LLMs are not only among the most significant technological innovations but also among the easiest to replicate, promoting widespread democratization. While initially, it seemed that only a select few had the expertise to build foundational models, this skill set is rapidly expanding as more teams successfully develop their products.
Despite challenges related to computing power, expertise, and data quality, the evidence is clear: teams of fewer than 25 people are building billion-parameter models that surpass the likes of GPT-3.5 and Llama 2 in a matter of weeks. Achieving this feat often leads to valuations exceeding a billion dollars.
Scaling Down to Rise Up: The New Frontier in LLM Efficiency
- Growth of Large Language Models (LLMs): Building larger LLMs, like OpenAI's GPT-4 with potentially a trillion parameters, leads to performance improvements but limits accessibility due to high resource requirements.
- Accessibility Challenges: The enormous size and computing requirements of state-of-the-art LLMs restrict their use to well-resourced AI labs, making them inaccessible to most researchers and companies.
- Efforts to Downsize Models: Researchers are focusing on creating smaller LLMs without losing capabilities. Examples include DeepMind's Chinchilla model and Meta's LLaMA, which perform comparably to larger models with fewer parameters.
- Democratization of LLMs: Meta and Stanford researchers are working to make LLMs more accessible. Stanford's Alpaca model, built on LLaMA, has been particularly influential in this regard.
- Risks and Limitations: Despite the progress in downsizing, concerns remain about the performance and safety of these smaller models. Alpaca is highlighted as a research model not yet suitable for widespread use.
- New Techniques for Efficiency: Approaches like the "mixture of experts" and exploiting sparsity in models are being explored to improve AI efficiency while overcoming hardware limitations (see the sketch after this list).
- Dual Development Paths: The AI field is witnessing a dual trend: the continuous scaling up of models to achieve top performance and efforts to enhance efficiency and accessibility of smaller models.
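To illustrate the "mixture of experts" idea mentioned above, here is a minimal NumPy sketch of top-k routing; the dimensions and gating scheme are simplified assumptions, not any particular model's design:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy mixture-of-experts layer: a gating network routes each token to its
# top-k experts, so only a fraction of the parameters is active per token.
d_model, n_experts, top_k = 16, 4, 2

gate_w = rng.normal(size=(d_model, n_experts))                 # gating network
experts = [rng.normal(size=(d_model, d_model)) for _ in range(n_experts)]

def moe_forward(x):
    """Route one token vector through its top-k experts."""
    logits = x @ gate_w                      # one score per expert
    top = np.argsort(logits)[-top_k:]        # indices of the chosen experts
    weights = np.exp(logits[top])
    weights /= weights.sum()                 # softmax over the chosen experts only
    # Only the selected experts actually run, so per-token compute scales
    # with top_k rather than n_experts.
    return sum(w * (x @ experts[i]) for w, i in zip(weights, top))

token = rng.normal(size=d_model)
print(moe_forward(token).shape)              # -> (16,)
```

Because only top_k of the n_experts weight matrices run per token, total parameter count can grow without a proportional increase in per-token compute, which is the efficiency argument behind mixture of experts.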
Domain-Specific LLMs
One example is ChipNeMo, NVIDIA's exploration of LLMs for industrial chip design. Rather than deploying off-the-shelf commercial or open-source LLMs directly, the team adopts domain-adaptation techniques: custom tokenizers, domain-adaptive continued pretraining, supervised fine-tuning (SFT) with domain-specific instructions, and domain-adapted retrieval models.
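To make "SFT with domain-specific instructions" concrete, here is a minimal sketch using Hugging Face transformers; the base model, data file, and record schema are illustrative assumptions, not ChipNeMo's actual setup:

```python
# Minimal SFT sketch with Hugging Face transformers and datasets.
# Model, file name, and record schema are illustrative assumptions.
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          Trainer, TrainingArguments)
from datasets import load_dataset

model_name = "gpt2"  # small stand-in; ChipNeMo adapts much larger base models
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token        # GPT-2 has no pad token by default
model = AutoModelForCausalLM.from_pretrained(model_name)

# Assumed schema: each JSON record pairs a chip-design "instruction"
# with an expert "response".
data = load_dataset("json", data_files="domain_instructions.json")["train"]

def to_features(example):
    text = f"Instruction: {example['instruction']}\nResponse: {example['response']}"
    tokens = tokenizer(text, truncation=True, max_length=512, padding="max_length")
    tokens["labels"] = tokens["input_ids"].copy()  # causal LM target is the text itself
    return tokens

train_data = data.map(to_features, remove_columns=data.column_names)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="chip-sft", num_train_epochs=1,
                           per_device_train_batch_size=2),
    train_dataset=train_data,
)
trainer.train()
```

The same recipe, scaled up and combined with continued pretraining and retrieval, is the general shape of the domain-adaptation approach described above.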
ChipNeMo is part of a broader wave: the sudden emergence of LLMs and generative AI has triggered a rush to build customized, domain-specific models. Bloomberg's financial LLM, BloombergGPT, is another good example, and Google Cloud has outlined Med-PaLM 2, a medically tuned LLM aimed at healthcare, among a bevy of AI tools launched at Google Cloud Next for CIOs to digest.
Andrew Ng on the Future of AI: Text to Vision and Beyond
Ng paints a picture of a future where AI is not just about scaling ever-bigger models, but about making them more accessible and efficient, and about adapting them to different types of data and tasks, from text to vision and beyond.
The Perils of AI
A recent example:
"???? ??????????????????, ???? ?????? ?? ???????????? ???? ?????????????????? ?????????? ???????? ?????????????? ???????? ?????????? ????????????, ?????????????? ???????????????? ?????? ?????????????????? ???? ???????????????????????? ???????? ?????? ?????????????????????? ???? ???????????????? ????????."
“The C-suite shuffle comes just over a week after a Futurism report revealed that Sports Illustrated and Arena Group-owned finance site TheStreet had been publishing affiliate link-laden commerce articles under the bylines of fake writers.”
Signoff
Once AI was just numbers and chess,
Rule-bound, precise, nothing less.
But now, behold, a creative spree,
Generative AI, wild and free!
Old School cool, you set the stage,
Now Gen AI's the latest rage.
From data strict to dreams that fly,
In AI's tale, we both beautify!
Thanks for tuning into this week's DEEPakAI: AI Demystified newsletter. Stay connected and keep learning!
P.S. This newsletter includes smart-prompt-based, LLM-generated content.