#31: It's Raining LLMs... and soon LVMs too!

Generative AI: Back to Basics

Brace yourselves, the AI overlords aren't just coming, they're already here writing our headlines! Just stumbled upon this fascinating read about generative AI - the tech behind ChatGPT. It's like discovering your friendly neighborhood robot has secretly been a superhero all this time. From predicting the next word in your email (and occasionally embarrassing us) to potentially designing the next big thing in protein structures (no, not a new protein bar flavor), AI is on a roll! But hey, let's not forget the potential oops moments - biases and ethical dilemmas. Are we ready for this AI-powered rollercoaster?

This recent article in the MIT Technology Review revisits the concept of "generative AI," a term that has gained prominence in recent years, particularly with the advent of technologies like OpenAI's ChatGPT. Generative AI differs from traditional machine learning models, which are designed to make predictions based on data. Instead, generative AI focuses on creating new data.

Generative AI systems learn to produce new objects that resemble the data they were trained on. This lineage includes simple models like Markov chains, long used for next-word prediction but poor at generating text that stays plausible for more than a few words. Modern generative AI models, however, are far more complex and capable, thanks to advances in computational power and algorithms.
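
To make that contrast concrete, here is a minimal, illustrative sketch (not from the article) of a Markov-chain next-word predictor; the toy corpus and variable names are assumptions. Because it conditions only on the single previous word, its output is locally plausible but quickly loses coherence, which is exactly the limitation modern generative models overcome.

```python
# Toy first-order Markov chain: predict the next word from bigram counts.
import random
from collections import defaultdict, Counter

corpus = "the cat sat on the mat and the cat slept on the sofa".split()

# Count how often each word follows each other word (bigram statistics).
transitions = defaultdict(Counter)
for current_word, next_word in zip(corpus, corpus[1:]):
    transitions[current_word][next_word] += 1

def predict_next(word):
    """Sample the next word in proportion to how often it followed `word`."""
    counts = transitions.get(word)
    if not counts:
        return None
    words, weights = zip(*counts.items())
    return random.choices(words, weights=weights, k=1)[0]

print(predict_next("the"))  # e.g. 'cat', 'mat', or 'sofa'
```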

Key developments in generative AI include:

  • Generative Adversarial Networks (GANs): Introduced in 2014, they pit two models, a generator and a discriminator, against each other to produce realistic outputs (a minimal sketch follows this list).
  • Diffusion models: Introduced a year after GANs, they create realistic images by progressively refining random noise.
  • Transformer architecture: Introduced by Google in 2017, it underpins large language models like ChatGPT.
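
As a rough illustration of the GAN idea above, here is a minimal PyTorch sketch assuming tiny networks and toy 1-D Gaussian "real" data; it is not any production GAN, just the two-model adversarial training loop.

```python
import torch
import torch.nn as nn

noise_dim, data_dim = 8, 1
generator = nn.Sequential(nn.Linear(noise_dim, 32), nn.ReLU(), nn.Linear(32, data_dim))
discriminator = nn.Sequential(nn.Linear(data_dim, 32), nn.ReLU(), nn.Linear(32, 1), nn.Sigmoid())

g_opt = torch.optim.Adam(generator.parameters(), lr=1e-3)
d_opt = torch.optim.Adam(discriminator.parameters(), lr=1e-3)
bce = nn.BCELoss()

for step in range(1000):
    real = torch.randn(64, data_dim) * 0.5 + 3.0        # "real" samples drawn from N(3, 0.5)
    fake = generator(torch.randn(64, noise_dim))         # generated samples

    # Discriminator: label real data 1 and generated data 0.
    d_loss = bce(discriminator(real), torch.ones(64, 1)) + \
             bce(discriminator(fake.detach()), torch.zeros(64, 1))
    d_opt.zero_grad()
    d_loss.backward()
    d_opt.step()

    # Generator: try to make the discriminator call its samples real.
    g_loss = bce(discriminator(fake), torch.ones(64, 1))
    g_opt.zero_grad()
    g_loss.backward()
    g_opt.step()
```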

Generative AI has a wide range of applications. It can generate synthetic data for training other systems, design novel protein structures, and more. However, it may not be ideal for all data types, particularly structured data like spreadsheets.

Despite its potential, generative AI raises concerns such as worker displacement, inherent biases, and the risk of amplifying hate speech or false statements. On the positive side, it could empower artists and change the economics in various disciplines. Future directions for generative AI include its use in fabrication and developing more intelligent AI agents, drawing on similarities between AI models and human cognitive processes.

Fast Forward to Today: The Unprecedented Rise of Mini-Giant LLMs Shaping the Future of AI

The current trend in the tech industry is likened to a new gold rush, centered around Large Language Models (LLMs).

The key tool for success in this field is open-source code, playing the role of the shovels in a traditional gold rush. The trend is characterized by small, agile startups that rapidly develop LLMs and challenge the dominance of industry giants.

A prime example is Mistral, a 22-person company founded just six months ago, which has already raised USD 415 million at a USD 2 billion valuation. Remarkably, it launched its first LLM within weeks of its formation and claims that it outperforms Llama 2 on most benchmarks.

Similarly, X.AI, an Elon Musk startup founded in July, has made significant strides with a small team and modest funding (under USD 40 million, with an additional USD 1 billion being raised). In less than six months, X.AI launched Grok, which surpasses GPT-3.5 on many benchmarks.

Other notable examples include Anthropic and Baidu. Anthropic, with fewer than 50 team members, developed Claude 2, and Baidu's Ernie was created by a team of fewer than 100.

These cases illustrate that creating next-generation LLMs no longer requires vast resources, such as an Ivy League PhD or access to a major tech company's infrastructure. Instead, small, efficient teams are developing competitive LLMs in a matter of weeks, rivaling models that took their predecessors years and substantial investment to build.

This scenario sends a new message to large corporations worldwide: it may be more advantageous to develop an in-house LLM than to purchase an existing one. Building a proprietary LLM can lead to significant savings, greater control, and prestige. Additionally, it could unlock new AI capabilities.

This stands in contrast to the situation just a few months ago:

LLMs are not only among the most significant technological innovations but also among the easiest to replicate, which promotes widespread democratization. While it initially seemed that only a select few had the expertise to build foundation models, that skill set is rapidly spreading as more teams successfully develop their own models.

Despite challenges related to computing power, expertise, and data quality, the evidence is clear: teams of fewer than 25 people are building billion-parameter models that surpass the likes of GPT-3.5 and Llama 2 in a matter of weeks. Achieving this feat often leads to valuations exceeding a billion dollars.

Scaling Down to Rise Up: The New Frontier in LLM Efficiency

- Growth of Large Language Models (LLMs): Building larger LLMs, like OpenAI's GPT-4 with potentially a trillion parameters, leads to performance improvements but limits accessibility due to high resource requirements.

- Accessibility Challenges: The enormous size and computing requirements of state-of-the-art LLMs restrict their use to well-resourced AI labs, making them inaccessible to most researchers and companies.

- Efforts to Downsize Models: Researchers are focusing on creating smaller LLMs without losing capabilities. Examples include DeepMind's Chinchilla model and Meta's LLaMa, which perform comparably to larger models with fewer parameters.

- Democratization of LLMs: Meta and Stanford researchers are making strides in making LLMs more accessible. Stanford's Alpaca model, built on LLaMa, has been particularly influential in this regard.

- Risks and Limitations: Despite the progress in downsizing, concerns remain about the performance and safety of these smaller models. Alpaca is highlighted as a research model not yet suitable for widespread use.

- New Techniques for Efficiency: Approaches like the "mixture of experts" and exploiting sparsity in models are being explored to improve AI efficiency while working around hardware limitations (a minimal routing sketch follows this list).

- Dual Development Paths: The AI field is witnessing a dual trend: the continuous scaling up of models to achieve top performance and efforts to enhance efficiency and accessibility of smaller models.
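
To make the "mixture of experts" bullet above concrete, here is a minimal, assumed top-1 routing layer in PyTorch: a router sends each token to a single small expert, so most parameters stay idle (sparse) for any given input. The sizes, the number of experts, and the top-1 choice are illustrative assumptions, not any particular production design.

```python
import torch
import torch.nn as nn

class TinyMoE(nn.Module):
    def __init__(self, dim=64, num_experts=4):
        super().__init__()
        self.router = nn.Linear(dim, num_experts)   # scores each expert per token
        self.experts = nn.ModuleList([
            nn.Sequential(nn.Linear(dim, dim * 2), nn.ReLU(), nn.Linear(dim * 2, dim))
            for _ in range(num_experts)
        ])

    def forward(self, x):                            # x: (tokens, dim)
        scores = self.router(x).softmax(dim=-1)
        best = scores.argmax(dim=-1)                 # top-1 expert per token
        out = torch.zeros_like(x)
        for i, expert in enumerate(self.experts):
            mask = best == i
            if mask.any():
                # Only the chosen expert runs for these tokens,
                # weighted by its router score.
                out[mask] = expert(x[mask]) * scores[mask, i].unsqueeze(-1)
        return out

tokens = torch.randn(10, 64)
print(TinyMoE()(tokens).shape)   # torch.Size([10, 64])
```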

Domain-Specific LLMs

ChipNeMo explores applications of large language models (LLMs) to industrial chip design. Instead of directly deploying off-the-shelf commercial or open-source LLMs, its authors adopt the following domain adaptation techniques: custom tokenizers, domain-adaptive continued pretraining, supervised fine-tuning (SFT) with domain-specific instructions, and domain-adapted retrieval models.
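
As a rough illustration of one of those techniques, the sketch below shows a single supervised fine-tuning (SFT) step on a domain-specific instruction pair using the Hugging Face transformers library. This is not the ChipNeMo code; the base model (gpt2 as a stand-in), the toy chip-design example, and the hyperparameters are all assumptions.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

base_model = "gpt2"  # stand-in for whatever domain-adapted base model is used
tokenizer = AutoTokenizer.from_pretrained(base_model)
model = AutoModelForCausalLM.from_pretrained(base_model)
optimizer = torch.optim.AdamW(model.parameters(), lr=5e-5)

# One hypothetical domain-specific instruction pair; a real run would loop over many.
example = ("Instruction: Explain what a clock tree does in a chip design.\n"
           "Response: A clock tree distributes the clock signal to sequential "
           "elements with balanced delay to minimize skew.")

batch = tokenizer(example, return_tensors="pt")
# For causal-LM fine-tuning the labels are the input ids themselves;
# the model shifts them internally to predict each next token.
outputs = model(**batch, labels=batch["input_ids"])
outputs.loss.backward()
optimizer.step()
print(float(outputs.loss))
```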

The sudden emergence of LLMs and generative AI has triggered a wave of efforts to develop customized, domain-specific LLMs. Bloomberg's financial LLM (BloombergGPT) is a good example. Google Cloud outlined Med-PaLM 2, a medically tuned LLM aimed at healthcare, and launched a bevy of AI tools at Google Cloud Next for CIOs to digest.

Andrew Ng on the future of AI: Text to Vision and Beyond

Key points:

  • Visual prompting: Ng showcased a technique for using AI to recognize objects in images by "drawing" on them with a mouse pointer. He sees this as the future of computer vision, similar to the text revolution driven by large language models (LLMs).
  • Large vision models (LVMs): Ng expects LVMs to follow a similar trajectory as LLMs, achieving breakthroughs through large-scale training and transformer networks. However, challenges remain in understanding image structure and dealing with video data.

  • Data for training LVMs: Whereas vision AI has traditionally relied on labeled data, LVMs may be trained on unlabeled data using techniques such as hiding parts of images and having the model fill them in (a minimal masking sketch follows this list). Synthetic data could be another option, but it is not yet cost-effective for massive models.
  • Transformers for all AI?: Ng doesn't believe transformers will dominate all forms of AI. While they excel at processing unstructured data like text, structured data like spreadsheets will require different approaches.
  • Scaling LLMs: Bigger isn't always better. Ng suggests a sweet spot for model size depending on the task, with smaller models running on devices like laptops. He foresees a future with more applications running on the edge instead of relying on cloud-based giants.
  • Transformer architecture: The future of transformer architecture is uncertain. While it's been around for six years, Ng hopes for further evolution but acknowledges the potential for it to reach maturity. He draws comparisons to biological brains and the longevity of the x86 architecture to illustrate that even "good enough" solutions can last.
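
The "hide parts of images and fill them in" idea from the LVM data bullet above is essentially masked image modeling. Below is a minimal, assumed PyTorch sketch: patches of an unlabeled image are zeroed out and a small network is trained to reconstruct them, with the loss computed only on the hidden patches. The patch size, mask ratio, and tiny network are illustrative choices, not how any particular LVM is built.

```python
import torch
import torch.nn as nn

patch, mask_ratio = 8, 0.5
image = torch.rand(1, 3, 32, 32)                      # toy unlabeled image

# Split the image into flattened 8x8 patches: (1, num_patches, patch_dim).
patches = image.unfold(2, patch, patch).unfold(3, patch, patch)
patches = patches.contiguous().view(1, 3, -1, patch, patch)
patches = patches.permute(0, 2, 1, 3, 4).reshape(1, -1, 3 * patch * patch)

# Randomly hide half of the patches by zeroing them out.
num_patches = patches.shape[1]
masked = torch.rand(1, num_patches) < mask_ratio
corrupted = patches.clone()
corrupted[masked] = 0.0

# A tiny reconstruction network; real LVMs use transformer encoders/decoders.
net = nn.Sequential(nn.Linear(3 * patch * patch, 256), nn.ReLU(),
                    nn.Linear(256, 3 * patch * patch))
reconstruction = net(corrupted)

# Loss only on the hidden patches: the self-supervised training signal.
loss = ((reconstruction - patches)[masked] ** 2).mean()
loss.backward()
print(float(loss))
```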

Overall, Ng paints a picture of a future where AI is not just about scaling bigger models, but also about finding ways to make them more accessible and efficient, and to adapt to different types of data and tasks.

The Perils of AI

Two recent examples:

  • Like many others, I was very impressed by a promotional video touting the multi-modal capabilities of the recently announced Google Gemini. I even included a write-up about it in the last edition of my newsletter. Apparently, though, "Google's best Gemini demo was faked" (says TechCrunch).

"???? ??????????????????, ???? ?????? ?? ???????????? ???? ?????????????????? ?????????? ???????? ?????????????? ???????? ?????????? ????????????, ?????????????? ???????????????? ?????? ?????????????????? ???? ???????????????????????? ???????? ?????? ?????????????????????? ???? ???????????????? ????????."

  • AI gets Sports Illustrated into an unsportsmanlike conduct maelstrom!

“The C-suite shuffle comes just over a week after a Futurism report revealed that Sports Illustrated and Arena Group-owned finance site TheStreet had been publishing affiliate link-laden commerce articles under the bylines of fake writers.”

Signoff

Once AI was just numbers and chess,

Rule-bound, precise, nothing less.

But now, behold, a creative spree,

Generative AI, wild and free!

Old School cool, you set the stage,

Now Gen AI's the latest rage.

From data strict to dreams that fly,

In AI's tale, we both beautify!

Thanks for tuning into this week's DEEPakAI: AI Demystified newsletter. Stay connected and keep learning!

p.s. - The newsletter includes smart, prompt-based, LLM-generated content.

Javeed Muhammad

CEO & Founder | Gen AI | Health Tech | Ed Tech | Providing Elite Engineers

11 months ago

Intriguing insights!

Michael Krigsman

Host and Industry Analyst @ CXOTalk

11 months ago

Excellent and informative observations!

Sanjeev B.

Visual Storyteller and Strategic Sales Professional: Driving Business Success through Compelling Narratives

11 months ago

The rapid advances in generative AI have sparked both excitement and concern. While models like ChatGPT showcase the technology's creative potential, there are worries about biases, job displacement, and ethical risks. Still, this is an incredibly promising time, and wisdom and care are essential as we write the next AI chapter together. With open and earnest dialogue, a bright future awaits.

Deepak Seth, what's the most exciting trend in Generative AI that you've come across recently?
