April 7x7

When light-weighting AI models for on-device deployment and production-grade use, ensuring model accuracy post-compression is vital. However, strong compression results do not necessarily translate into comparable performance in production, because model accuracy can change when the model is compiled for hardware deployment.

When light-weighting AI models to production-grade quality, what should be considered so that post-compression model performance stays consistent in a production environment?
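One practical answer is to validate the deployed artifact against the original model on the same inputs, not just to trust the pre-compilation accuracy numbers. The sketch below is a hypothetical, framework-agnostic consistency check; the function names, tolerance, and sample logits are illustrative assumptions, not part of any specific toolchain.

```python
# Hypothetical sanity check: compare reference-model outputs against the
# compiled/deployed model's outputs on identical validation inputs.

def max_abs_diff(reference_outputs, deployed_outputs):
    """Largest element-wise deviation between two output vectors."""
    return max(abs(r - d) for r, d in zip(reference_outputs, deployed_outputs))

def outputs_consistent(reference_outputs, deployed_outputs, tol=0.05):
    """True if the deployed model stays within `tol` of the reference."""
    return max_abs_diff(reference_outputs, deployed_outputs) <= tol

# Illustrative logits: small numeric drift after compilation is acceptable,
# but a large divergence signals that accuracy may not survive deployment.
reference = [0.9, 0.1]
deployed = [0.88, 0.12]
ok = outputs_consistent(reference, deployed)
```

In practice this check would run over a held-out validation set, per layer or per output head, before the compiled model ships.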


Updates

CLIKA has been accepted into the Google for Startups AI First Accelerator!

Read more about it here!


“Recently, there has been a shift toward greater openness, particularly regarding the carbon costs of training AI models. However, disclosure of the environmental costs associated with inference—a potentially more significant concern—remains insufficient” - AI Index Report 2024, Stanford University

AI Trends

High-performing models with relatively fewer parameters are here! But can they run on your local device within a compute budget?

This month, Meta released a new version of the Llama model: Llama 3. Available in two sizes (8B and 70B), it comes with not only a new extended tokenizer but also a commercially permissive license.

What’s impressive about Llama 3 is that the 70B model significantly outperforms GPT-3.5 (score: 70) on the MMLU benchmark despite having roughly 2.5 times fewer total parameters, while also consistently outperforming other state-of-the-art models in its parameter range.

To run these models on a personal device, however, you would still need to quantize the weights and activations to a lower precision to reduce memory requirements, and quantizing without sacrificing model performance is quite a challenge.
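To make the trade-off concrete, here is a minimal sketch of symmetric int8 weight quantization in plain Python. It is illustrative only (real toolchains quantize per layer or per channel, with calibration data for activations); the function names and sample weights are assumptions for the example.

```python
# Minimal sketch of symmetric int8 quantization: one scale per tensor,
# values mapped into [-127, 127], so float32 storage shrinks 4x.

def quantize_int8(weights):
    """Map float weights to int8 using a single symmetric scale."""
    scale = max(abs(w) for w in weights) / 127.0
    return [round(w / scale) for w in weights], scale

def dequantize(q_weights, scale):
    """Recover approximate float weights from the int8 values."""
    return [q * scale for q in q_weights]

weights = [0.42, -1.27, 0.08, 0.95]
q, scale = quantize_int8(weights)
recovered = dequantize(q, scale)

# The price of 4x smaller storage is rounding error, bounded by scale / 2.
error = max(abs(w - r) for w, r in zip(weights, recovered))
```

The per-weight error stays below half the scale here, but at model scale those small errors accumulate, which is why naive post-training quantization can degrade accuracy and why careful compression methods matter.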

See how CLIKA automatically compresses models for resource-constrained environments without compromising performance.


See what else is up in this space:

  1. [Unite.AI] Top 10 Takeaways from Stanford's 2024 AI Index Report
  2. [Meta] Introducing Meta Llama 3: The most capable openly available LLM to date
  3. [Mistral AI] Cheaper, Better, Faster, Stronger
  4. [Business Standard] Qualcomm unveils Snapdragon X Plus chip for PCs with on-device AI: Details
  5. [VentureBeat] Apple releases OpenELM: small, open source AI models designed to run on-device
  6. [Intel] Intel Builds World’s Largest Neuromorphic System to Enable More Sustainable AI
  7. [ML Commons] Announcing MLCommons AI Safety v0.5 Proof of Concept


Food for Thought: Responsible AI & Benchmarks

Pursuing responsible AI at all times is imperative for creating a sustainable and ethical AI ecosystem. At the core of this pursuit should be a commitment to the integrity of benchmark results: inflating or fabricating them has real-world implications for decision-making, fairness, and trust in AI systems. This should be a collective, shared responsibility across every industry creating or using AI, in both the public and private sectors.
