A special invitation: Evaluating LLMs for Your Applications, a talk with a Google GenAI product leader

Having worked in the ML/AI field for over 20 years, including on leading AI teams at Google, Meta, Microsoft, and AWS, Mahesh Yadav has witnessed firsthand the transformative power of Large Language Models (LLMs) in product development. With the proliferation of models such as Gemini 1.5 Pro, Llama 3, GPT-4 Turbo, and over 100 others, selecting the right LLM for your GenAI application is a critical decision that can make or break your project's success.

Choosing a large language model (LLM) can be as challenging as choosing a cloud provider. Prompting techniques vary from model to model, so switching models means re-running all your tests from scratch. Investing more effort upfront in choosing the right model helps you avoid accumulating technical debt later.

Mahesh has made this choice many times while advising startups, weighing open-source models (Phi vs. Llama 3) against Gemini, GPT-4, and Claude 3. He will share his expertise in a comprehensive talk on choosing and evaluating LLMs, hosted on Maven. The session is designed to equip product managers and AI builders with the knowledge and tools needed to select an LLM or SLM (small language model) based on their needs.

In this talk, Mahesh will cover three essential areas:

  1. A framework for model selection, taking into account crucial factors such as budget constraints, latency requirements, privacy, and team capabilities (a brief illustrative sketch follows this list).
  2. Strategies for establishing clear, actionable evaluation criteria, leveraging industry benchmarks to reduce model evaluation costs.
  3. A practical walkthrough using a contract processing application, demonstrating how to align model selection with specific business requirements and performance benchmarks.
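To make the first item concrete, here is a minimal, hypothetical sketch of what a weighted scoring matrix for model selection might look like. The criteria weights, candidate names, and scores below are illustrative assumptions for this post, not material from the talk itself.

```python
# Hypothetical illustration: a weighted scoring matrix for comparing candidate
# models against criteria like those listed above (budget, latency, privacy,
# team capabilities). All weights, candidates, and scores are made up.

CRITERIA_WEIGHTS = {
    "cost": 0.30,      # budget constraints
    "latency": 0.25,   # latency requirements
    "privacy": 0.25,   # e.g., self-hosting or data-residency needs
    "team_fit": 0.20,  # team familiarity and existing tooling
}

# Scores run from 1 (poor) to 5 (excellent) per criterion, assigned by the team.
candidate_scores = {
    "hosted-large-model": {"cost": 2, "latency": 3, "privacy": 2, "team_fit": 5},
    "open-source-small-model": {"cost": 5, "latency": 4, "privacy": 5, "team_fit": 3},
}

def weighted_score(scores: dict) -> float:
    """Combine per-criterion scores into a single weighted total."""
    return sum(CRITERIA_WEIGHTS[criterion] * score for criterion, score in scores.items())

if __name__ == "__main__":
    for model, scores in candidate_scores.items():
        print(f"{model}: {weighted_score(scores):.2f}")
```

The point of a sketch like this is not the specific numbers but the discipline of writing criteria and weights down before comparing models, so the trade-offs the talk discusses become explicit and repeatable.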

By the end of the session, attendees will walk away with a structured approach to selecting and evaluating GenAI models, enabling them to make informed decisions. This talk will be especially useful for product leaders who want to develop effective LLM evaluation strategies, and for AI builders looking for practical tools to apply these principles directly to their projects.

RSVP here

Mahesh’s Bio

Mahesh Yadav is a Product Leader on the Google GenAI team. Mahesh is one of the world's top AI executives and an award-winning AI product educator. His work on AI has been featured at the Nvidia GTC conference, at Microsoft Build, and on Meta blogs.

Mahesh has 20 years of experience building products on the Meta, Microsoft, and AWS AI teams. He has worked across all layers of the AI stack, from AI chips to LLMs, and has a deep understanding of how GenAI companies ship value to customers.

Currently, he leads an AI agent product for the Google Cloud support team, using the latest Gemini models from DeepMind and a multi-agent framework with a knowledge graph to automate support agent functions for Google Cloud customers.

RSVP here
