#5-Bohemian RhapsAI: ReplayzIQ's AI Sales Coaching Revolution, GTM AI Tools Library Launch, Claude 3.0 vs ChatGPT vs Gemini, and Navigating AI
Jonathan M K.
Head of GTM Growth Momentum | Founder GTM AI Academy & Cofounder AI Business Network | Business impact > Learning Tools | Proud Dad of Twins
Every time I think nothing crazy will happen in the next seven days to write about, something inevitably does, and I am excited to tell and show you all about it!
As always, this newsletter and podcast are sponsored by GTM AI Academy, which is all about education, training, and AI enablement for GTM teams, individuals, and leaders.
As a thank-you from me for being a part of this journey, if you want to join the Academy, you can get 30% off any bundle or course using the code podcast30 over at https://www.gtmaiacademy.com/courses. Right now, we have the Generative AI Foundations course, which is a great place to start for most of you who are new to AI, and then Enablement AI Powered, which is specifically for L&D, enablement, or even RevOps people who want to dive into AI workflows for sales enablement, revenue enablement, and more.
Today we will be going over:
Let's do this...
You can hear the podcast on YouTube, Apple, Spotify, or here on LinkedIn, as well as a plethora of other locations.
I wanted to share some of the key insights and takeaways from my fascinating discussion with Dave from ReplayzIQ. We really dove deep into how they're utilizing AI to transform sales coaching and enablement. Here are the highlights:
Key Takeaways:
The discussion really underscored the huge potential of AI to personalize and scale sales coaching in ways that drive meaningful improvements in performance. As Dave aptly put it, "I think sales reps deserve more support out there. And this is, one easy and quick way to do that."
All in all, I was struck by how thoughtful Dave and the ReplayzIQ team are being in their approach - focusing on building robust AI models trained on quality data, validating results, and truly aiming to empower sales reps and leaders. It's not about replacing humans but equipping them with powerful tools to succeed.
I'd love to hear your thoughts! What stood out to you from the conversation?
IT'S TIME!
My friends it is time... The GTM AI Tools Demo Library is now live.
You will need to register for it, but it just requires an email and you are in!
Over the last several months, I have been demoing tools, partly out of necessity for my team and partly out of curiosity, because I am loving all the developments.
I always wanted a central place where I could see demos because, as much as I love marketing videos (I lead a content marketing team, after all), there is a difference between professionally produced marketing videos and a real demo.
So I have split them out into the best categories I could think of in the course.
I hope you will find value in these demos and see what would be helpful for you.
With permission from each team, I have done the following:
Some of the tools are:
SellMeThisPen, Momentum.io, GTM Buddy, Scalenut, Wordly, Replicate Labs, bizPROFI, Second Nature, RNMKRS, Copy.ai, Regie.ai, Truebase, Ubique, Kipsy, Grain.ai, CloseStrong.ai, Quantified, Spiky.AI, Swyft AI, Glyphic, Uniphore, WINN.AI, Tough Customer AI, and more
If you want to have your tool featured in the library, you must provide a demo-style video that feels real, like a conversation with me (a Loom I can download works great). Sending me a marketing video that already exists on your website will not cut it; I want the real thing, pretty please ;) Just message me.
Claude 3 vs ChatGPT vs Gemini
Anthropic, the company behind Claude, just released its new Claude 3 model, and let me tell you, it is impressive.
You may be asking: why would I use each of these tools over the others? Good question, but before we jump into that, let's talk about the updates in Claude 3 and why they matter to you as a GTM pro. There are a lot of benchmark names in the list above, so let me break those down first:
First: Shots and CoT (Chain of Thought)
The concept of 0-shot, 3-shot, 5-shot, 8-shot, and 10-shot learning in AI can be explained with a cooking analogy. Just as a chef's ability to create a dish improves with more guidance and examples, an AI model's performance improves as it is given more examples, or "shots," to learn from. CoT (chain of thought), meanwhile, simply means prompting the model to reason step by step before giving its final answer.
Starting from 0-shot learning, where the model relies solely on its pre-existing knowledge, up to 10-shot learning, where it has ten worked examples in front of it, the AI's capacity to understand and generalize to new tasks grows, much like a chef progressing from novice to culinary master.
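To make the "shots" idea concrete, here is a minimal sketch of how a zero-shot prompt differs from a few-shot prompt. The sentiment task, reviews, and labels are made up purely for illustration:

```python
# Sketch: building zero-shot vs. few-shot prompts for a toy sentiment task.

def build_prompt(task: str, examples: list[tuple[str, str]], query: str) -> str:
    """Prepend labeled examples ("shots") to the query."""
    lines = [task]
    for text, label in examples:
        lines.append(f"Review: {text}\nSentiment: {label}")
    lines.append(f"Review: {query}\nSentiment:")
    return "\n\n".join(lines)

task = "Classify the sentiment of each review as positive or negative."

# 0-shot: the model gets only the instruction and the query.
zero_shot = build_prompt(task, [], "The demo blew me away.")

# 3-shot: three worked examples guide the model, like recipes guide a chef.
shots = [
    ("Setup took five minutes and it just worked.", "positive"),
    ("Support never answered my ticket.", "negative"),
    ("The reporting dashboard saves me hours.", "positive"),
]
three_shot = build_prompt(task, shots, "The demo blew me away.")

print(zero_shot)
print("---")
print(three_shot)
```

The benchmark charts you see (e.g., "MMLU, 5-shot") are just telling you how many such worked examples the model was shown before being scored.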
Now What are all these tests?
Alright, let's dive into the nitty-gritty of these AI benchmarks and tests that everyone's been talking about.
MMLU (Massive Multitask Language Understanding)
First up, we've got the MMLU test. This bad boy is designed to put an AI model through the wringer, testing its understanding across a wide range of subjects you'd typically study in undergrad. We're talking everything from the artsy stuff like literature and philosophy to the hardcore sciences like physics and math. The goal? To see if the AI can match the breadth and depth of knowledge of your average college student.
GPQA (Graduate-Level Google-Proof Q&A) and Diamond
Next on the list, we've got GPQA and its harder Diamond subset. These are like the MMLU's older, wiser siblings. They're all about testing an AI's ability to reason at the level you'd expect from someone in a graduate program. We're not just talking about regurgitating facts here; these tests involve complex questions that require the AI to apply, analyze, and synthesize information like a boss. It's like putting the AI through a virtual Master's degree!
GSM8K (Grade School Math 8K)
But let's not forget about the basics. The GSM8K test is here to make sure our AI friends haven't forgotten their arithmetic. This test throws a bunch of math problems at the AI, ranging from simple addition to the kind of stuff you'd see in middle school pre-algebra. It's like a little refresher course to keep the AI's math skills sharp.
MATH Dataset
Speaking of math, the MATH dataset is the next level up. This one challenges the AI with high school to early college-level mathematics problems. We're talking algebra, calculus, statistics - the whole shebang. The goal here is to see if the AI can not just crunch numbers but really understand the underlying mathematical principles.
MGSM (Multilingual Grade School Math)
But what if we want to test an AI's math skills in different languages? That's where the MGSM test comes in. It's just like the GSM8K but with a linguistic twist. The AI has to solve math problems presented in multiple languages, showcasing its versatility across different tongues.
HUMANEVAL
Now, let's talk about the HUMANEVAL test. This one's for all the coding whizzes out there. HUMANEVAL puts the AI's programming chops to the test, challenging it to write snippets of code that perform specific tasks. It's like a job interview for AI, seeing if it can walk the walk when it comes to coding.
DROP (Discrete Reasoning Over Paragraphs) and F1 Score
Next up, we've got the DROP test and the F1 SCORE. These are all about testing an AI's ability to reason over text. The DROP test focuses on discrete reasoning, like calculating dates or interpreting numerical data embedded in sentences. The F1 SCORE is used to evaluate how well the AI does on tasks like DROP, balancing precision and recall. Think of it as a reading comprehension test on steroids.
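To make the F1 idea concrete, here is a toy sketch of scoring a model's answers against a gold answer set. This is my own illustration, not the actual DROP scoring script (which does finer-grained token-level matching):

```python
# F1 is the harmonic mean of precision and recall, so a model only scores
# well when it is both accurate in what it predicts (precision) and
# complete in what it finds (recall).

def f1_score(predicted: set[str], gold: set[str]) -> float:
    if not predicted or not gold:
        return 0.0
    true_positives = len(predicted & gold)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(gold)
    return 2 * precision * recall / (precision + recall)

# Model found 2 of the 3 gold answers plus 1 spurious one:
# precision = 2/3, recall = 2/3, so F1 = 2/3.
print(round(f1_score({"1876", "1880", "Paris"}, {"1876", "1880", "London"}), 3))  # → 0.667
```

A model that guesses everything gets high recall but low precision; a model that answers only when certain gets the reverse. F1 punishes both extremes.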
BIG-BENCH-HARD
If you really want to push an AI to its limits, you bring out the BIG-BENCH-HARD. This is a collection of tasks designed to test the boundaries of what AI can do, with challenges ranging from easy to "good luck with that." It's the ultimate benchmark for seeing where AI excels and where it needs to hit the books.
HELLASWAG
Last but not least, we've got the HELLASWAG test. This one's all about common sense and general knowledge. The AI has to complete sentences or predict what comes next in a scenario, showing off its understanding of everyday concepts and logical sequences. It's like testing if the AI could survive a dinner party conversation without embarrassing itself.
Now when comparing Gemini vs Claude vs ChatGPT, the following was found:
1. The Apple Test
In this classic reasoning test, Claude 3 Opus managed to hold its own and answer correctly, but only after it was given a little pep talk via a system prompt. GPT-4 and Gemini 1.5 Pro? They aced it without breaking a sweat.
2. Calculate the Time
This one's a tricky test designed to separate the smart from the... well, not so smart. And unfortunately, Claude 3 Opus fell into the latter category, along with Gemini 1.5 Pro. GPT-4 had a bit of an identity crisis, sometimes getting it right, sometimes not.
3. Evaluate the Weight
When asked to compare the weight of a kilo of feathers and a pound of steel, Claude 3 Opus got it wrong, while GPT-4 and Gemini 1.5 Pro nailed it.
4. Solve a Maths Problem
The problem: if x and y are the tens digit and the units digit, respectively, of the product 725,278 * 67,066, what is the value of x + y? Can you explain the easiest solution without calculating the whole number?
Despite boasting an impressive 60.1% score on the MATH benchmark, Claude 3 Opus struggled with this math problem, giving wrong answers no matter how it was prompted. GPT-4 and Gemini 1.5 Pro, on the other hand, solved it with ease.
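For the curious, here is the shortcut the question is fishing for, sketched in Python (the answer works out to x + y = 12):

```python
# The tens and units digits of a product depend only on the last two
# digits of each factor (arithmetic mod 100), so you never need to
# compute the full 11-digit product.

a, b = 725_278, 67_066

last_two = (a % 100) * (b % 100) % 100  # 78 * 66 = 5148 -> last two digits 48
x = last_two // 10                      # tens digit: 4
y = last_two % 10                       # units digit: 8
print(x + y)                            # → 12

# Sanity check against the full multiplication:
assert (a * b) % 100 == last_two
```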
5. Follow User Instructions
Now, this is where Claude 3 Opus really shines! When it comes to following user instructions to the letter, the Opus model outperforms all the others. It generated 10 perfect sentences ending with "apple", while GPT-4 managed 9, and Gemini 1.5 Pro barely squeezed out 3. If you need an AI that can follow orders like a champ, Claude 3 Opus is your guy!
6. Needle In a Haystack (NIAH) Test
Despite Anthropic's claims of Claude 3 Opus' prowess in handling long-context data, it couldn't find the needle in our 8K token haystack. GPT-4 and Gemini 1.5 Pro, however, located it with ease. More testing is needed, but so far, it's not looking great for the Opus model.
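For a sense of how such a test is constructed, here is a minimal sketch: bury one out-of-place fact at a chosen depth inside filler text, then ask the model to retrieve it. The `ask_model` call and the passphrase are hypothetical placeholders, standing in for whichever API and needle you use:

```python
# Minimal needle-in-a-haystack harness sketch (illustrative only).

FILLER = "The quick brown fox jumps over the lazy dog."
NEEDLE = "The secret passphrase for the demo library is 'mango-42'."

def build_haystack(n_sentences: int, depth: float) -> str:
    """Repeat filler n_sentences times, inserting the needle at the
    given relative depth (0.0 = start, 1.0 = end)."""
    sentences = [FILLER] * n_sentences
    sentences.insert(int(n_sentences * depth), NEEDLE)
    return " ".join(sentences)

# ~6,000 words of filler is roughly an 8K-token context.
haystack = build_haystack(n_sentences=666, depth=0.5)
question = "What is the secret passphrase mentioned in the text?"

# answer = ask_model(context=haystack, question=question)  # hypothetical API
# The model passes if 'mango-42' appears in its answer.
print(NEEDLE in haystack, len(haystack.split()))  # → True 6003
```

Real NIAH evaluations sweep the needle across many depths and context lengths and chart the retrieval rate, which is where those red-and-green heatmaps come from.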
7. Guess the Movie (Vision Test)
In the image analysis test, Claude 3 Opus and GPT-4 both correctly guessed "Breakfast at Tiffany's", while Gemini 1.5 Pro missed the mark. Kudos to Anthropic for creating a model with solid image processing skills!
The Verdict
While Claude 3 Opus has its strengths (like following instructions and image analysis), it falls short in commonsense reasoning, math, and long-context data compared to GPT-4 and Gemini 1.5 Pro. However, there are specialized areas where it truly excels, such as rare language translation, quantum physics, and learning self-types annotation.
WHY DOES CLAUDE MATTER TO GTM PROS?
For sales, enablement, marketing, and customer success leaders, this new model offers exciting opportunities. Its ability to follow user instructions with precision can streamline content creation and personalization efforts.
Marketing teams can leverage its image analysis capabilities to create engaging visuals and enhance brand messaging.
Sales and customer success can utilize its language translation and specialized knowledge to better serve clients across the globe.
While Claude 3 Opus may not be the ultimate solution for every task, understanding its strengths and weaknesses can help GTM leaders make informed decisions and stay ahead of the curve in the rapidly evolving world of AI.
Now, how do we pick between all three? Let's summarize and break each one down:
Claude 3.0:
ChatGPT 4.0 with GPT Store:
Gemini 1.5 Advanced:
How do you even know where to start with AI?
I hear this question all the time and this last week, I spoke to a group of Sales Enablement Collective members about this very thing.
Short version: instead of focusing on the shiny new AI tech, remember that it is a tool, and tools are meant to help us do our work, or the parts of it we need help with.
To help with this, I am going to lean on a friend of mine, Mike Kunkle, who talks about a concept called COIN-OP (Challenges, Opportunities, Impacts, Needs, Outcomes, Priorities) in his Modern Sales Foundations methodology. He usually talks about it in reference to a sales discovery process, but it also works as a good change management first step when trying to identify what to do and where to start with AI.
The questions below should not only identify where AI can be most impactful but also align AI initiatives with the strategic goals of the GTM process. Here's how the COIN-OP model can be tailored for this purpose:
Challenges (C)
Opportunities (O)
Impacts (I)
Needs (N)
Outcomes (O)
Priorities (P)
Buying Process
There are so many ways AI could be applied and used inside a company, but in my opinion, none of them should move forward until a true gap analysis has been done and, once it has, a change management plan has been put into place. We will be diving more into this next week.
I hope you enjoyed this week's podcast and newsletter. Let me know if there is anything else you are missing or want to know about!