Using Large Language Models for SFDC Account Enrichment
Image by Stable Diffusion


SFDC Account Data and the Quality Problem

Data is king for an effective go-to-market strategy. However, managing vast amounts of data can be overwhelming as your data set continues to grow year over year. My team works primarily with Salesforce Account data, and we are tasked with making that data as actionable and useful as possible. We all recognize that poor data quality can cause a range of issues, such as lost revenue, reduced productivity, and decreased customer satisfaction. These data quality issues can be costly, affecting both the top and bottom lines of any organization. According to a 2022 study by Gartner, bad data costs organizations an average of $12.9M a year.

When it comes to account data in Salesforce, I think about it in two ways. The first is firmographic data, which includes "hard" data about a company such as its address, employee count, annual revenue, industry, and parent/child relationships. This data is necessary to carve territories for sales reps and establish a segmentation approach. Leveraging third-party data providers is essential to manage any account database at scale and drive the proper classification and segmentation of accounts. As a sales operations professional, trusting the data is important, but it's also important to acknowledge that the data may be imperfect. The challenge lies in determining which data is correct and which is wrong.

A typical approach is to make the best decision with the available data and then crowdsource feedback from sales reps, with proof supporting their position that the data is incorrect. Another common approach is to bring in low-cost resources and train them to manually scrub a subset of the account base, making the data cleaner for the most critical target accounts. Both approaches are time-consuming and limited in how much data they can validate and update. In addition, if they are not run continuously as well-managed processes, the data will simply degrade again.

The second type of data is the additional attributes about an account that help drive effective campaigns and targeted sales plays, which I think of as the "soft" account data. These needs are often more nuanced, where the goal is to further segment accounts around a "propensity to buy" based on a specific business model or other market drivers. These "soft" attributes are typically not cut and dried, and different people in the organization need different attributes to suit their requirements. I have always viewed requests for this type of data as a bit fuzzy, since the value is largely tied to the specific stakeholder making the request. As a result, it is often difficult to clearly define the requirements and find the appropriate place to source the data.

Case Study: Using Large Language Models to Classify Companies

Recently, working with our business partners, we had a request come up to augment our account base with some "soft" data so we could take a more targeted approach in marketing. The scenario involved identifying each company's business model as B2B, B2C, or, in the case of companies like Google and Apple, both. This information would let us market specific products to one audience or the other more effectively.

Naturally, the first step was to assess the data sources we already had to determine whether any information in our existing dataset could provide insight. Since we didn't have anything in-house, the next logical step was to explore available data sources for purchase and assess whether the data quality would be adequate for our needs. Data can be expensive, and when requirements are nuanced, its value decreases because in some cases we are attempting to fit a square peg into a round hole.

To evaluate several data providers that could support our use cases, the business team compiled a list of a few hundred companies across the B2B, B2C, and B2B & B2C use cases and classified them by reviewing their websites to form a "human" baseline. We then sent the company list to the vendors for enrichment so we could analyze the result set against our baseline. It is important to note that this exercise was somewhat subjective, since the definition of B2B and B2C can be nuanced. As a result, I wouldn't expect any approach to provide a 100% match; we were looking for the approach that best met our definitions and requirements. As is typical, the prices we received varied greatly, and at the end of the day, the incremental cost to obtain this data was not insignificant. Additionally, the approach we considered was a "one-time" enrichment, which was suitable for our existing account base but did not address new prospect accounts entering our database, which are arguably the most promising accounts to market to.

During the evaluation process, I experienced an epiphany. Like the rest of the tech industry, I have been amazed by the introduction of and rapid evolution within the AI space. Large Language Models are simply magical. As with everyone else assessing and absorbing this new technology, I have been exploring its potential to enhance my writing, summarize key points, and solve problems or answer specific questions.

While thinking through this problem, I wondered whether we could solve it by leveraging Large Language Models (LLMs). I proceeded to test several different LLMs to see whether, through some simple prompts, we could get data sufficient to satisfy our need. After some initial testing, the results were promising, so I moved forward with a proof of concept to test operationalizing an enrichment framework built on LLMs.

To develop this framework, I recruited a talented developer to help brainstorm and build out the vision. Our requirements were straightforward, and we wanted to test a few different things:

  1. Leverage LLMs to classify companies as B2B, B2C, or Both based on system and user prompts, passing only the URL of the company's website. Could we get results comparable to or better than those from our third-party data providers? (A minimal sketch of this kind of call follows this list.)
  2. Build a near-real-time enrichment engine that analyzes new accounts on a schedule and updates a field on the SFDC account record with the new attribute in a test environment.
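
To make the first goal concrete, here is a minimal sketch of the kind of classification call we experimented with. It assumes an OpenAI-style chat API; the model name, the prompt wording, and the classify_company helper are illustrative placeholders, not our production prompt.

```python
# Minimal sketch: classify a company as B2B, B2C, or Both from its
# website URL. Assumes the OpenAI Python client; the model name and
# prompt wording are placeholders, not the exact prompts from our POC.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

SYSTEM_PROMPT = (
    "You classify companies by business model. "
    "Answer with exactly one of: B2B, B2C, Both."
)

def classify_company(url: str) -> str:
    """Ask the LLM to classify the company behind a website URL."""
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder; any chat model would do
        messages=[
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": f"Classify the company at {url}."},
        ],
        temperature=0,  # keep classification output as stable as possible
    )
    return response.choices[0].message.content.strip()

print(classify_company("https://www.apple.com"))  # expected: Both
```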

After three days, we were able to accomplish both goals. After making a half dozen adjustments to the prompt, we were able to do much better than one of the data providers and came reasonably close to another. Even without further optimizing the prompt, the results we are seeing are acceptable to the business, and we would realize roughly a 90% cost savings.
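
For the second goal, the write-back to Salesforce is conceptually simple. The sketch below shows one way to approach it with the simple_salesforce library; the custom field Business_Model__c, the SOQL filter, and the credentials are hypothetical placeholders for our actual schema and setup.

```python
# Sketch of the scheduled enrichment job: find accounts that have no
# classification yet and write the LLM's answer to a custom field.
# Uses simple_salesforce; Business_Model__c and the SOQL filter are
# hypothetical placeholders, and classify_company is the helper from
# the earlier sketch.
from simple_salesforce import Salesforce

sf = Salesforce(
    username="user@example.com",  # placeholder credentials
    password="password",
    security_token="token",
)

def enrich_new_accounts() -> None:
    """Classify and update accounts that are missing the attribute."""
    rows = sf.query(
        "SELECT Id, Website FROM Account "
        "WHERE Business_Model__c = null AND Website != null LIMIT 200"
    )["records"]
    for row in rows:
        label = classify_company(row["Website"])
        sf.Account.update(row["Id"], {"Business_Model__c": label})

enrich_new_accounts()  # run on a schedule, e.g., an hourly cron job
```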

Testing Prompts at Scale

For this specific proof-of-concept (POC) use case, we used the data the business team captured while sampling approximately 300 companies. These companies were manually classified as B2B, B2C, or both, and this data served as the crucial baseline against which we compared the subsequent LLM results.

To test prompts efficiently at scale, we developed an Enrichment Engine that lets us evaluate how prompt changes affect the LLMs' output. The engine simplifies rapidly testing multiple prompts and is designed to be agnostic to the specific LLM being used. With this architecture in place, we can test any number of prompts against any data set, and against various LLMs, as we continue our testing and evaluations.

The user selects the desired LLM (we are currently testing multiple models) and uploads text files containing multiple prompts, along with the corresponding data set to apply to each prompt. The Enrichment Engine executes the selected prompt and writes the outputs to CSV files, a structured format that makes the output data easy to analyze and use.
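
At its core, that workflow is just a nested loop over prompts and companies, with the results written out to CSV. Here is a simplified sketch; the file layout, the "url" column name, and the run_llm callable are assumptions for illustration, not the engine's actual internals.

```python
# Simplified sketch of the Enrichment Engine's core loop: run every
# uploaded prompt against every company in the data set and write one
# results CSV per prompt. Column names and file layout are assumed.
import csv
from pathlib import Path
from typing import Callable

def run_prompts(
    prompt_files: list[Path],
    companies_csv: Path,
    run_llm: Callable[[str, str], str],  # (prompt, company_url) -> label
    out_dir: Path,
) -> None:
    with open(companies_csv, newline="") as f:
        companies = list(csv.DictReader(f))  # expects a "url" column

    for prompt_file in prompt_files:
        prompt = prompt_file.read_text()
        out_path = out_dir / f"{prompt_file.stem}_results.csv"
        with open(out_path, "w", newline="") as out:
            writer = csv.DictWriter(out, fieldnames=["url", "label"])
            writer.writeheader()
            for company in companies:
                writer.writerow(
                    {"url": company["url"],
                     "label": run_llm(prompt, company["url"])}
                )
```

Passing run_llm in as a callable is what keeps the engine agnostic to the specific LLM: swapping models means swapping a single function.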

By comparing the Enrichment Engine's results to the classifications humans made during the sampling phase, we can evaluate the accuracy of the engine's outputs against the established baseline and make whatever adjustments are needed to reach the desired level of alignment.
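
The comparison itself is a straightforward agreement calculation against the human labels. A minimal sketch, assuming both files share "url" and "label" columns (the column and file names are illustrative):

```python
# Sketch of the baseline comparison: how often does an LLM results
# file agree with the human-classified sample? Column and file names
# are assumptions for illustration.
import csv

def agreement_rate(baseline_csv: str, results_csv: str) -> float:
    def load(path: str) -> dict[str, str]:
        with open(path, newline="") as f:
            return {r["url"]: r["label"].strip().lower()
                    for r in csv.DictReader(f)}

    baseline, results = load(baseline_csv), load(results_csv)
    shared = baseline.keys() & results.keys()  # companies in both files
    matches = sum(baseline[u] == results[u] for u in shared)
    return matches / len(shared) if shared else 0.0

print(f"{agreement_rate('human_baseline.csv', 'prompt_v3_results.csv'):.1%}")
```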

The Enrichment Engine is designed to be adaptable and has the potential to expand and accommodate additional formats in the future. This flexibility ensures compatibility with diverse data sources, addressing the specific requirements of users.

Leveraging Large Language Models for Enrichment and to Improve Data Quality

This single use case from our POC has opened up a flood of ideas around potentially leveraging LLMs for additional account enrichment as well as data validation to improve quality.

As an example, we're considering using this approach to validate the child and parent relationships for some of our key accounts. As mentioned earlier, the traditional method involves hiring and training low-cost resources to manually work through this process, which can be complex and error-prone. While LLMs may not guarantee perfect results for this use case, they could still help assess a large amount of data quickly and speed the overall process along. This is just one of many potential use cases where this approach could assess data quality at scale. A big open question in the corporate world around LLMs is security. We are still evaluating how to safely and securely leverage LLMs in these business contexts, but the possibilities are intriguing, and we will continue to explore them. Although we know the data will never be perfect, I'm hopeful that LLMs and these new processes can make a significant difference in improving quality while also lowering costs.


Darren Ernest

GTM Strategy | Product & Performance Marketing | Leveraging Data Analytics & AI to Drive Growth, Efficiency & Innovation | ex-Salesforce, ex-Ogilvy, ex-Publicis

11 months

Hi Glenn Vander Laan, I've been experimenting with this recently. Wondering, a year later, where you've landed with this, and whether you would be willing to connect briefly to discuss. I've been experimenting with lots of "soft" data, and I've found ChatGPT, for example, to be incredibly effective, though you really have to eyeball the results because there are glaring errors from time to time. The speed and cost tradeoff is worth it when you can get 90%+ accuracy. In fact, I've added a confidence rating to my prompt so I can more easily pinpoint the answers I need to manually verify, in addition to eyeballing the whole data set. Another technique is to have it provide a rationale or explanation for its answer, which helps with the eyeballing since I am not personally familiar with all the accounts.

Kevin Laughlin, PMP

Strategy & Business Operations | Enabling Sales Systems with GenAI

1 year

As someone who has purchased datasets and manually enriched data with a low cost team, I can appreciate testing the effectiveness of using LLMs to enrich the "hard" data to segment accounts more accurately. I can see the Enrichment Engine also finding patterns in the "soft" data that help personalize messaging to new accounts and improve the Propensity model with explainable business behaviors. Glenn Vander Laan, have you looked at training on a large sample of conversational data with won/loss outcomes to find the best customer journey experience for a given buyer persona?

Glenn Vander Laan

Senior Director, Business Process Systems Automation at Klaviyo

1 year

That's right, Richard Coffman. In order to accurately assess the best approach, you need to evaluate which combination of LLM and prompt drives the best result.

Richard Coffman

Director of Enterprise Sales @ Pro-Vigil | AI-driven Remote Crime Prevention

1 year

Good stuff Glenn Vander Laan. Dumb question, but comparing the system-generated results with the sampling data will need to be done on each use case/LLM pair in order to confirm the LLM that works best for that use case, correct?

Jackie Corcoran

Client Success Manager | Passionate about Customer Experience, Business Outcomes & Facilitating Meaningful Connections

1 year

So interesting Glenn Vander Laan! Thanks for sharing. You've got my wheels turning.
