登录查看更多内容

Transforming GPT-4o into an AI Query Assistant for SFMC

Selim Sevim

Manager - Technical Integrator (Salesforce Marketing Cloud)

发布日期: 2025年3月3日

After recently starting to learn about AI models, prompt engineering, and fine-tuning, I quickly realized that public resources on integrating GPT into Salesforce Marketing Cloud (SFMC) were extremely limited—especially for practical, real-world use cases beyond content generation.

Most examples focus on generating text-based responses, but SFMC presents a unique landscape where GPT can be leveraged for automation, data extraction, and intelligent querying. There's so much more we can do.

In this article, I'll walk you through the process of transforming GPT-4o into an AI-powered Query Assistant that can interpret user queries, map them to Data Extensions (DEs), extract relevant fields, and generate SQL queries dynamically.

By the end, I'll also highlight another AI-driven applications you can build for SFMC—pushing the boundaries of what’s possible with AI in marketing automation.

Github repo: https://github.com/selimsevim/sfmc-ai-query-assistant

Key Features: What Makes This AI Query Assistant Interesting?

?? 1. Natural Language to Structured Query Understanding

Users can input questions in plain English, like: "Bring all the opens happened in last 30 days for EmailName = X" The assistant intelligently maps this request to the correct Data Extensions (_job, _open) and fields (EventDate, SubscriberKey, etc.) without the user needing to know DE names.

?? 2. Dynamic Data Extension & Field Lookup

Instead of hardcoding database tables, the assistant dynamically retrieves Data Extension (DE) references and field mappings from lookup tables in SFMC.

It cross-references user input with a DE Reference Table to determine which DEs to query.
It checks a Field Reference Table to infer the correct fields when the user doesn’t specify them explicitly.

?? 3. Intelligent Query Building with Context Awareness

The assistant doesn’t just list DEs—it understands relationships between data extensions and constructs queries accordingly with preconfigured data extensions.

? 4. SQL Query Generation with Strict Compliance Rules

No unnecessary fields – The assistant only includes the required fields in the SELECT clause.
Strict SQL rules – It enforces correct syntax, prevents invalid clauses (e.g., no CROSS JOIN, no ORDER BY unless inside ROW_NUMBER), and ensures queries are SFMC-compliant.
Joins are handled correctly – If multiple DEs are involved, it references a relationship table to determine the correct JOIN conditions.

?? 5. Automated De-Duplication Handling

When users request deduplicated data, the assistant automatically applies ROW_NUMBER() OVER (PARTITION BY … ORDER BY …) inside a subquery, ensuring only the latest or most relevant record is kept.

Breaking Down the AI Query Assistant: How It Works

User Input & First GPT Call

When a user enters a query, such as: ?? "Bring all the opens happened in last 30 days for EmailName = X"

The script extracts possible Data Extensions (DEs) and fields from two sources:

1- Direct user input (if the user explicitly names DEs or fields).

2- Lookup tables in SFMC that map user-friendly references to actual DE names and field names.

Dynamic Lookup for Data Extensions & Fields

The assistant dynamically retrieves DE and field references from SFMC’s lookup tables:

1?? DE Reference Table

Maps user-friendly terms to actual Data Extension names.
Example: The word "opens" might be mapped to "_open".

2?? Field Reference Table

Maps user-friendly terms to actual DE fields.
Example: The phrase "Open Date" could map to the "EventDate" field in _click.

Building the System Prompt for the First GPT Call

Once the relevant DEs and fields are identified, we send them to GPT-4o using a structured system prompt. The goal is to validate and refine the extracted information before proceeding to SQL generation.

The GPT system prompt includes:

? User’s input query

? Extracted DEs and fields

? Reference tables for further mapping

? Rules for handling missing DEs or fields

? JSON output format enforcement

Prompt

You are an AI that extracts Data Extensions (DEs) and fields from user queries in Salesforce Marketing Cloud.

### **Your Role:**
- Identify relevant **Data Extensions (DEs)** based on **explicit user input** OR **mapped references**.
- Identify **fields** either explicitly mentioned by the user OR inferred from the **Field Reference Table**.
- If the user explicitly states a DE, prioritize it over reference mappings.
- If no DEs can be inferred, return: `"Please explain yourself one more time."`
- If no fields can be inferred, leave the `fields` list empty.

### **Reference Table (User-Friendly Terms → Data Extensions)**
{{deMappings}} 
### **Field Reference Table (User-Friendly Terms → DE & Fields)**
{{fieldMappings}}
### **Output JSON Format (Strictly Follow This)**
```json
{
  "de_names": ["DE1", "DE2", ...],
  "fields": ["Field1", "Field2", ...]
}

.....

Expected GPT Response (First Interaction)

{ "de_names": ["_job", "_click"], "fields": ["EventDate"] }

At this point, we have successfully structured the query input and identified all necessary DEs and fields.

Retrieving All Available Fields for Identified Data Extensions

Now that we know which DEs we need, the assistant retrieves all fields for those Data Extensions using SFMC’s WSProxy:

It fetches all available columns for each DE.
It stores these fields to be used in the second GPT interaction.

Fetching Data Extension Relationships

To ensure proper joins in the SQL query, the script retrieves relationships between DEs. For example, _job and _open are related via JobID.

Building the System Prompt for the Second GPT Call

Now that we have:

? User’s refined request

? Final DE and field selection

? All available fields for those DEs

领英推荐

?? Say Goodbye to Spreadsheet Hell

Lex Sokolin 6 个月前

$Trillion Question for your CIO/CEO: Go Deep or Wide…

Rajiv Saxena 1 周前

How Coditas used ML to augment a sales acceleration…

Coditas 2 年前

? DE relationships

We send all this structured data to GPT-4o in the second interaction, instructing it to generate an SQL query.

The GPT system prompt includes:

? Strict SQL rules (no invalid clauses, optimized query structure)

? Correct table joins using relationship mappings

? Correct field selection (only existing fields, no hallucinations)

? De-duplication rules (if needed)

? Filters based on user query (e.g., last week’s data)

Prompt

You are an AI that generates SQL queries for Salesforce Marketing Cloud (SFMC). Only generate the query, do not explain anything. Follow these strict rules:

### **?? SQL Rules (Strict Adherence Required)**
- **Allowed Clauses:** SELECT, JOIN, WHERE, GROUP BY, HAVING, ROW_NUMBER.
- **Prohibited Clauses:** INSERT, UPDATE, DELETE, ORDER BY (unless inside `ROW_NUMBER() OVER (...)`), WITH.
- **Allowed Joins:** INNER JOIN, LEFT JOIN (NO CROSS JOIN).
- **Field Selection:** Do NOT use `*`. Always list fields explicitly.
- **Data Extension Naming:** Use square brackets `[ ]` for DEs ALWAYS.
- **Table Aliases:** Use aliases for readability, but keep full DE names in `FROM` and `JOIN`.
- **Query Efficiency:** Always optimize queries for readability and performance.

### **?? Data Views Handling Rules**
- **SFMC Data Views such as `_job`, `_open`, `_click`, `_bounce`, etc., must be handled with proper joins.**
- **If a user query involves opens, clicks, or sends, extract BOTH `_job` (Emails DE) AND the relevant event DE (`_open`, `_click`, etc.).**
- **Always JOIN `_job` with event-based Data Views using `JobID` or `SubscriberKey`, based on available relationships.**
- **Ensure that filters involving email sends, opens, clicks, or bounces use the correct event timestamps (e.g., `EventDate`).**

......

Expected GPT Response (Second Interaction)

SELECT o.SubscriberKey, o.EventDate, o.IsUnique, o.JobID
FROM _open o
INNER JOIN _job j ON o.JobID = j.JobID
WHERE j.EmailName = 'X' AND o.EventDate >= DATEADD(DAY, -30, GETDATE())

Challenges & Lessons Learned from Building the AI Query Assistant

Building an AI-powered Query Assistant in Salesforce Marketing Cloud (SFMC) comes with its own set of challenges. While the assistant successfully converts natural language into structured SQL queries, there are a few key areas to be mindful of:

1?? GDPR and Data Privacy

One important consideration is data privacy. If a user includes personal information in their query, such as:

WHERE EmailAddress = '[email protected]'

then that data is sent to GPT, potentially violating GDPR or other privacy regulations.

? Solution: Avoid sending PII (Personally Identifiable Information) to GPT or anonymize queries before processing them.

2?? Approaching Prompts Wisely

The prompt engineering in this assistant was an experimental approach. While it works well for general SFMC SQL queries, every business has unique data structures and SQL constraints.

? Recommendation: Adapt the prompts to match your company’s SQL rules and SFMC setup. A more precise prompt will yield better results and prevent GPT from making incorrect assumptions.

3?? Cost Considerations

Even though GPT-4o is relatively cheap, every API call still costs money. Since this assistant interacts with GPT twice per query, costs can scale up quickly if used frequently.

? Optimization Ideas:

Use caching to avoid redundant queries.
Run pre-checks before calling GPT (e.g., basic SQL templates for common queries).
Only call GPT if the user query is complex enough to need AI processing.

4?? Handling Data Views More Efficiently

I didn't focus too much on Data Views (_job, _open, _click, etc.) because they are fixed structures and don't require as much flexibility as other DEs. However, for a production-ready assistant, handling Data Views separately would be beneficial.

? Recommendation: Create a separate GPT prompt with different rules for Data Views because:

Field names and relationships are standardized in SFMC Data Views.
Query structures are predictable (e.g., _job always joins with _open, _click, etc.).

5?? Better Error Handling

The current SSJS error handling is scattered and not fully optimized. Different user queries may trigger unexpected responses, and handling errors properly is crucial for production use.

? Solution: Build a structured error-handling system that:

Detects invalid DEs/fields before sending requests to GPT.
Returns clear error messages instead of generic failures.
Logs errors for debugging and future improvements.

While doing all the tests, I only spent 0.67$.

Exploring Another AI-Powered Use Case in SFMC

Beyond SQL query generation, GPT can be used for other automation tasks in Salesforce Marketing Cloud (SFMC). One such use case is translating natural language instructions into structured JSON actions for managing Data Extensions (DEs).

This approach allows marketers, admins, and developers to issue commands in plain English, and GPT converts them into structured actionable steps that SFMC can execute.

?? Use Case: AI-Powered Data Extension Management

Instead of manually navigating SFMC’s UI to create, move, copy, or delete Data Extensions, we can build an assistant that understands natural language and generates structured JSON commands for execution.

How It Works

1?? The user enters a command like: "Move all DEs having 'Test' in their name to 'Archive'." or "Create a Data Extension by copying 'Customer_DE' but remove all PII fields."

GPT has a prompt like:

You are an AI assistant that translates natural language into structured JSON actions for Salesforce Marketing Cloud Data Extensions.

### **Instructions:**
1?? **Extract the main action type** (create, delete, update, move, copy).
2?? **Identify filter conditions** (contains, starts_with, ends_with, exact).
3?? **Standardize synonyms** (e.g., "starting with" → "starts_with", "having" → "contains").
4?? **Handle multiple actions** and return them in **sequential order**.
5?? **Always structure actions in this order when applicable**: 
   - (1) `create_folder` → (2) `get_data_extensions` → (3) `move_data_extensions`
6?? **If a folder needs to be created, do not check first. Just create it.**
7?? **Always retrieve Data Extensions before moving them.** 

### **JSON Output Format:**
```json
{
    "actions": [
        { "name": "<action_name>", "parameters": { "<param_key>": "<param_value>" } }
    ]
}

2?? GPT interprets the request and translates it into JSON-based structured actions like:

{
    "actions": [
        { "name": "get_data_extensions", "parameters": { "filter": { "type": "contains", "value": "Test" } } },
        { "name": "move_data_extensions", "parameters": { "target_folder": "Archive", "data_extensions": ["DE_1", "DE_2"] } }
    ]
}

3?? SFMC executes the actions via SSJS, parsing the JSON and performing the necessary operations.

4?? The user receives a confirmation message: ? "All data extensions including 'Test' in their name have been moved to the 'Archive' folder."

Conclusion

Integrating GPT-4o into Salesforce Marketing Cloud (SFMC) opens up new possibilities beyond simple content generation. Through this project, we explored how AI can understand natural language queries, extract relevant Data Extensions and fields, and generate structured SQL queries—helping marketers, developers, and data analysts work more efficiently.

Beyond SQL generation, we also saw how AI can automate Data Extension management, transforming plain English commands into structured JSON actions that SFMC can execute. These use cases only scratch the surface of what’s possible.

With careful prompt engineering, performance optimizations, and security considerations, AI can become a valuable assistant for SFMC users—saving time, reducing errors, and making complex tasks more accessible to non-technical users.

The key takeaway? GPT-4o isn't just a chatbot—it can be a powerful automation tool when structured properly. Whether it's querying data, managing DEs, or handling automation workflows, the right AI implementation can supercharge SFMC operations.

This project was an experiment, and there's still plenty of room for improvement. If you're looking to build your own AI-powered SFMC tools, I strongly encourage you to tailor prompts to your needs, optimize for cost and accuracy, and always consider security best practices.

Rohit Anand ?

1 周

Insightful????

1 次回应

Rodney Ackermans

1 周

Kenneth Rozendaal

1 次回应

Fran?ois-Xavier Anghui

Salesforce Business Analyst

3 周

Awesome Selim Sevim. How does it work ? How Connect GPT-4o to a marketing Cloud instance ?

1 次回应

Suman Basu

Solution Architect at Deloitte USI??| 10x Salesforce Certified |Marketing Cloud Personalization|Salesforce Data Cloud

3 周

Insightful

1 次回应

Victor Miller

3x Salesforce Certified Marketer | Marketing Technology Leader | Digital Lifecycle & Automation Expert

3 周

James Breezley - interesting use case for a cloudpage query assistant.

1 次回应

查看更多评论

要查看或添加评论，请登录

Selim Sevim的更多文章

Beyond Generic Newsletters: Alternative Email Triggers & Building Your Own AI-Powered Product Recommendation Tool

2025年3月17日

Beyond Generic Newsletters: Alternative Email Triggers & Building Your Own AI-Powered Product Recommendation Tool

After receiving the fifth generic email from a Swedish e-commerce giant in only 15 days, I started thinking: Does email…

4 条评论
Building an Automation Healthcheck Dashboard for SFMC

2025年2月10日

Building an Automation Healthcheck Dashboard for SFMC

Salesforce Marketing Cloud (SFMC) has provided AutomationInstance and AutomationActivityInstance data views for over…

10 条评论
SFMC Asset Relationship Finder: Discover Where Your Assets Are Used

2024年9月9日

SFMC Asset Relationship Finder: Discover Where Your Assets Are Used

In my 8 years of experience with Salesforce Marketing Cloud, I have always envisioned a solution that could help users…

21 条评论
Creating a Custom Journey Builder Activity: A Quick Overview

2024年8月6日

Creating a Custom Journey Builder Activity: A Quick Overview

A while ago, I created my first SFMC app and thoroughly enjoyed the technical challenges I faced along the way. I've…

36 条评论
Creating an SFMC app - A new technical challenge

2024年2月22日

Creating an SFMC app - A new technical challenge

?? Excited to share my latest venture into the Salesforce Marketing Cloud (SFMC) universe! As a developer, I've always…

8 条评论

See all articles

Key Features: What Makes This AI Query Assistant Interesting?

?? 1. Natural Language to Structured Query Understanding

?? 2. Dynamic Data Extension & Field Lookup

?? 3. Intelligent Query Building with Context Awareness

? 4. SQL Query Generation with Strict Compliance Rules

?? 5. Automated De-Duplication Handling

Breaking Down the AI Query Assistant: How It Works

User Input & First GPT Call

Dynamic Lookup for Data Extensions & Fields

Building the System Prompt for the First GPT Call

Prompt

Expected GPT Response (First Interaction)

Retrieving All Available Fields for Identified Data Extensions

Fetching Data Extension Relationships

Building the System Prompt for the Second GPT Call

领英推荐

Prompt

Expected GPT Response (Second Interaction)

Challenges & Lessons Learned from Building the AI Query Assistant

1?? GDPR and Data Privacy

2?? Approaching Prompts Wisely

3?? Cost Considerations

4?? Handling Data Views More Efficiently

5?? Better Error Handling

Exploring Another AI-Powered Use Case in SFMC

?? Use Case: AI-Powered Data Extension Management

How It Works

Conclusion

Selim Sevim的更多文章

Beyond Generic Newsletters: Alternative Email Triggers & Building Your Own AI-Powered Product Recommendation Tool

Building an Automation Healthcheck Dashboard for SFMC

SFMC Asset Relationship Finder: Discover Where Your Assets Are Used

Creating a Custom Journey Builder Activity: A Quick Overview

Creating an SFMC app - A new technical challenge

社区洞察

其他会员也浏览了

Vartik Agentic GPT

Zoho Raises the Bar in Self-Service BI with AI-Powered Zoho Analytics Platform

The In-Sourced Analytics Revolution: Leveraging Popular Tech for Sales & Marketing Transformation

'Tis the time to take Tableau out

5 key benefits of retrieval-augmented generation (RAG)

BI Is Back, and Better Than Ever: How AI Reinvigorates Business Intelligence

Model Context Protocol (MCP): The Universal Standard Transforming AI Integration

The Benefits and Usefulness of Implementing Enterprise Search Using LLM

How to enable your AI-assisted Analysts with Microsoft Copilot

Re-Organizing Analytics in the Age of AI: Self-Service Analytics is the Key.