Transforming GPT-4o into an AI Query Assistant for SFMC
After recently starting to learn about AI models, prompt engineering, and fine-tuning, I quickly realized that public resources on integrating GPT into Salesforce Marketing Cloud (SFMC) were extremely limited—especially for practical, real-world use cases beyond content generation.
Most examples focus on generating text-based responses, but SFMC presents a unique landscape where GPT can be leveraged for automation, data extraction, and intelligent querying. There's so much more we can do.
In this article, I'll walk you through the process of transforming GPT-4o into an AI-powered Query Assistant that can interpret user queries, map them to Data Extensions (DEs), extract relevant fields, and generate SQL queries dynamically.
By the end, I'll also highlight another AI-driven applications you can build for SFMC—pushing the boundaries of what’s possible with AI in marketing automation.
Key Features: What Makes This AI Query Assistant Interesting?
?? 1. Natural Language to Structured Query Understanding
Users can input questions in plain English, like: "Bring all the opens happened in last 30 days for EmailName = X" The assistant intelligently maps this request to the correct Data Extensions (_job, _open) and fields (EventDate, SubscriberKey, etc.) without the user needing to know DE names.
?? 2. Dynamic Data Extension & Field Lookup
Instead of hardcoding database tables, the assistant dynamically retrieves Data Extension (DE) references and field mappings from lookup tables in SFMC.
?? 3. Intelligent Query Building with Context Awareness
The assistant doesn’t just list DEs—it understands relationships between data extensions and constructs queries accordingly with preconfigured data extensions.
? 4. SQL Query Generation with Strict Compliance Rules
?? 5. Automated De-Duplication Handling
When users request deduplicated data, the assistant automatically applies ROW_NUMBER() OVER (PARTITION BY … ORDER BY …) inside a subquery, ensuring only the latest or most relevant record is kept.
Breaking Down the AI Query Assistant: How It Works
User Input & First GPT Call
When a user enters a query, such as: ?? "Bring all the opens happened in last 30 days for EmailName = X"
1- Direct user input (if the user explicitly names DEs or fields).
2- Lookup tables in SFMC that map user-friendly references to actual DE names and field names.
Dynamic Lookup for Data Extensions & Fields
The assistant dynamically retrieves DE and field references from SFMC’s lookup tables:
1?? DE Reference Table
2?? Field Reference Table
Building the System Prompt for the First GPT Call
Once the relevant DEs and fields are identified, we send them to GPT-4o using a structured system prompt. The goal is to validate and refine the extracted information before proceeding to SQL generation.
The GPT system prompt includes:
? User’s input query
? Extracted DEs and fields
? Reference tables for further mapping
? Rules for handling missing DEs or fields
? JSON output format enforcement
Prompt
You are an AI that extracts Data Extensions (DEs) and fields from user queries in Salesforce Marketing Cloud.
### **Your Role:**
- Identify relevant **Data Extensions (DEs)** based on **explicit user input** OR **mapped references**.
- Identify **fields** either explicitly mentioned by the user OR inferred from the **Field Reference Table**.
- If the user explicitly states a DE, prioritize it over reference mappings.
- If no DEs can be inferred, return: `"Please explain yourself one more time."`
- If no fields can be inferred, leave the `fields` list empty.
### **Reference Table (User-Friendly Terms → Data Extensions)**
{{deMappings}}
### **Field Reference Table (User-Friendly Terms → DE & Fields)**
{{fieldMappings}}
### **Output JSON Format (Strictly Follow This)**
```json
{
"de_names": ["DE1", "DE2", ...],
"fields": ["Field1", "Field2", ...]
}
.....
Expected GPT Response (First Interaction)
{ "de_names": ["_job", "_click"], "fields": ["EventDate"] }
At this point, we have successfully structured the query input and identified all necessary DEs and fields.
Retrieving All Available Fields for Identified Data Extensions
Now that we know which DEs we need, the assistant retrieves all fields for those Data Extensions using SFMC’s WSProxy:
Fetching Data Extension Relationships
To ensure proper joins in the SQL query, the script retrieves relationships between DEs. For example, _job and _open are related via JobID.
Building the System Prompt for the Second GPT Call
Now that we have:
? User’s refined request
? Final DE and field selection
? All available fields for those DEs
领英推荐
? DE relationships
We send all this structured data to GPT-4o in the second interaction, instructing it to generate an SQL query.
The GPT system prompt includes:
? Strict SQL rules (no invalid clauses, optimized query structure)
? Correct table joins using relationship mappings
? Correct field selection (only existing fields, no hallucinations)
? De-duplication rules (if needed)
? Filters based on user query (e.g., last week’s data)
Prompt
You are an AI that generates SQL queries for Salesforce Marketing Cloud (SFMC). Only generate the query, do not explain anything. Follow these strict rules:
### **?? SQL Rules (Strict Adherence Required)**
- **Allowed Clauses:** SELECT, JOIN, WHERE, GROUP BY, HAVING, ROW_NUMBER.
- **Prohibited Clauses:** INSERT, UPDATE, DELETE, ORDER BY (unless inside `ROW_NUMBER() OVER (...)`), WITH.
- **Allowed Joins:** INNER JOIN, LEFT JOIN (NO CROSS JOIN).
- **Field Selection:** Do NOT use `*`. Always list fields explicitly.
- **Data Extension Naming:** Use square brackets `[ ]` for DEs ALWAYS.
- **Table Aliases:** Use aliases for readability, but keep full DE names in `FROM` and `JOIN`.
- **Query Efficiency:** Always optimize queries for readability and performance.
### **?? Data Views Handling Rules**
- **SFMC Data Views such as `_job`, `_open`, `_click`, `_bounce`, etc., must be handled with proper joins.**
- **If a user query involves opens, clicks, or sends, extract BOTH `_job` (Emails DE) AND the relevant event DE (`_open`, `_click`, etc.).**
- **Always JOIN `_job` with event-based Data Views using `JobID` or `SubscriberKey`, based on available relationships.**
- **Ensure that filters involving email sends, opens, clicks, or bounces use the correct event timestamps (e.g., `EventDate`).**
......
Expected GPT Response (Second Interaction)
SELECT o.SubscriberKey, o.EventDate, o.IsUnique, o.JobID
FROM _open o
INNER JOIN _job j ON o.JobID = j.JobID
WHERE j.EmailName = 'X' AND o.EventDate >= DATEADD(DAY, -30, GETDATE())
Challenges & Lessons Learned from Building the AI Query Assistant
Building an AI-powered Query Assistant in Salesforce Marketing Cloud (SFMC) comes with its own set of challenges. While the assistant successfully converts natural language into structured SQL queries, there are a few key areas to be mindful of:
1?? GDPR and Data Privacy
One important consideration is data privacy. If a user includes personal information in their query, such as:
WHERE EmailAddress = '[email protected]'
then that data is sent to GPT, potentially violating GDPR or other privacy regulations.
? Solution: Avoid sending PII (Personally Identifiable Information) to GPT or anonymize queries before processing them.
2?? Approaching Prompts Wisely
The prompt engineering in this assistant was an experimental approach. While it works well for general SFMC SQL queries, every business has unique data structures and SQL constraints.
? Recommendation: Adapt the prompts to match your company’s SQL rules and SFMC setup. A more precise prompt will yield better results and prevent GPT from making incorrect assumptions.
3?? Cost Considerations
Even though GPT-4o is relatively cheap, every API call still costs money. Since this assistant interacts with GPT twice per query, costs can scale up quickly if used frequently.
? Optimization Ideas:
4?? Handling Data Views More Efficiently
I didn't focus too much on Data Views (_job, _open, _click, etc.) because they are fixed structures and don't require as much flexibility as other DEs. However, for a production-ready assistant, handling Data Views separately would be beneficial.
? Recommendation: Create a separate GPT prompt with different rules for Data Views because:
5?? Better Error Handling
The current SSJS error handling is scattered and not fully optimized. Different user queries may trigger unexpected responses, and handling errors properly is crucial for production use.
? Solution: Build a structured error-handling system that:
Exploring Another AI-Powered Use Case in SFMC
Beyond SQL query generation, GPT can be used for other automation tasks in Salesforce Marketing Cloud (SFMC). One such use case is translating natural language instructions into structured JSON actions for managing Data Extensions (DEs).
This approach allows marketers, admins, and developers to issue commands in plain English, and GPT converts them into structured actionable steps that SFMC can execute.
?? Use Case: AI-Powered Data Extension Management
Instead of manually navigating SFMC’s UI to create, move, copy, or delete Data Extensions, we can build an assistant that understands natural language and generates structured JSON commands for execution.
How It Works
1?? The user enters a command like: "Move all DEs having 'Test' in their name to 'Archive'." or "Create a Data Extension by copying 'Customer_DE' but remove all PII fields."
GPT has a prompt like:
You are an AI assistant that translates natural language into structured JSON actions for Salesforce Marketing Cloud Data Extensions.
### **Instructions:**
1?? **Extract the main action type** (create, delete, update, move, copy).
2?? **Identify filter conditions** (contains, starts_with, ends_with, exact).
3?? **Standardize synonyms** (e.g., "starting with" → "starts_with", "having" → "contains").
4?? **Handle multiple actions** and return them in **sequential order**.
5?? **Always structure actions in this order when applicable**:
- (1) `create_folder` → (2) `get_data_extensions` → (3) `move_data_extensions`
6?? **If a folder needs to be created, do not check first. Just create it.**
7?? **Always retrieve Data Extensions before moving them.**
### **JSON Output Format:**
```json
{
"actions": [
{ "name": "<action_name>", "parameters": { "<param_key>": "<param_value>" } }
]
}
2?? GPT interprets the request and translates it into JSON-based structured actions like:
{
"actions": [
{ "name": "get_data_extensions", "parameters": { "filter": { "type": "contains", "value": "Test" } } },
{ "name": "move_data_extensions", "parameters": { "target_folder": "Archive", "data_extensions": ["DE_1", "DE_2"] } }
]
}
3?? SFMC executes the actions via SSJS, parsing the JSON and performing the necessary operations.
4?? The user receives a confirmation message: ? "All data extensions including 'Test' in their name have been moved to the 'Archive' folder."
Conclusion
Integrating GPT-4o into Salesforce Marketing Cloud (SFMC) opens up new possibilities beyond simple content generation. Through this project, we explored how AI can understand natural language queries, extract relevant Data Extensions and fields, and generate structured SQL queries—helping marketers, developers, and data analysts work more efficiently.
Beyond SQL generation, we also saw how AI can automate Data Extension management, transforming plain English commands into structured JSON actions that SFMC can execute. These use cases only scratch the surface of what’s possible.
With careful prompt engineering, performance optimizations, and security considerations, AI can become a valuable assistant for SFMC users—saving time, reducing errors, and making complex tasks more accessible to non-technical users.
The key takeaway? GPT-4o isn't just a chatbot—it can be a powerful automation tool when structured properly. Whether it's querying data, managing DEs, or handling automation workflows, the right AI implementation can supercharge SFMC operations.
This project was an experiment, and there's still plenty of room for improvement. If you're looking to build your own AI-powered SFMC tools, I strongly encourage you to tailor prompts to your needs, optimize for cost and accuracy, and always consider security best practices.
Lead Marketing Cloud Developer| Pardot Consultant|Marketing cloud Automation|Data cloud|Salesforce Admin|3X Ranger
1 周Insightful????
Salesforce CRM and Marketing Cloud Coordinator @Tilburg University | 7 x Certified Salesforce Consultant | Marketing Cloud | Sales Cloud | Data Cloud | Account Engagement
1 周Kenneth Rozendaal
Salesforce Business Analyst
3 周Awesome Selim Sevim. How does it work ? How Connect GPT-4o to a marketing Cloud instance ?
Solution Architect at Deloitte USI??| 10x Salesforce Certified |Marketing Cloud Personalization|Salesforce Data Cloud
3 周Insightful
3x Salesforce Certified Marketer | Marketing Technology Leader | Digital Lifecycle & Automation Expert
3 周James Breezley - interesting use case for a cloudpage query assistant.