Vibe Prompting and Agent Debugging
Vibe Prompting - A Two-Part Series is sponsored by Agent.ai - Discover, connect with, and hire AI agents to do useful things.
This is a follow-on to my earlier article on Vibe Prompting and Agent Building, where I borrowed from Andrej Karpathy's tweet on Vibe Coding.
While I have found this approach of leveraging ChatGPT to generate the prompts I use to build my agents extremely useful, it becomes even more valuable when debugging those agents. For those who spend a lot of time building agents, getting the agent to generate the output exactly the way you want it, every time, is a time-consuming process. There is a lot of iteration and trial and error.
Getting my agent 80% working is easy; getting from 80% to 98% requires a lot of blood, sweat, and tears.
Often you have no idea why the LLM just does not listen to you, and this is where Vibe Prompting for debugging is invaluable. The process is quite simple: I take the prompt, the input to the prompt, and the output of the prompt's execution, and dump it all into ChatGPT. Then I have a conversation with ChatGPT about why the output is not what I expected and ask it to rewrite the prompt to generate the output I want. I test the rewritten prompt in ChatGPT, and once it looks good, I drop it back into Agent.ai.
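If you want to script this loop rather than pasting into the ChatGPT UI, here is a minimal sketch of the same idea using the OpenAI Python SDK. The model name, message framing, and function name are my own illustrative assumptions, not part of the workflow above.

```python
# Minimal sketch of the debug loop, assuming the OpenAI Python SDK
# (pip install openai) and an OPENAI_API_KEY in the environment.
from openai import OpenAI

client = OpenAI()

def debug_prompt(prompt: str, input_data: str, bad_output: str) -> str:
    """Ask the model why a prompt produced the wrong output, and for a rewrite."""
    critique_request = (
        "Here is a prompt, the input it ran against, and the output it produced.\n\n"
        f"PROMPT:\n{prompt}\n\n"
        f"INPUT:\n{input_data}\n\n"
        f"OUTPUT:\n{bad_output}\n\n"
        "The output is not what I expected. Explain why, then rewrite the "
        "prompt so the output adapts to whatever input it is given."
    )
    response = client.chat.completions.create(
        model="gpt-4o",  # illustrative; use whatever model you debug with
        messages=[{"role": "user", "content": critique_request}],
    )
    return response.choices[0].message.content
```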
So, going back to our earlier example of building an agent that analyzes any CSV file: I generated all my prompts in ChatGPT, built the agent in Agent.ai, and then ran it. As expected, it did not work; the output was off.
So here is how I debug my agents using ChatGPT.
STEP 1: My Question to ChatGPT
Why is this prompt, with the uploaded dataset, generating this output? Some of the questions seem too generic and not specific to the dataset uploaded. I don't want the questions tailored to this particular dataset (it is just an example dataset), but I do want them to be specific to whatever dataset I upload.
Basically, my prompt is supposed to ask clarifying questions about the dataset, but my version of the prompt was generating extremely generic ones. Whether I uploaded Fitbit data or P&L data, it asked similar questions instead of questions specific to the data being uploaded.
STEP 2: Provide the prompt
BEGIN PROMPT
You are an AI agent that processes CSV files. Your first task is to understand the file structure and content before performing any analysis.
### **Step 1: File Structure Analysis**
1. Identify the **columns and their data types** (e.g., numerical, categorical, text, date).
2. Display **a few representative sample rows** to understand patterns in the data.
3. Detect **missing values, formatting inconsistencies, and potential errors**.
4. Identify **any special formatting (e.g., currency symbols, percentages, negative values in parentheses, timestamp formats).**
5. Determine **if any columns appear redundant, inconsistently formatted, or require renaming for clarity.**
### **Step 2: Ask Clarifying Questions**
Based on the initial understanding, generate **3-5 clarifying questions** about the dataset to confirm assumptions. The questions should:
- **Adapt to the structure and content of the dataset** (e.g., financial data, user logs, survey responses, sensor data).
- **Avoid assumptions about domain-specific use cases** unless clear patterns are detected.
- **Help refine the structure of the data** for meaningful analysis.
### **Output Format**
- **Return only the clarifying questions.**
- **No explanations, summaries, or metadata—just the list of questions.**
- **No citations.**
- **Each question should be presented as a separate line.**
- **Do not include "File_Structure_Analysis" in the output.**
#### **Expected Output Example:**
Can you confirm the intended meaning of each column, particularly the first identifier and the repeated numeric values observed in columns 4 and 5?
Should the CSV be assumed to have header names, or would you like to assign descriptive names based on the inferred data (e.g., ID, Date, Steps, Distance, etc.)?
Are zeros in certain fields to be regarded as valid entries or as placeholders for missing data?
Can you verify if the date column follows a strict MM/DD/YYYY format, or if any additional date parsing is required?
Would you like any further cleanup on potential redundant columns or inconsistent naming based on the current structure?
### **Final Action**
Now, based on the provided CSV file, **generate only the list of clarifying questions**, following the format above.
CSV File - {{assistant_files}}
END PROMPT
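In Agent.ai, {{assistant_files}} is filled in by the platform at runtime. To test the same prompt outside the platform, you would need to splice a CSV preview in yourself. Here is a rough sketch, assuming pandas; the helper name and the preview format are my assumptions.

```python
# Rough local stand-in for Agent.ai's {{assistant_files}} substitution,
# assuming pandas is installed. `template` is the prompt above, verbatim.
import pandas as pd

def build_prompt(template: str, csv_path: str, sample_rows: int = 5) -> str:
    df = pd.read_csv(csv_path)
    preview = (
        f"Columns and dtypes:\n{df.dtypes.to_string()}\n\n"
        f"First {sample_rows} rows:\n{df.head(sample_rows).to_string()}"
    )
    return template.replace("{{assistant_files}}", preview)
```

The resulting string can then be sent to the model the same way as in the debug-loop sketch above.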
STEP 3: Share the output I am not happy with. You can see that the clarifying questions my prompt generated are too generic; it does not look like they were generated based on the dataset I uploaded.
BEGIN OUTPUT
"Do you confirm that the first row of the CSV represents the column headers with accurate descriptive names for each field?
Should numeric values formatted with currency symbols, commas, or parentheses be converted to standard numeric types for analysis?
Are there any date or timestamp columns, and if so, could you specify the expected format (e.g., MM/DD/YYYY)?
Are there any special formatting rules or markers (e.g., percentages, negative values in parentheses) that need to be consistently processed?
Do any columns appear redundant or require renaming to better reflect their content or purpose in the dataset?
END OUTPUT
STEP 4: ChatGPT analyzes all this information and explains why the output is off and what should be done to fix it.
Got it! Your goal is to have the AI generate clarifying questions that are specific to the dataset uploaded, regardless of whether it's P&L data, Fitbit data, Credit Card data, or any other type of dataset.
The issue with your current prompt is that it guides the AI towards general data-cleaning issues rather than helping it detect the dataset's unique structure and purpose. Below is a refined approach that ensures the AI adapts better:
STEP 5: ChatGPT generates a new prompt, which I then ask it to rerun.
Can you run this prompt with this data?
STEP 6: I validate the output and, if it looks good, update the prompt in Agent.ai. You can see how these questions are far more specific to the dataset I uploaded (a Manufacturing P&L).
Based on the uploaded Manufacturing P&L dataset, here are 5 clarifying questions specific to its structure and inferred purpose:
Should the empty unnamed columns be removed, or do they contain meaningful hidden data (e.g., formatting for subcategories)?
The column headers seem misaligned—should we assume that the first row contains actual headers, or should we restructure them?
Are the revenue and cost figures formatted as currency (e.g., "$1,434,665"), and should they be converted into standard numeric values for analysis?
The dataset includes multiple time periods (Three Months Ended, December 31, 2024 & 2023)—should comparisons be calculated, or are these already adjusted values?
Does this dataset follow a standard financial reporting format (e.g., GAAP, IFRS), or are there custom adjustments that should be noted?
You can see how different this output is from what it had initially generated.
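One crude way to sanity-check this before pasting the new prompt back into Agent.ai is to count how many of the generated questions actually mention column names from the uploaded file. This heuristic is entirely my own addition, not part of the workflow above.

```python
# Crude specificity heuristic (my own addition): the fraction of generated
# questions that mention at least one real column name from the CSV.
import pandas as pd

def specificity_score(questions: list[str], csv_path: str) -> float:
    columns = [str(c).lower() for c in pd.read_csv(csv_path).columns]
    hits = sum(any(col in q.lower() for col in columns) for q in questions)
    return hits / len(questions) if questions else 0.0
```

Generic questions like the first batch should score near zero, while dataset-specific ones like the batch above should score noticeably higher.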
What I have realized is that ChatGPT knows how to talk to ChatGPT (prompting) far better than I do, so why not have it come up with the prompts?
So how did this all end up?
Here is a video where I go through this entire process.
Here is the agent in Agent.ai that you can test.
This is the dataset from Kaggle that I used.
Here is an output that I get when analyzing website visitors (uploaded as CSV).
So my call to action for anyone who has made it this far: go sign up for Agent.ai and start building some agents using Vibe Prompting. What are you waiting for? ChatGPT will generate all your prompts for you.