An (AI) Diagram is Worth a Thousand Words
If, like me, you've been using Claude Projects for text analysis and generation, you will quickly accumulate multiple related conversations of prompts, with the related artifacts soon numbering in the hundreds.
The current AI tooling is pretty weak when it comes to managing the work products once they've sprouted up after a few productive sessions. There's a lot of work being done to improve the developer experience, with tools like GitHub Copilot, Cursor, CodeBuddy, etc.
And last week Anthropic promoted its partnership with Notion to help provide an integrated experience to content creators.
But there's a long way to go due to limited integration between the front ends and what the models can do on the server. In this article, I will give you an idea of how I try to organize my pro-level prompting by extracting metadata using Claude, and generating documentation and process diagrams.
Let's Start at the End
I begin my narrative at the end of prior work which I won't go too much into - suffice it to say I did a lot of prompting, created a lot of artifacts, refined the prompts, refined the artifacts, and now I want to wrap this up with a view to optimizing the overall information flow and implementing an automated process capable of doing the equivalent unattended.
The result of that work is that I had about 100 artifacts in multiple related conversation threads, with even more prompts. I had started going through each one manually, writing them down, with many hours of tedium ahead of me, but then - genius (or laziness, depending on your point of view) struck! I decided to get Claude to do my work for me.
What Claude Knows
Each Claude conversation is kept separate from others, even within a Claude Project. The technical term for this is "bummer" and there's no function to search through the project to find particular conversations.
In fact, Claude knows what's going on when pushed about it, but it doesn't have access to some pretty basic stuff:
What it does know is mostly about the current conversation:
It can actually examine the conversation, going forwards or backwards, look for things, and apply processes to each item:
This is really all I need it to do - go through each conversation, find the interesting prompts, and gather them in a document.
Creating a prompt to gather the prompts
In order to create my documentation, I'm going to have to go into each conversation and ask it to go through it and find all prompts that produce artifacts, and list the inputs used to produce those artifacts. I can't prompt it to go through all the conversations, so I'll have to do that manually, but once in a conversation I'll issue my prompt and it will generate the document with all the other prompts.
This is what my first prompt did:
Please create a markdown artifact describing an artifact generation process. Go to the start of the conversation and proceed prompt by prompt to the end. For each prompt that generates an artifact, quote the prompt, list the input files, and list the produced outputs.
This produces an interesting document including the actual quoted prompts:
In an oddly self-referential way, it even includes its own prompt:
Improving the Prompts' Prompt
Not a bad start, but since my goal is to avoid manual rework, I refine the prompt until I can use the documentation as is:
Please create a markdown artifact describing the conversation's artifact generation process. Use the subject of the conversation as the default name for the artifact and ask me for confirmation before proceeding, or if I would prefer to use a different name.
Start the artifact with the physical file name (no extension), then on the next line the name as Title. Start the physical file name with 'process-generate-' to group all the generation process files together
Go to the start of the conversation and proceed prompt by prompt to the end of the conversation, but skip the current prompt. For each prompt that generates an artifact, create a section with a name summarizing the prompt, then
1) quote the prompt in a code block,
2) list the input files either attached to this prompt or from previous prompts, using typical markdown syntax to link to them,
3) list the artifacts used, using typical markdown syntax to link to them,
4) and list the produced artifacts, using typical markdown syntax to link to them.
Then review the prompts to ensure they are unique. Go through each prompt one by one, and if there is a prompt you suspect supersedes another one, please let me know and wait for my approval before proceeding. I trust you to tell if a prompt is unique; just move on to the next one in that case.
Once all the prompts have been reviewed, if there were any superseded ones, please create a new normalized artifact generation process.
Running this generates a nicely formatted file without redundant prompts, ready to be used for further analysis. Now I only need to run it in each of my conversations to create an inventory of the prompts that I used, with the inputs and outputs produced.
Creating a Mermaid Chart
Ever since ChatGPT 3.5 came out, people have been asking LLMs to generate Mermaid charts. You can create flow charts, Gantt charts, and class diagrams from simple text commands, which makes the format perfect for LLMs. Even better, you can preview the chart directly in Claude as an artifact.
The actual format is intended to be scriptable by a human:
But it can get tedious.
After extracting the prompts and files/artifacts into several documents, I decide to ask it to generate a Mermaid chart to see the full prompting workflow visually:
And this gives me a Mermaid chart:
I decide to refine it to see the files and artifacts.
This is not that useful either, considering how cluttered and somewhat random the layout is. I asked it to simplify by not showing all of the files and artifacts, just a label describing them:
For some reason it introduced a pink color that was quite abominable. This was easy to fix, but there seemed to be stuff missing. I asked it to go through the file step by step again and got this:
Very cool looking, but completely wrong - somehow it came up with various "logical" connections that were not actually present in the inputs.
This shows that I am probably stuffing too much into a single prompt. But the Mermaid chart format also has its limitations, so instead of continuing to work on this one - which would be fine for a simpler use case - I decide to try something more powerful.
Creating a diagram with Draw.io
I decided to look for a tool with more control, but one that still had an XML representation. I didn't want to use Visio because I can't be bothered to pay for it; I wanted something with a free tier. Perplexity helped me find Draw.io, which has many drawing capabilities and an XML representation.
Creating an Example
I started by creating an example of how I wanted the diagram to appear, using the visual editor in draw.io. I chose three types of shapes - a yellow tear-off document shape for outside files, a green rectangle for processes, and a blue hexagon for artifacts created by Claude in the Claude Project. The arrows would show the direction of data flow.
This diagram is described with the following XML:
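(If you haven't seen the draw.io format before, here is a stripped-down sketch of such a file - the shape ids, positions and header attributes below are illustrative, not the actual exported file; the styles are the ones used for the example:)

<mxfile host="app.diagrams.net" agent="drawio" version="24.0.0">
  <diagram id="example-diagram" name="Page-1">
    <mxGraphModel dx="800" dy="600" grid="1" page="1">
      <root>
        <!-- the two mandatory layer cells -->
        <mxCell id="0" />
        <mxCell id="1" parent="0" />
        <!-- yellow tear-off document for an outside file -->
        <mxCell id="input-1" value="Input 1" style="shape=document;whiteSpace=wrap;html=1;boundedLbl=1;fillColor=#fff2cc;strokeColor=#d6b656;" vertex="1" parent="1">
          <mxGeometry x="40" y="80" width="120" height="60" as="geometry" />
        </mxCell>
        <!-- green rectangle for a process -->
        <mxCell id="process-1" value="Process 1" style="rounded=1;whiteSpace=wrap;html=1;fillColor=#d5e8d4;strokeColor=#82b366;" vertex="1" parent="1">
          <mxGeometry x="240" y="80" width="120" height="60" as="geometry" />
        </mxCell>
        <!-- blue hexagon for a Claude-created artifact -->
        <mxCell id="artifact-1" value="Artifact 1" style="shape=hexagon;perimeter=hexagonPerimeter2;whiteSpace=wrap;html=1;fixedSize=1;fillColor=#dae8fc;strokeColor=#6c8ebf;" vertex="1" parent="1">
          <mxGeometry x="440" y="80" width="120" height="60" as="geometry" />
        </mxCell>
        <!-- arrows showing the direction of data flow -->
        <mxCell id="input-1-to-process-1" edge="1" source="input-1" target="process-1" parent="1">
          <mxGeometry relative="1" as="geometry" />
        </mxCell>
        <mxCell id="process-1-to-artifact-1" edge="1" source="process-1" target="artifact-1" parent="1">
          <mxGeometry relative="1" as="geometry" />
        </mxCell>
      </root>
    </mxGraphModel>
  </diagram>
</mxfile>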
Describing a Valid XML Structure
Unfortunately, as frequently happens, the application's XML format was undocumented, but this is no problem as I regularly spelunk into unknown formats with my AI buddy, as previously described.
The first thing was to upload the file into Claude and get it to describe the structure using a standard XML Schema Definition (XSD).
The XML Schema is itself another XML document - not that user-friendly, but just fine for an LLM or other software that needs to understand document structure:
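(The schema Claude extracted isn't reproduced in full here; as a rough sketch of what an XSD for the top-level elements looks like - the element and attribute coverage below is simplified and illustrative:)

<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <!-- root element of a draw.io file -->
  <xs:element name="mxfile">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="diagram" maxOccurs="unbounded">
          <xs:complexType>
            <xs:sequence>
              <!-- graph model left unconstrained in this sketch -->
              <xs:element name="mxGraphModel"/>
            </xs:sequence>
            <xs:attribute name="id" type="xs:string"/>
            <xs:attribute name="name" type="xs:string"/>
          </xs:complexType>
        </xs:element>
      </xs:sequence>
      <xs:attribute name="host" type="xs:string"/>
      <xs:attribute name="agent" type="xs:string"/>
      <xs:attribute name="version" type="xs:string"/>
    </xs:complexType>
  </xs:element>
</xs:schema>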
I added the schema to the Project Knowledge. Of course, this schema is just a subset of the complete file format, based on the example I was interested in. However, it does just fine, since I'm going to be creating new files that match it; the rest I don't need.
Instead of extracting a schema, I probably could have gotten it to just use the example file as a starting point, but part of me is paranoid that it might be unduly influenced by the content of the example in terms of names or IDs. By extracting the schema, I'm attempting to keep it from being distracted by the example and hallucinating irrelevant diagram elements.
(Working with JSON structures, I would create TypeScript types. If I were generating SQL, I would create table definitions. If I were working with COBOL, I would create a "copybook", etc.)
First Attempt: Generating From the Schema
At first, I tried generating the diagram from my schema.
The result wasn't bad - however, the connections' attachment points weren't all that good, so even playing around with the automatic layout, I couldn't get a clean diagram without a lot of mousing around.
I decided to refine my prompt to get the connections to be a bit better. I had to refine it a few times because it generated some small issues that prevented the application from loading the file, e.g. all the values of the id attributes were blank, a previously cached version of the XML was being used, etc.
The results were disappointing: many items missing, ugly-looking content. By this time I had started a new conversation because the previous one was too big and running out of tokens. In the new conversation, I was relying on just the XML Schema, so it lost some of the information from the original example and gave me something like this:
Second Attempt: Claude Writes a Prompt
I decided to ask Claude to generate a prompt to do this instead of my fooling around in the dark. First, I asked it to create a prompt to reproduce the example I had started with. This is what it gave me:
# Prompt: Generate a draw.io XML Structure
Create an XML structure for a draw.io file that represents a simple process flow diagram. The diagram should include the following elements:
1. Two input documents labeled "Input 1" and "Input 2"
2. Two process boxes labeled "Process 1" and "Process 2"
3. Two artifact hexagons labeled "artifact 1" and "artifact 2"
4. Arrows connecting these elements to show the flow
Follow these specifications:
- Use the `mxfile` element as the root, with attributes for `host`, `agent`, and `version`
- Include one `diagram` element with a name and id
- Use `mxGraphModel` to define the graph properties
- Represent each shape and connection with `mxCell` elements
- Use appropriate styles for each shape type:
- Documents: `shape=document;whiteSpace=wrap;html=1;boundedLbl=1;`
- Process boxes: `rounded=1;whiteSpace=wrap;html=1;`
- Hexagons: `shape=hexagon;perimeter=hexagonPerimeter2;whiteSpace=wrap;html=1;fixedSize=1;`
- Set different fill colors for each shape type:
- Documents: `fillColor=#fff2cc;strokeColor=#d6b656;`
- Process boxes: `fillColor=#d5e8d4;strokeColor=#82b366;`
- Hexagons: `fillColor=#dae8fc;strokeColor=#6c8ebf;`
- Use `edge="1"` for connection arrows
- Set appropriate `geometry` for each shape, defining its position and size
Ensure that the XML structure is well-formed and follows the schema we defined earlier. The resulting XML should be a valid draw.io file that, when opened in draw.io, displays a clear and organized process flow diagram.
Now I could ask it to generalize this prompt.
And it gave me the following:
# Prompt: Generate a Generic draw.io Process Flow XML Structure
Create an XML structure for a draw.io file that represents a process flow diagram. The diagram should include input documents, process boxes, and artifact hexagons, with arrows connecting these elements to show the flow. Use the following information provided separately:
- Number and labels of input documents
- Number and labels of process boxes
- Number and labels of artifacts
- Flow connections between elements
Follow these specifications:
1. Root Structure:
- Use the `mxfile` element as the root, with attributes for `host`, `agent`, and `version`
- Include one `diagram` element with a name (e.g., "ProcessFlow") and a unique id
- Use `mxGraphModel` to define the graph properties
2. Shape Representations:
- Represent each shape and connection with `mxCell` elements
- Use appropriate styles for each shape type:
- Input documents: `shape=document;whiteSpace=wrap;html=1;boundedLbl=1;`
- Process boxes: `rounded=1;whiteSpace=wrap;html=1;`
- Artifacts: `shape=hexagon;perimeter=hexagonPerimeter2;whiteSpace=wrap;html=1;fixedSize=1;`
3. Color Scheme:
- Set different fill colors for each shape type:
- Input documents: `fillColor=#fff2cc;strokeColor=#d6b656;`
- Process boxes: `fillColor=#d5e8d4;strokeColor=#82b366;`
- Artifacts: `fillColor=#dae8fc;strokeColor=#6c8ebf;`
4. Connections:
- Use `edge="1"` for connection arrows
- Ensure that source and target attributes correctly link the shapes
5. Layout:
- Set appropriate `geometry` for each shape, defining its position and size
- Arrange input documents on the left side of the diagram
- Place process boxes in the center
- Position artifacts on the right side
- Adjust positions to accommodate the number of elements and maintain clarity
6. Naming and Labeling:
- Use the provided labels for each input, process, and artifact
- Ensure that the `value` attribute of each `mxCell` contains the correct label
7. XML Structure:
- Ensure the XML is well-formed and valid according to the draw.io file format
- Use appropriate indentation for readability
The resulting XML should be a valid draw.io file that, when opened in draw.io, displays a clear and organized process flow diagram reflecting the provided inputs, processes, and artifacts.
Note: This prompt assumes that the specific details about the number and labels of inputs, processes, artifacts, and their connections will be provided separately. Adjust the generated XML structure accordingly based on this information.
The first attempt wasn't bad but it only generated about 10% of the full diagram. I added the following at the end:
IMPORTANT: GO THROUGH THE ENTIRE ARTIFACT AND DO ALL PROCESSES; DO NOT SKIP ANY PROCESSES, INPUTS OR OUTPUTS!
Still no improvement. I added:
I WILL BE VERIFYING!
This worked but started generating an enormous number of connections, as if it were overcompensating and wanted to connect everything to everything:
Sidebar: Motivation and the LLM
The threat of verification sometimes seems to help in generating a long answer. Now, as I have written elsewhere, I don't hold the belief that LLMs are conscious entities trapped in silicon. If one day science can tell us what consciousness is and prove that LLMs have it, then I'll go along with it. But for the moment I put my faith in Occam's razor and believe that we are dealing with a simulation of human thought and communication.
So it can be tempting to scoff at adding phrases that would imply some sort of emotional response from the LLM. Or some superstition on the part of the Prompter.
Nevertheless, part of this simulation that is LLM text completion is the presence of human response patterns baked in through Reinforcement Learning from Human Feedback (RLHF). This is a process where human trainers judge different answers according to what humans prefer. Perhaps the human judges were fine with partial answers, having gotten the gist of where the LLM was going, rather than having to wait for the full output to be prepared.
This seems to give LLMs a tendency to cut off the data in the case of long answers, leaving the rest for us to do. Basically a TL;DR cranked out by trillions of computations.
Sometimes that's fine and we don't actually want 100s of lines of output (which consumes our tokens!), we just want to get the gist of things. But other times we do want the full thing.
Hence the utility of the threat of being caught out by the human if things are left out: it lets the LLM know what the situation is, so it knows whether to play the part of the blasé LLM that assumes you're a man or woman of the world who can recognize the pattern after the first few iterations, or whether it needs to be more of the accountant type that will hound you for being $0.50 off in your totals.
Interactive Verification Pattern
In fact, we can also add a step in the prompt to check things one by one and wait for your approval - this actually makes the verification threat a real thing, as you will now go through each of the items and approve the proposed processing. Another variant is to have it go through this process but only stop if there's something wrong.
For example:
If in doubt, you can have it print something for each item it skips over. You can have the output formatted and placed in an artifact. With some minor adjustments, I used this a few times to cross-check that my unit tests were all coherent with the specifications, and that the specifications covered all of the code in my classes.
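A bare-bones version of such a verification prompt - the wording here is just a sketch of the pattern, not one of my actual prompts - might be:
Go through the specifications document one requirement at a time. For each requirement, tell me which unit test covers it and wait for my approval before moving on to the next one. If you cannot find a matching test, stop and flag it.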
I'm not sure why this pattern works so much better than when making it process the same inputs silently - maybe something to do with parallelism? It resembles the problems LLMs have counting letters in words, when they don't first consider the word as an array of letters so as to better traverse it.
Creating a Hybrid Prompt
As is often the case, I started by playing around with my prompts, then I asked Claude to create some improved prompts, and then I was back to putting in a little human ingenuity.
I guessed that the either underwhelming or overwhelming results were due to some confusion about how the processing should proceed.
Looking at the beginning of the prompt generated by Claude, we humans understand what needs to be done:
Create an XML structure for a draw.io file that represents a process flow diagram. The diagram should include input documents, process boxes, and artifact hexagons, with arrows connecting these elements to show the flow. Use the following information provided separately:
Number and labels of input documents
Number and labels of process boxes
Number and labels of artifacts
Flow connections between elements
But there was something about this combination of directives that the LLM didn't get - even though it was written by the LLM itself!
One thing to beware of is that LLMs answer questions about their own processing confidently, even though in reality they don't have any actual information about it. Hence there's no need to be overly surprised if what an LLM tells us LLMs like isn't really true. It's not always wrong, but it's not always right either.
I decided to do a few things:
Here was the result of my cleanup attempt:
Go through the processes summary artifact and go through the processes one at a time.
For each process, create a process box.
Go through its inputs.
For each input file, create an input shape unless it was already created for a previous process. Create a connection arrow from the input to the process.
For each input artifact, create an artifact shape, unless it was already created for a previous process. Create a connection arrow from the input artifact to the process.
For each output artifact, create an artifact shape, unless it was already created for a previous process. Create a connection arrow from the process to the artifact.
Unit of Traversal Focus
When faced with a situation where not all of the input is being processed, we can use emphasis and threats in all caps to convince it not to leave anything out. We should also make it easier for it by telling it how to break down the input and ensure it traverses all of it, at the right level of granularity.
When we don't do that and we just casually ask it to read the full input, the LLM acts as a speed reader and picks out interesting or randomly sampled elements it finds representative, based on its Attention weights.
The result is that you end up getting an output that just features a small number of sample items, with a comment that it's leaving the rest of the exercise to the reader, as per any good math book. However, this is not so useful, so we need to make it understood that we want the full content.
It seems to be much more successful when we define what division of the input should be its focus when traversing, and then give it rules to apply to each of those focused items.
For example, if the input is a text file, you can tell it to traverse line-by-line and apply various rules on each.
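A concrete instruction of that kind - the wording here is just an illustration of the pattern - could be:
Go through the attached file one line at a time. For each line, apply the rules below and add one row to the output table. Do not skip or summarize any lines.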
In the case of the prompting process documentation that I was using, the inputs were more complex paragraphs of various facts that went together to define each process. Each explained what prompt to use, on what piece of information, with what outputs. So processing line-by-line would not be a complete enough unit of data to focus on. That's why I told it to traverse each process found in the input.
The result was still incomplete. It appears the instructions combined too many levels of processing to be clear. At that point I took a step back and decided to simplify the process by driving all of the types of objects separately, and using the previous steps to make this one simpler.
Preparing the Data with Claude
Considering the lack of success, I decided to divide the problem further and first convert the process documentation into tables that could more easily be converted into diagrams. I figured this was also a way to check the validity of the processing partway through, which was difficult to do with everything in a single prompt.
I used a new prompt to convert the free-form documentation into tabular form, focusing on the connectivity between files, artifacts and processes:
In fact this created one artifact that had two tables, one with the files and artifacts (created by Claude):
And one with the processes themselves, sometimes reusing the same files or artifacts:
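(To give an idea of the shape - the rows and names below are placeholders, not the actual entries - the process table was roughly along these lines:)
| Process | Input Files | Input Artifacts | Output Artifacts |
| process-1 | input-file-1 | - | artifact-1 |
| process-2 | - | artifact-1 | artifact-2 |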
Revisiting the Focus of Traversal
Now each line of the input contained a complete enough unit of information that the LLM could work on.
I reworked the traversal, focusing first on the top-level step of laying out the file, process and artifact objects onto the diagram.
Go through the processes summary one line at a time.
- For each input file, create an input shape.
- For each process, create a process shape.
- For each artifact (input or output or both), create an artifact shape.
Then it could take care of the connections. I could have it traverse all of the processes, and connect to the files or artifacts in each process's row in the input table. (There is some resemblance to what our earlier prompts were trying to do, but by first creating the more structured tables of information, the traversal of the processes seems much more easily accomplished):
Go through all the processes. For each process, do the connections:
- Go through its input files. For each input file, create a connection arrow from the input file to the process.
- Go through its input artifacts. For each input artifact, create a connection arrow from the input artifact to the process.
- Go through its outputs. For each output artifact, create a connection arrow from the process to the output artifact.
The rest is just formatting which I left as is.
The full prompt:
Go through the processes summary one line at a time.
- For each input file, create an input shape.
- For each process, create a process shape.
- For each artifact (input or output or both), create an artifact shape.
Go through all the processes. For each process, do the connections:
- Go through its input files. For each input file, create a connection arrow from the input file to the process.
- Go through its input artifacts. For each input artifact, create a connection arrow from the input artifact to the process.
- Go through its outputs. For each output artifact, create a connection arrow from the process to the output artifact.
Follow these specifications:
1. Root Structure:
- Use the mxfile element as the root, with attributes for host, agent, and version
- Include one diagram element with a name (e.g., "ProcessFlow") and a unique GUID id
- Use mxGraphModel to define the graph properties
2. Shape Representations:
- Go through all of the processes one by one, its inputs file and artifacts, drawing each shape and connection with mxCell elements. If the input file or artifact was already drawn, just reuse it; use the file/artifact/process id as the id value, and for arrows concatenate the source and target e.g. id="input-1-to-process-2"
- Use appropriate styles for each shape type:
- Input documents: shape=document;whiteSpace=wrap;html=1;boundedLbl=1;
- Process boxes: rounded=1;whiteSpace=wrap;html=1;
- Artifacts: shape=hexagon;perimeter=hexagonPerimeter2;whiteSpace=wrap;html=1;fixedSize=1;
3. Color Scheme:
- Set different fill colors for each shape type:
- Input documents: fillColor=#fff2cc;strokeColor=#d6b656;
- Process boxes: fillColor=#d5e8d4;strokeColor=#82b366;
- Artifacts: fillColor=#dae8fc;strokeColor=#6c8ebf;
4. Connections:
- Use edge="1" for connection arrows
- Ensure that source and target attributes correctly link the shapes
5. Layout:
- Set appropriate geometry for each shape, defining its position and size
- Arrange input documents on the left side of the diagram
- Place process boxes in the center
- Position artifacts on the right side
- Adjust positions to accommodate the number of elements and maintain clarity
6. Naming and Labeling:
- Use the provided labels for each input, process, and artifact
- Ensure that the value attribute of each mxCell contains the correct label
7. XML Structure:
- Ensure the XML is well-formed and valid according to the draw.io file format
- Use appropriate indentation for readability
The resulting XML should be a valid draw.io file that, when opened in draw.io, displays a clear and organized process flow diagram reflecting the provided inputs, processes, and artifacts.
Because Claude has an output limit, only generate 100 lines at a time and give a sequential name to each fragment. Check the last line of the previous artifact and make sure not to generate that same text a second time - keep going from beyond the last previously generated character.
And now - the full diagram created in all of its gory detail:
Value Created
I probably used a thousand words to create this final diagram. However, next time, I'll just reuse my prompts and have it done in under a minute.
I used Claude's ability to introspect on its prompts as a way to help me manage my Gen AI work. I expect that with more usage and feedback, we should be getting better introspection and better integration, which will make the Gen AI workflow even more productive.
(I actually created a script with Claude to package up files and transfer them back and forth between Claude Projects and my computer - and saw that quite a few other programmers had done something similar.)
I also find that, in order to achieve more ambitious work such as extracting processes and representing them as usable diagrams, some investigation is required which may be more involved than most people would be willing to do: it requires time people don't have, and some technical knowledge to guess at what solutions might work and to overcome formatting errors for which the only application message may be "Invalid File Format".
However, as those solutions become packaged and made available as actual tools, we can expect that, whatever the philosophical underpinnings of LLMs, knowledge workers will want and need to use those tools instead of doing the work manually.
Imperfect as these tools are today, the extension they give knowledge workers cannot be denied. Without falling prey to the Fear of Missing Out, it's definitely worth your while to learn more about them.
As a consulting coder and a daily user of Gen AI, Martin Béchard enjoys learning and sharing new power tools for knowledge work.