GenAI app design flow and creating a dataset
This article is the second in the series Creating GenAI Advisor. In the previous article I identified the problem to solve and suggested six problem discovery patterns. Now I will share techniques for an app design flow that rapidly prototypes the solution and iteratively scales into a production-ready system. This article then expands on one of these techniques to create a custom dataset for the GenAI Advisor app.
App design flow
Here are a few techniques I will use to create my GenAI Advisor app flow. I hope these will help me deliver results rapidly yet scale elegantly over time.
Awesome list into dataset
When creating a GenAI app, I find it useful to start with an artifact created by a human expert which I want the AI to consume or generate. Studying the ideal artifact can help me, for example, adapt certain sections into prompts which might generate similar results using an LLM.
Such artifacts could include reports, documents, databases, transcripts, or workflows, and relate to the problem domain I discovered in the prior step. I am building GenAI Advisor, and the problem domain includes research papers and analyst reports, among others. At the back of my mind I have a related, more ambitious goal: to create a dataset for my GenAI app which I can use for retrieval, fine-tuning, model evaluation, or continued pre-training as the need arises.
Starting with a simple list
So where do I start? How do I know if I am starting with the best artifact in the first place? Maybe a good place to start is in the middle: not a specific artifact, not an entire dataset, but a simple list of the types of data sources I will need.
If I am familiar with the problem domain and have experience performing the task, then I usually have a go-to list handy. As I monitor technology and business trends for my work, I do have such a list in this particular case.
You will notice that my list requires a varying number of steps to get to the actual artifact reporting on trends. For example, I need to drill down further to list the top analysts, then go to their websites or blogs to get the reports I monitor. However, in the case of ChatGPT or Bard, I may get the trends or insights I need in a single prompt. So my list needs a hierarchy, prioritization, and categorization of some sort to be even more useful. I will park that thought for now.
I also make a mental note that this list in itself may lead to identifying the data sources I want to connect for retrieval augmented generation (RAG) in GenAI Advisor. So in some ways this is a list of microfeatures for my app. Worth the time spent iterating through it.
So, back to the list. When I am new to a problem domain, I usually seek advice from experts, follow influencers, or search for "awesome list of X" on GitHub and the Web in general. I am pretty confident I have a good list nailed down, so I won't do that in this case.
Generating ideas using chat playground
This is also a good place to query my favorite LLM for ideas. I like using the Amazon Bedrock Chat playground for generating ideas as it gives me options to (1) select and compare responses from multiple LLMs, including the proprietary Claude and the open Llama, and (2) configure the model response based on my needs. In this case I went with Claude 2.1 and the default configuration for Temperature (higher values generate more creative responses). I did change the response length to 1000 tokens to ensure all 20 rows of the table are returned.
Note that I am using a simple but specific prompt to generate the response I desire.
Prompt: Create a table with columns for 20 types of sources of GenAI trends and description.
Here is the complete table generated by Claude. There are some new and interesting trend sources not on my goto list, like job postings, hackathons, web traffic, and search autocomplete. I may want to add these to my list.
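Once the playground experiment works, the same prompt can be run programmatically. Below is a minimal sketch using the AWS SDK for Python (boto3) and the Bedrock Runtime `invoke_model` API; the model ID corresponds to Claude 2.1, and the Human/Assistant prompt format and `max_tokens_to_sample` field follow the Claude 2.x request shape. The default temperature value here is illustrative, not the playground's.

```python
import json

PROMPT = ("Create a table with columns for 20 types of sources "
          "of GenAI trends and description.")

def build_claude_request(prompt: str, max_tokens: int = 1000,
                         temperature: float = 1.0) -> dict:
    """Build the request body for Claude 2.x on Amazon Bedrock.

    Claude 2.x expects the Human/Assistant prompt format and a
    max_tokens_to_sample limit (1000 here, so all 20 rows fit).
    The temperature default is illustrative.
    """
    return {
        "prompt": f"\n\nHuman: {prompt}\n\nAssistant:",
        "max_tokens_to_sample": max_tokens,
        "temperature": temperature,
    }

def generate_trend_sources(prompt: str = PROMPT) -> str:
    """Invoke Claude 2.1 via Bedrock and return the generated table."""
    import boto3  # requires the AWS SDK for Python (pip install boto3)
    client = boto3.client("bedrock-runtime")
    response = client.invoke_model(
        modelId="anthropic.claude-v2:1",
        body=json.dumps(build_claude_request(prompt)),
    )
    return json.loads(response["body"].read())["completion"]
```

Running `generate_trend_sources()` with valid AWS credentials should return the same kind of 20-row table the playground produced.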
Challenges in generating recent data
While generating ideas for trend sources is relatively easy, using an LLM to list actual reports is far trickier. The reason is the LLM knowledge cutoff date. Trend sources can be generated from the LLM's parametric knowledge, based on its pre-training data. Listing the latest reports and links requires the LLM to use agentic capabilities to call a search tool and then generate a response. If the search results are not deep enough (number of search pages, crawling of deep links, searching within content), then the context available to the LLM will be limited, which in turn generates sub-optimal responses.
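The search-then-generate pattern described above can be sketched as a simple loop. Here `search_web` and `ask_llm` are hypothetical stand-ins for a real search tool and model call; the `depth` parameter illustrates why a shallow search starves the model of context.

```python
from typing import Callable, List

def search_then_generate(
    query: str,
    search_web: Callable[[str, int], List[str]],
    ask_llm: Callable[[str], str],
    depth: int = 1,
) -> str:
    """Agentic search-then-generate loop (sketch).

    The LLM's parametric knowledge stops at its cutoff date, so recent
    reports must come from a search tool. `depth` controls how many
    pages of results feed the context; a shallow depth means a thin
    context and sub-optimal answers.
    """
    snippets = search_web(query, depth)   # tool call for fresh data
    context = "\n".join(snippets)         # assemble retrieved context
    prompt = (
        f"Using only the sources below, list the latest {query} "
        f"reports with links.\n\nSources:\n{context}"
    )
    return ask_llm(prompt)                # generation grounded in search
```

With stub functions plugged in for `search_web` and `ask_llm`, increasing `depth` directly increases how much retrieved context reaches the prompt.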
ChatGPT 4 does this with its Bing search integration; however, there are several limitations, even with a simple prompt like this one.
It took me 10 attempts at prompt variations to get the desired results. Asking for more than a few columns (like description) in the same prompt resulted in an "I can't generate..." response. All responses returned fewer than 10 reports. Links returned were not links to the reports but to the source websites. And so on. Even this 10th attempt had issues; I don't know why SC Media is included, for instance. One of the primary reasons the results are sub-optimal is that the quick search used by ChatGPT is not deep enough. Yet the response still takes some time to generate.
Extracting a feature for GenAI Advisor
However, this quick experiment gives me a cool feature idea for GenAI Advisor which may address some of the limitations observed.
Feature: GenAI Advisor can use an LLM to continuously update its retrieval knowledge base of analyst reports, papers, etc., based on a human curated dataset of trend sources. For now I will call this feature Source Curator.
This feature validates our journey so far: the trend sources list is important. With this feature in mind, now is a good time to iterate further on the trend sources list and start maturing it into a dataset.
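A minimal sketch of what Source Curator might look like, assuming a curated list of sources and hypothetical `fetch_artifacts` and `index_for_retrieval` helpers (the real versions would call a crawler and a vector store):

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

@dataclass
class TrendSource:
    """One row of the human curated trend sources dataset."""
    name: str
    url: str
    category: str  # e.g. "analyst report", "research paper"

def curate_sources(
    sources: List[TrendSource],
    fetch_artifacts: Callable[[TrendSource], List[str]],
    index_for_retrieval: Callable[[str, List[str]], None],
) -> Dict[str, int]:
    """Refresh the retrieval knowledge base from curated sources.

    Intended to run periodically: for each curated source, fetch its
    latest artifacts (reports, papers) and index them for RAG. Returns
    a count of artifacts indexed per source for monitoring.
    """
    indexed = {}
    for source in sources:
        artifacts = fetch_artifacts(source)
        index_for_retrieval(source.name, artifacts)
        indexed[source.name] = len(artifacts)
    return indexed
```

The human curated dataset stays the source of truth; the LLM and crawler only decide what is fresh within each approved source.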
Maturing the dataset
As I think about maturing the trend sources list, my first step is to think through how I will classify and prioritize the trend sources as well as the artifacts these sources deliver. This is important for prioritizing microfeatures for GenAI Advisor. I won't add all the trend sources to the app at once. How do I know where to start? What is the 80/20 rule here? I need a taxonomy to mature my list into a dataset.
To create the taxonomy, I think about how I use the sources myself. What do I value the most? Trust trumps everything else. Have I come to trust one source more than the others? Trust has priority levels, from Origin through Mixed; I use numeric order to indicate high (1) to low (6) priority. Next, I value Complexity (or ease), Accuracy, Frequency, and finally Quantity (density). I debated whether Accuracy or Complexity should come first. For me, if I do not understand a data source easily (for example, technical papers), I will not be able to assess its accuracy anyway, so complexity comes before accuracy.
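One way to make this taxonomy concrete is as a record per source, sorted by the five measures in priority order. The field names mirror the taxonomy above; the 1-to-6 scale (1 = best) follows the Trust levels, and extending it to the other four measures is my assumption here.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class SourceRating:
    """Qualitative taxonomy for one trend source (1 = best, 6 = worst)."""
    name: str
    trust: int       # Origin (1) through Mixed (6)
    complexity: int  # ease of understanding; ranked before accuracy
    accuracy: int
    frequency: int
    quantity: int    # density of trend signals

def prioritize(sources: List[SourceRating]) -> List[SourceRating]:
    """Multi-column sort in taxonomy priority order:
    trust, then complexity, accuracy, frequency, quantity."""
    return sorted(
        sources,
        key=lambda s: (s.trust, s.complexity, s.accuracy,
                       s.frequency, s.quantity),
    )
```

The tuple sort key encodes the priority order directly: trust breaks ties first, quantity last, matching the debate about complexity before accuracy.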
Now that I have the taxonomy, I classify and prioritize my go-to list like so. You will note that it is not a perfect multi-column sort, but you get the idea of where I am going with this. Certain rankings were obvious, like industry leader content or events. Certain new ones, like job postings and Statista, ranked surprisingly high for me. It is also interesting that ChatGPT and Bard land in the middle of the order, which indicates there is an addressable problem GenAI Advisor can solve which is harder for generalized LLMs to satisfy.
Intuitively, this relatively simple taxonomy of five qualitative measures may also provide some side benefits. I want to explore if these can help in model performance evaluation, creating content sourcing guardrails, or controlling agentic actions. More on this in future articles. Another thought I parked earlier was to add hierarchy to my dataset so that I can traverse to the actual artifact (document, report, database) which my GenAI Advisor will consume or generate. I will explore solving this (or if it needs to be solved) in a future article.
Here is a summary of my progress so far. In the prior article I identified the problem to solve and six problem discovery patterns. In this article I introduced six techniques to create an app design flow. I applied one of these techniques to create a dataset for GenAI Advisor, starting from a simple list and using an LLM playground to generate ideas. I then introduced a qualitative taxonomy to mature the list into a classified and prioritized dataset which, among other benefits, will help me establish RAG sources for the GenAI Advisor app. Along the way I validated the GenAI Advisor idea, learned from the limitations of state-of-the-art LLMs, and extracted a differentiated feature: the Source Curator.
The author writes about generative AI to share his personal interest in this rapidly evolving field. The author's opinions are his own and do not represent the views of any employer or other entity with which the author may be associated.
Thanks for reading the AI for Everyone newsletter. Share and subscribe for free so you don't miss the next article and also support my work.