BehPMGPT: Building a Behavioral Product Manager AI assistant (with code ;))
Justo Hidalgo
Chief AI Officer at Adigital. Highly interested in Responsible AI and Behavioral Psychology. PhD in Computer Science. Book author, working on my fourth one!
Being a Product Manager used to be a simple job
No, don't be harsh on me. I know that your baggy eyes, your constant nervous state, and your lack of empathy for the human race are no invention. What I mean is that we had one job: based on a product idea, define the steps a user would take to reach the goal: find a search result, hire a plumber, or rent a car.
But we followed a flawed premise: that users were rational and, therefore, we just had to define a path that rational users would figure out how to follow.
No. Kahneman, Sunstein, Thaler, Tversky, and many others showed us that users are irrational, regardless of what we might think of ourselves. There are lots of biases that we humans have, and they lead us to make irrational decisions.
So yes, it's true. Now, our roles as Product Managers, Product Leaders, Product Designers, Product Whatever... have a new peak to climb: how can we build great products and services that ACTUALLY help our irrational users reach their goals? Very tough question but, clickbait alert, there is a follow-up question that we will only ask at the end of this article.
How can we build great products and services that ACTUALLY help our irrational users reach their goals?
I was kindly invited to give a talk about this issue at the first Product Fest, organized by the great The Hero Camp. They had the good idea of asking their community what they would like this conference to be. And the community answered with what they DID NOT WANT: another talk-only conference.
They asked me if I could have a session about Product Management and Behavioral Psychology... without it being a 1-hour talk.
And I said:
Hold my beer. Why don't we add AI to the equation?
The funny thing is that I made José Manuel Pérez Prado and Manuel Aguilar-Amat Orna think this was an effort I was going to do for them. No. The only reason I proposed using AI is because...
I am REALLY LAZY
Once ChatGPT shocked the world and, especially, once there started to be ways to combine the best of the current state of the art in "standard" language processing with the new Large Language Models, it was clear to me (and to many more, obviously) that the ability to "talk to your documents" could become a reality.
And for me, this is a fundamental, almost philosophical, switch in how my work could change.
My professional activity has evolved throughout the years for, I believe, three main reasons:
What if #3 could be partially moved to an AI assistant so it would potentially decrease #2 - or, keeping #2 constant, would open up my time to new productivity levels?
So I built...
BehPMGPT (pronounced [beh-pee-em-jee-pee-tee])
A horrible name for a quite useful personal tool.
What's BehPMGPT? It's basically the knowledge I have produced throughout the years I've spent around the product world. In my personal tool, I used many different resources, but for the workshop I gave at Product Fest, I simplified it to two main sources (warning: blatant self-promo incoming!!!): two of my books about product.
The idea is simple: if an AI can "read" my books and then answer questions about what I wrote, it would make my life easier when I want to build new products and services. The AI would capture the basics of how I approach my role, but without letting me forget specific steps, approaches or elements.
BehPMGPT is an AI chatbot that makes use of the information contained in books I have written about product and behavioral psychology.
And this is the result.
Enough moomba joomba - now get technical please
Of course, I'm here to please you. Let's see what's in the backstage.
The first thing I have to say is that I followed the lead of a few great posts that explained how to build this in detail. Shout out to the people behind Quivr, Gary Stafford, Cobus Greyling, Nagesh Mashette and Sami Maameri, among, I'm sure, many others I found while hectically searching the internet.
The approach is that of what is now known as RAG: Retrieval Augmented Generation. It's a two-phase process. First, the indexing process, then the query process.
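Before any library code, the whole RAG idea fits in a few lines of plain Python. The sketch below is purely illustrative (a bag-of-words counter stands in for a real embedding model, and a plain list stands in for Pinecone), but the two phases are the same:

```python
import math
import re
from collections import Counter

def embed(text):
    """Toy 'embedding': a bag-of-words count vector (a stand-in for a real
    embedding model such as OpenAI's)."""
    return Counter(re.findall(r"[a-z]+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

# Phase 1 - indexing: split the corpus into chunks, embed each, store the pairs
chunks = [
    "users are irrational and full of biases",
    "split documents into chunks before indexing",
    "embeddings are stored in a vector database",
]
index = [(chunk, embed(chunk)) for chunk in chunks]

# Phase 2 - querying: embed the question, retrieve the most similar chunk,
# then (in the real system) hand that chunk to the LLM as context
query = "why are users irrational?"
best_chunk, _ = max(index, key=lambda item: cosine(embed(query), item[1]))
print(best_chunk)  # -> "users are irrational and full of biases"
```

In the real pipeline below, `embed` becomes the OpenAI embeddings API, the list becomes a Pinecone index, and the retrieved chunk is passed to GPT as context rather than returned directly.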
Indexing text into a database
Querying my knowledge base
More technical please... care to share some code?
Absolutely. I am at your service.
I used Microsoft's Visual Studio Code and ran a Jupyter notebook inside it. You will first need to make sure you have installed packages like langchain and pypdf (or you will find out as you try to execute the code). The Python version used was 3.10.8.
First of all, some imports for PDF loaders and text splitters for the chunk generation part.
# PDF Loaders. If unstructured gives you a hard time, try PyPDFLoader
from langchain.document_loaders import UnstructuredPDFLoader, OnlinePDFLoader, PyPDFLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
import os
Then, the document loader.
# create a loader
from langchain.document_loaders import PyPDFDirectoryLoader
This part is only run once, when building a new index at Pinecone. I load the PDF files and check that the number of documents, etc. is correct.
##############################################
# this part only when building a new index. #
##############################################
loader = PyPDFDirectoryLoader("<path to the folder where your PDF documents reside>")
data = loader.load()
Now, based on a chunk size (you may want to play around with this number), the specific chunks of the documents are generated.
##############################################
# this part only when building a new index. #
##############################################
text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=0)
texts = text_splitter.split_documents(data)
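To see what chunk_size and chunk_overlap actually do, here is a deliberately naive sliding-window splitter. RecursiveCharacterTextSplitter is smarter (it prefers paragraph and sentence boundaries), but the size/overlap arithmetic is the same idea:

```python
def split_text(text, chunk_size, chunk_overlap=0):
    """Naive fixed-size splitter: each chunk starts chunk_size - chunk_overlap
    characters after the previous one, so consecutive chunks share
    chunk_overlap characters."""
    step = chunk_size - chunk_overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]

text = "a" * 25
print(split_text(text, chunk_size=10))                        # 3 chunks: 10 + 10 + 5 chars
print(len(split_text(text, chunk_size=10, chunk_overlap=2)))  # 4 chunks (step of 8)
```

A non-zero overlap helps when a relevant sentence straddles a chunk boundary, at the cost of some duplicated content in the index.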
OK, now on to building or using the Pinecone database.
# import libraries
from langchain.vectorstores import Pinecone
from langchain.embeddings.openai import OpenAIEmbeddings
import pinecone
For this tool, I use the OpenAI API and Pinecone. You will need to go to their sites and (1) for OpenAI, create an account and activate the payment info so that you can be charged for using the API; (2) for Pinecone, create an account; the free tier gives you one free database.
In both cases, you will need to find the API keys. For Pinecone, also the environment.
Then, you should have an .env file where you add this information in the format OPENAI_API_KEY=XXXXXXXXXXX. The following part of the code accesses this .env file and retrieves the value for each key.
from dotenv import load_dotenv
load_dotenv()
OPENAI_API_KEY = os.getenv('OPENAI_API_KEY')
PINECONE_API_KEY = os.getenv('PINECONE_API_KEY')
PINECONE_API_ENV = os.getenv('PINECONE_API_ENV')
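For reference, the .env file is just plain KEY=value lines, one per variable. The values below are placeholders; the Pinecone environment name depends on your project:

```
OPENAI_API_KEY=XXXXXXXXXXX
PINECONE_API_KEY=XXXXXXXXXXX
PINECONE_API_ENV=<your-pinecone-environment>
```

Keep this file out of version control (add it to .gitignore), since it holds your secrets.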
Now, I need to get ready to generate the embeddings from the text chunks we created. We use OpenAI's embeddings capabilities (yes, OpenAI is used for more than querying!).
# create embeddings
embeddings = OpenAIEmbeddings(openai_api_key=OPENAI_API_KEY)
Now, we initialize Pinecone. This is mandatory both when you build the database for the first time and when you access an existing one.
# initialize pinecone
pinecone.init(
    api_key=PINECONE_API_KEY,     # find at app.pinecone.io
    environment=PINECONE_API_ENV  # next to the API key in the console
)
If you're building the Pinecone index for the first time, this is your code. I used a relatively straightforward cosine vector-similarity approach, but you can select other options.
##############################################
# this part only when building a new index. #
##############################################
# create a pinecone index
pinecone.create_index("behpm-python-index", dimension=1536, metric="cosine")
Yes, I should've used a variable for this... but I'm lazy, remember.
index_name = "behpm-python-index" # put in the name of your pinecone index here
If still building from scratch, here is where the embeddings are inserted into Pinecone.
##############################################
# this part only when building a new index. #
##############################################
# Option A: This is used when the index is being built now.
docsearch = Pinecone.from_texts([t.page_content for t in texts], embeddings, index_name=index_name)
If, however, you already built it and want to access the existing database, this is the part you need to execute.
# Option B: This is used when the index already exists
docsearch = Pinecone.from_existing_index(index_name, embeddings)
OK. Everything is now ready to build the application itself. We start to need LangChain.
from langchain.document_loaders import PyPDFLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.vectorstores import Pinecone
from langchain.embeddings.openai import OpenAIEmbeddings
from langchain.llms import OpenAI
from langchain.chains.question_answering import load_qa_chain
import pinecone
from langchain.prompts.chat import (
    ChatPromptTemplate,
    SystemMessagePromptTemplate,
    AIMessagePromptTemplate,
    HumanMessagePromptTemplate,
)
from langchain.schema import AIMessage, HumanMessage, SystemMessage
The "query_qa_bot" function (mainly reused from the references I mentioned above) takes the input (the query, etc.) from the Gradio interface and sends it to the LLM in what LangChain calls an answering chain.
Notice I also added a System Message for GPT-4 asking it NOT TO ANSWER any question related to products or apps whose description may denote hate, racism, etc. While very basic and easily hackable, this can be improved into a genuinely useful approach to setting ethical limits on what we can or cannot do. More on that later.
# Build the answering chain: the "stuff" chain stuffs the retrieved
# documents into the prompt as context
chain = load_qa_chain(OpenAI(openai_api_key=OPENAI_API_KEY), chain_type="stuff")

def query_qa_bot(prompt):
    # Retrieve the chunks most similar to the query
    docs = docsearch.similarity_search(prompt)
    print(docs)
    # Query the documents and get the answer
    messages = [
        SystemMessage(
            content="You are an expert in product management, behavioural psychology and data. For every question that describes or mentions a specific use case, always try to find how to use the theory in your context with the specific case. BUT, if the app allows for hate, racism or any other bad behavior, all responses should start with 'Sorry, I cannot answer as the app you propose is despicable.'"
        ),
        HumanMessage(
            content=prompt
        )
    ]
    answer = chain.run(input_documents=docs, question=messages)
    #answer = chain.run(input_documents=docs, question=prompt)
    return answer
We can test the function from the console.
##############################################
# Console example #
##############################################
prompt = "We want to build an app named ReadMoreAndBetter, that helps people read more and better books. The app must have a chatbot that helps users select books they love, continue reading a little bit more, etc. Generate the experimentation plan according to the seven points described in the context, and how this plan would be done for this specific app."
answer = query_qa_bot(prompt)
print(answer)
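The system-message guard above lives entirely inside the prompt, so it is easy to jailbreak. A complementary, belt-and-braces idea is to pre-filter prompts in code before they ever reach the model. The function and keyword list below are my own illustrative sketch, not part of BehPMGPT; real moderation should rely on a dedicated moderation service:

```python
# Illustrative, not exhaustive: a real deny-list would be far larger and
# a keyword match is trivially evaded by rephrasing
BLOCKED_TERMS = {"hate", "racism", "harassment"}

def is_allowed(prompt):
    """Naive keyword pre-filter: reject prompts containing blocked terms."""
    words = set(prompt.lower().split())
    return words.isdisjoint(BLOCKED_TERMS)

print(is_allowed("an app that helps people read more books"))  # True
print(is_allowed("an app that spreads hate"))                  # False
```

Such a filter could wrap query_qa_bot so disallowed prompts are refused before spending any API tokens.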
All right! Almost ready. Now let's build the UI with Gradio.
import gradio as gr
iface = gr.Interface(fn=query_qa_bot,
                     inputs=gr.components.Textbox(lines=7, label="Oh traveler, what doubts do you have about the treacherous world of products centered on human beings rather than on users?"),
                     outputs="text",
                     title="BehPMGPT")
And launch it! The share=True argument enables a public URL you can use for free for 72 hours. Useful if you're giving a course around it.
iface.close()
iface.launch(share=True)
TL;DR - Does it work?
Yes. But handle with care.
I have tested it with more than 80 different prompts. The results are extremely good most of the time. As an assistant it has become incredibly useful to me, as long as I remember it is an assistant that can be too creative at times and too shy at others.
The code above should work in many different situations, so test it out. That's what I did with the references I mentioned above, tweaking it to my specific needs.
And have no fear! I used to be a professional programmer, but a looooong time ago. Don't hesitate to start!
Wait... what about the Behavioral part?
This post is mainly about how to build the AI assistant. So the behavioral part was more of a MacGuffin for this context.
But not for me. Behavioral psychology and, in general, trying to fully understand what the user wants or needs and providing the means for that, AND NOTHING ELSE, has become really important in how I approach PM.
I can recommend reading my upcoming book (in Spanish) about this topic. It's on my publisher's web site, Libros de Cabecera. There you will learn my current process of building behaviorally-powered products.
But there are some other incredibly good books on the topic. Here you have around 70-75% of the ones I would currently recommend. More soon!
Allow me to get a little more serious now
You can build an AI assistant on behavioral knowledge.
You can read all those books, attend great courses, and be THE ONE who understands how the human mind works.
But beware.
Learning about cognitive biases is great. However, cognitive biases are... biased.
You can build an AI system like the one above and trust everything it says. But remember, humans built the algorithm... that built the algorithm. You cannot be 100% sure of how the AI system is processing the information provided.
Humans used to build the algorithm. Now we build the algorithm that builds the algorithm. Everything changes.
And, most importantly...
You, as a Product Manager/Designer/Whatever, have a responsibility to your users and customers.
And, moreover, to society.
Please, let's not forget it. And with that, let's use BehPMGPT to make it faster, at more scale... and BETTER.