Using ChatGPT to Explore Claims Databases
Incorporating ChatGPT into products and workflows has become almost indispensable. Of course, the irony is that the same technology enhancing our capabilities may, in the future, automate certain aspects of our roles. In the spirit of this paradox, I present an agent powered by ChatGPT for the analysis of medical claims data.
20 Years of Medicare Part B Claims Data
Each year, the Centers for Medicare & Medicaid Services (CMS) releases aggregate data detailing the procedures performed by healthcare providers and the corresponding reimbursements. This trove of information arrives in the form of Excel or CSV files with varying formats. For analytical purposes, I've diligently converted and stored these datasets in a comprehensive SQL database, augmented by metadata for each code. This database powers a dashboard that offers high-level insights. This month marks a significant milestone, as the database now encapsulates two decades' worth of data.
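The actual ingestion pipeline isn't shown in this post, but the idea can be sketched. The sketch below is hypothetical: it uses SQLite instead of the real MySQL database so it is self-contained, and it assumes a yearly CSV whose headers already match the CMSB column names, which in practice they don't — each year's file needs its own column mapping.

```python
import csv
import sqlite3

def load_year(conn, year, csv_path):
    """Load one year's CMS aggregate file into the CMSB table (sketch).

    Assumes the CSV headers match the CMSB columns; real CMS files vary
    by year and need per-year column mapping before this step.
    """
    conn.execute(
        """CREATE TABLE IF NOT EXISTS CMSB (
               year INT, hcpcs_code VARCHAR(8), modifier VARCHAR(15),
               no_of_services INT, allowable_payments INT,
               total_payments INT)"""
    )
    with open(csv_path, newline="") as f:
        rows = [
            (year, r["hcpcs_code"], r.get("modifier", ""),
             int(r["no_of_services"]), int(r["allowable_payments"]),
             int(r["total_payments"]))
            for r in csv.DictReader(f)
        ]
    conn.executemany("INSERT INTO CMSB VALUES (?, ?, ?, ?, ?, ?)", rows)
    conn.commit()
```

The year is stored on every row rather than in per-year tables, which is what lets twenty years of files answer trend questions with a single GROUP BY.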
Sometimes I get a question the dashboard can't answer, and then I have to query the tables directly. For obvious security reasons, I haven't opened up SQL queries to users of the dashboard. So, to make the data usable for people who don't think in SQL or don't know how I have structured my tables, while keeping things flexible enough to accept any query, I tried out OpenAI's ChatGPT API. Here's what I am learning.
Building an Agent for Analyzing Medical Claims Data
To keep it simple, I created a command-line interface where I ask a question and get an answer. I am using Python with the OpenAI packages and LlamaIndex's SQLDatabase service to query structured data. For this, I am using GPT-4, and I can say categorically that GPT-3.5 did not produce usable results. For the curious, I'll post the basic code below.
First, as a use case, you can see how this works from the command line: I ask a question in natural language, and behind the scenes it is turned into SQL that can reasonably be executed against the tables I've given ChatGPT access to.
OpenAI is not ingesting my data in any way, nor is it processing the data directly. It simply generates SQL, which is then executed on my own server. The code I wrote also returns the SQL that ChatGPT comes up with, so I can evaluate whether its approach to the data makes sense.
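Surfacing the generated SQL is just a matter of reading the response metadata. In the LlamaIndex versions I've used, NLSQLTableQueryEngine records the statement under a "sql_query" key, though that key name is an implementation detail that may vary across versions. A small helper makes the lookup defensive:

```python
def extract_sql(metadata):
    """Return the generated SQL from a query response's metadata dict.

    Assumes NLSQLTableQueryEngine's "sql_query" key; falls back to a
    placeholder so a missing key doesn't crash the review step.
    """
    return metadata.get("sql_query", "<no SQL recorded>")
```

Printing this alongside every answer is what makes the tool auditable: a plausible-sounding answer built on a nonsensical join is easy to catch once you can read the join.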
Looking at the Results
A few observations from my initial alpha:
Here are examples of questions that got good answers:
Technical Background
I'm sure you are curious how this is put together. So here's a simplified version of the code that I'm using.
import os

import openai
from sqlalchemy import create_engine
from llama_index import ServiceContext, SQLDatabase
from llama_index.indices.struct_store.sql_query import NLSQLTableQueryEngine
from llama_index.llms import OpenAI

os.environ['OPENAI_API_KEY'] = '<open-ai-key>'
openai.api_key = os.environ.get('OPENAI_API_KEY')

# Connection string for the MySQL database holding the claims tables
connection_uri = f"mysql+pymysql://{db_user}:{db_password}@{db_host}/{db_name}"
engine = create_engine(connection_uri)

# GPT-4 via LlamaIndex; GPT-3.5 did not produce usable results
llm = OpenAI(temperature=0.5, model="gpt-4")
query_engine = NLSQLTableQueryEngine(
    sql_database=SQLDatabase(engine),
    tables=["CMSB", "hcpcs_codes"],
    synthesize_response=True,
    service_context=ServiceContext.from_defaults(llm=llm),
)

try:
    response = query_engine.query(question)
    print("\033[91mANSWER:\n\033[0m" + response.response + "\n")
    print("\033[91mSQL:\n\033[0m" + str(response.metadata) + "\n")
except Exception as ex:
    print("Error " + str(ex))
In terms of the database, here is the schema. There are quite a few more tables for finding provider-level data, but in the interest of incremental gains, I pointed ChatGPT only at the aggregate data:
mysql> desc CMSB;
+--------------------+-------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+--------------------+-------------+------+-----+---------+-------+
| year | int | NO | MUL | NULL | |
| hcpcs_code | varchar(8) | YES | MUL | NULL | |
| modifier | varchar(15) | NO | | NULL | |
| no_of_services | int | YES | | NULL | |
| allowable_payments | int | YES | | NULL | |
| total_payments | int | YES | | NULL | |
+--------------------+-------------+------+-----+---------+-------+
6 rows in set (0.00 sec)
mysql> desc hcpcs_codes;
+---------------+--------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+---------------+--------------+------+-----+---------+-------+
| hcpcs | varchar(8) | YES | | NULL | |
| description | varchar(250) | YES | | NULL | |
| drug_ind | char(1) | YES | | NULL | |
| activity_type | varchar(45) | YES | | NULL | |
| patient_type | varchar(45) | YES | | NULL | |
+---------------+--------------+------+-----+---------+-------+
5 rows in set (0.01 sec)
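Given this schema, most questions require joining the two tables on the code. Here is a hand-written example of the kind of SQL the model has to produce (my own illustration, not actual model output), run against an in-memory SQLite copy of the schema with made-up sample rows:

```python
import sqlite3

# In-memory SQLite stand-in for the MySQL schema, with fabricated
# sample rows purely for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE CMSB (year INT, hcpcs_code VARCHAR(8), modifier VARCHAR(15),
                   no_of_services INT, allowable_payments INT,
                   total_payments INT);
CREATE TABLE hcpcs_codes (hcpcs VARCHAR(8), description VARCHAR(250),
                          drug_ind CHAR(1), activity_type VARCHAR(45),
                          patient_type VARCHAR(45));
INSERT INTO CMSB VALUES (2022, '99213', '', 100, 5000, 4000),
                        (2023, '99213', '', 120, 6000, 4800);
INSERT INTO hcpcs_codes VALUES ('99213', 'Office visit, established patient',
                                'N', 'E&M', 'Outpatient');
""")

# The join a question like "How many office visits were performed each
# year?" should translate to: match codes, filter on the description,
# and aggregate services by year.
sql = """
SELECT c.year, h.description, SUM(c.no_of_services) AS services
FROM CMSB c JOIN hcpcs_codes h ON c.hcpcs_code = h.hcpcs
WHERE h.description LIKE '%Office visit%'
GROUP BY c.year, h.description
ORDER BY c.year
"""
for row in conn.execute(sql):
    print(row)
```

The join key is the one subtlety: the code column is named hcpcs_code in CMSB but hcpcs in hcpcs_codes, and getting the model to bridge that mismatch reliably is exactly where the schema exposure matters.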
One thing to consider is adding some preamble context. I did try using SQLTableRetrieverQueryEngine instead of NLSQLTableQueryEngine, but that increased query time substantially without any gain in accuracy.
I also added some preamble directly to the question, e.g. "You are a data analyst providing information on medical claims using two tables, ..." and this might have made some improvements, but it was hard to tell.
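The preamble pattern is just string concatenation before the call to query(). A minimal sketch, where the full preamble wording is my own guess at a continuation of the fragment quoted above:

```python
def with_preamble(question):
    """Prepend analyst context to the user's question before querying.

    The preamble wording here is hypothetical; only the opening phrase
    and the two table names come from the original setup.
    """
    preamble = (
        "You are a data analyst providing information on medical claims "
        "using two tables, CMSB and hcpcs_codes. "
    )
    return preamble + question

# The result is what gets passed to query_engine.query(...)
```

Because the preamble rides along inside the question string, it needs no engine configuration, which makes it easy to A/B test different wordings against the same set of questions.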
Overall, this is one of many experiments I am doing with GPT-4 and both LangChain and LlamaIndex. My results with GPT-3.5 a few months ago were lackluster, particularly because I couldn't convey my instinct for specific clinical contexts.
I have kept my dashboard unchanged for now, but ...
What do you think? Is chat the right kind of interface for getting insights from medical claims databases?
*Note that CPT codes are copyright 2023 by the American Medical Association and a license is required to sell or distribute codes.