Example: Using LLMs to Classify Email Bounces

This weekend I've saved myself significant work by using a local LLM to classify rejected email for a non-profit I work with.

The Background

In addition to my work with OptiML AI and Sportslytx, I also volunteer time for non-profits. One non-profit I have worked with for many years (let's say its domain is @example-non-profit.org) provides members with email addresses like [email protected] or [email protected]. These addresses can be FORWARDED to a member's main email, such as [email protected] or [email protected], if they don't want to check mail on our servers.

The Problem

We've provided this forwarding service for more than 25 years, but we haven't culled the list of bad addresses in a long time. A significant number of members have forwarding addresses that no longer work, and mail sent to those addresses comes back as rejection messages, commonly called BOUNCES.

  • We recently sent a message to many thousands of members and got over 2,500 bounces. That's a lot of email to read and process! No one was looking forward to that.
  • A bounce message typically says something like 'Your message could not be delivered to one or more recipients.' and gives a technical reason. These reasons - and their wording - vary for each message.
  • Even though these messages are not uniform or consistent across domains, a human can read one and tell whether the rejection is PERMANENT or TEMPORARY.
  • Each bounce needs to be processed by issuing an UPDATE command to our SQL database using information found in the rejection message.

LLMs can process text like humans - let's take advantage of that.

The Solution

We will take advantage of an LLM's ability to read like a human and let it do much of the processing for us! But we need to do this programmatically, and without exposing any personally identifiable information to public LLMs.

I wanted to keep all the data private, so I tested out local LLMs using LM Studio v0.3.6. You can download it for free at https://lmstudio.ai/. Note that as of January 4th, 2025, v0.3.6 is so new that my antivirus blocked it because it didn't have enough downloads yet!

I downloaded and tried several models, choosing qwen2.5-14b-instruct for this project. This model is an 8.99GB download that was last modified on November 27, 2024. It supports a 128k-token context (via YaRN RoPE scaling) and 8k-token generation, and is "trained for tool use" - something I will try soon.

Using the chat feature of LM Studio, I loaded the qwen2.5-14b-instruct model and tested various prompts. I settled on the following system prompt and inserted various rejection messages in the {MESSAGE_BODY} part.

Prompt used for each message, inserting attributes from each rejection message into {MESSAGE_SUBJECT}, {MESSAGE_DATE}, and {MESSAGE_BODY}.
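
A rough sketch of such a prompt is shown below. This is not the exact text I used - it is a paraphrase built around the JSON fields described later in this article, and the real prompt would also need the table and column names for the UPDATE statement:

You are an email administrator reviewing bounce (rejection) messages for the
domain example-non-profit.org. Read the message below and decide whether the
rejection is PERMANENT or TEMPORARY. Respond with ONLY a JSON object containing
these fields: SUBJECT, DATE, USERID, TO, ORIGINAL_TO, DETERMINATION (PERMANENT
or TEMPORARY), REJECTION_REASON, EXPLANATION, and SQL_COMMAND (an UPDATE
statement to disable forwarding for this member if the rejection is PERMANENT,
otherwise null).

SUBJECT: {MESSAGE_SUBJECT}
DATE: {MESSAGE_DATE}
MESSAGE BODY:
{MESSAGE_BODY}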

A typical bounce message will have technical information like what is shown below - sort of cryptic to a layman, but good information for those who have experience with email systems:

Typical rejection message in SUBJECT, DATE, and MESSAGE_BODY format at the end of the prompt.
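
For illustration only - the addresses below are made up, not from our actual data - a bounce of this kind typically reads something like:

SUBJECT: Undelivered Mail Returned to Sender
DATE: Sat, 04 Jan 2025 09:12:33 -0500
MESSAGE BODY:
This is the mail system at host mail.example-isp.com.

I'm sorry to have to inform you that your message could not
be delivered to one or more recipients. It's attached below.

<jadams@example-isp.com>: host mx.example-isp.com[192.0.2.25] said:
    550 5.1.1 <jadams@example-isp.com>: Recipient address rejected:
    User unknown in virtual mailbox table (in reply to RCPT TO command)

Original-To: jadams@example-non-profit.org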

The result from the LLM for the above bounce would be JSON like this:

JSON produced by the LLM after processing the prompt above.
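
Using the same made-up addresses from the example above (and a hypothetical email_forwards table name), that JSON would look something like:

{
  "SUBJECT": "Undelivered Mail Returned to Sender",
  "DATE": "Sat, 04 Jan 2025 09:12:33 -0500",
  "USERID": "jadams",
  "TO": "jadams@example-isp.com",
  "ORIGINAL_TO": "jadams@example-non-profit.org",
  "DETERMINATION": "PERMANENT",
  "REJECTION_REASON": "550 5.1.1 Recipient address rejected: User unknown",
  "EXPLANATION": "The receiving server reported that the mailbox does not exist, which is a permanent failure.",
  "SQL_COMMAND": "UPDATE email_forwards SET active = 0 WHERE userid = 'jadams' AND forward_to = 'jadams@example-isp.com';"
}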

The processing took about 5 seconds locally on my PC (Windows 11 Pro, i9-14900KF, 64GB RAM, NVIDIA GeForce RTX 4090).

At roughly 5 seconds per message, the LLM can process all 2,500+ bounces in about 3.5 hours (2,500 × 5 seconds ≈ 12,500 seconds), whereas a human reading them individually and extracting the information would take at least 4x as long. I'll take a computer doing the work over me spending 14 hours on a repetitive task any day!

Note that this JSON has everything we need to make a decision later:

  • SUBJECT and DATE of the message so that we can confirm it was a rejection (some other messages that are not bounces come to this mailbox too).
  • USERID, TO, and ORIGINAL_TO so that we can confirm the email addresses and user ID in our system.
  • DETERMINATION, REJECTION_REASON, and EXPLANATION so that we can better understand what the LLM thought about this message and why it made the decision.
  • SQL_COMMAND so that we can run this command later if everything looks good.

Doing it Programmatically with Python

Once I had this satisfactory prompt, I needed to automate the processing of 2,500+ messages via Python.

First, I started the local API endpoint in LM Studio for qwen2.5-14b-instruct - it is literally as easy as pushing a button.

I got some assistance from ChatGPT 4o and created the following loop in Python (a rough sketch of the script follows the list):

  1. Make a SQL query to the database that holds all the bounced email messages.
  2. For each message, use functions from Python's email library to parse out the MESSAGE BODY.
  3. Call the API for each message (system prompt + MESSAGE BODY).
  4. Append the resulting JSON to a file that will be used later.
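
Here is a rough sketch of that loop. It assumes LM Studio's OpenAI-compatible server at its default http://localhost:1234/v1, and the database, table, and column names are made up for illustration - the real script differs in those details:

import json
import sqlite3  # stand-in for illustration; the real database is not SQLite
from email import message_from_string
from email.policy import default

import requests

API_URL = "http://localhost:1234/v1/chat/completions"  # LM Studio's local OpenAI-compatible endpoint
MODEL = "qwen2.5-14b-instruct"
SYSTEM_PROMPT = "...the classification prompt described above..."

def get_body(raw_message: str) -> str:
    """Parse a raw email message and return its plain-text body."""
    msg = message_from_string(raw_message, policy=default)
    part = msg.get_body(preferencelist=("plain",))
    return part.get_content() if part else ""

def classify(subject: str, date: str, body: str) -> dict:
    """Send one bounce message to the local LLM and return its JSON verdict."""
    resp = requests.post(API_URL, json={
        "model": MODEL,
        "messages": [
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": f"SUBJECT: {subject}\nDATE: {date}\nMESSAGE BODY:\n{body}"},
        ],
        "temperature": 0,
    }, timeout=120)
    resp.raise_for_status()
    return json.loads(resp.json()["choices"][0]["message"]["content"])

# 1. Query the database that holds all the bounced email messages.
conn = sqlite3.connect("bounces.db")
rows = conn.execute("SELECT id, subject, date, raw_message FROM bounced_messages").fetchall()

# 2-4. Parse each message, call the local API, and append the resulting JSON to a file.
with open("bounce_classifications.jsonl", "a", encoding="utf-8") as out:
    for msg_id, subject, date, raw_message in rows:
        result = classify(subject, date, get_body(raw_message))
        result["MESSAGE_ID"] = msg_id
        out.write(json.dumps(result) + "\n")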

Still some errors every once in a while, so I'm not going fully Agentic yet...

There are still a few cases of a mistaken TO or USERID, so I'm gathering the results as JSON and will process the SQL updates programmatically with Python later.
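
Once that review is done, applying the updates is straightforward. A minimal sketch, assuming the same hypothetical database and the JSONL file written above:

import json
import sqlite3  # stand-in for illustration; the real database is not SQLite

conn = sqlite3.connect("bounces.db")

# Run the SQL_COMMAND from each reviewed classification, skipping temporary failures.
with open("bounce_classifications.jsonl", encoding="utf-8") as f:
    for line in f:
        record = json.loads(line)
        if record.get("DETERMINATION") == "PERMANENT" and record.get("SQL_COMMAND"):
            conn.execute(record["SQL_COMMAND"])

conn.commit()
conn.close()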

The next step will be to make it agentic and let it modify the database itself!