Case Study: Explorations with my “private” ChatGPT client
Many organizations are taking a “wait-and-see” approach to cloud-based generative AI services such as OpenAI’s ChatGPT, and some are banning these tools outright. Just last week Apple announced it would completely ban the use of ChatGPT and similar cloud-based generative AI services. Ostensibly this was a preemptive step to protect “data confidentiality,” but who really knows?
And to be fair, there are currently no assurances that sensitive or proprietary query data will not leak, now or in the future, into logs or the training sets of future model iterations.
RECAP: So on the one hand you have a toolset that is simply too compelling to ignore. And on the other hand you have companies reflexively imposing bans on that toolset.
Where is the optimal path forward here?
Q: What if organizations could run ChatGPT “locally,” just like any other proprietary in-house tool? Could it work? If so, HOW would it work?
That question is the basis for what follows.
PROBLEM STATEMENT:
Can “ChatGPT” (or a close analog) be configured to run locally? And if so, would it be robust and performant enough to provide the same rich user experience as its cloud-based variant?
USE CASE:
Evaluate how a local instance of ChatGPT performs as a “Q&A” knowledge-bot.
BACKGROUND:
An issue some large (and some small) enterprises struggle with is the fragmentation of knowledge across the organization. So-called “tribal knowledge” can reside in isolated pockets or in the heads of a few select specialists.
Delivery cadence can be affected by how efficiently (or inefficiently) individuals can access that knowledge in time-critical moments.
As a result, the business may depend on processes that are not well understood but are nonetheless critical to delivery. And so “sherpas” (aka specialists) arise to provide guidance and facilitate task completion.
What if a “smart Q&A” agent could be written which effectively performs this function?
METHODOLOGY:
I created a “private” LLM setup, running locally, that does not exchange data of any sort with any online resources.
The LLM I chose to employ is called “GPT4All.” Other tools used were LangChain, LlamaCpp, Chroma, and SentenceTransformers.
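These pieces fit together in the standard retrieval-augmented generation (RAG) pattern. The stdlib-only sketch below stubs out each stage so the shape of the loop is visible; in the actual setup, the embedding function is a SentenceTransformers model, the chunk store is Chroma, and the generator is GPT4All driven via LlamaCpp and LangChain. The stub logic and sample chunks here are purely illustrative.

```python
# Stdlib-only sketch of the private Q&A loop (RAG pattern).
# Each stub stands in for a real component of the local stack.

def embed(text):
    # Stub embedding: bag of words. The real pipeline uses a
    # SentenceTransformers model here.
    return set(text.lower().split())

def retrieve(query, chunks, k=2):
    # Rank stored document chunks by overlap with the query.
    # The real pipeline queries a Chroma vector store.
    q = embed(query)
    ranked = sorted(chunks, key=lambda c: len(q & embed(c)), reverse=True)
    return ranked[:k]

def generate(prompt):
    # Stub LLM: echo the prompt. The real pipeline calls GPT4All
    # (via LlamaCpp) with this assembled prompt.
    return prompt

def answer(query, chunks):
    # Assemble retrieved context plus the question into one prompt.
    context = "\n".join(retrieve(query, chunks))
    return generate(f"Context:\n{context}\n\nQuestion: {query}\nAnswer:")

chunks = [
    "Workflows in ServiceNow automate multi-step processes.",
    "Incident records track unplanned service interruptions.",
]
print(answer("How do workflows automate processes?", chunks))
```

Because every stage runs in-process, nothing in this loop ever touches the network, which is the entire point of the exercise.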
OBJECTIVE & GOAL:
My goal was to see if I could create a specialized knowledge-base LLM able to serve as the kind of “smart Q&A” agent described above.
TRAINING DATA:
I trained my LLM on a set of publicly available ServiceNow documents.
All told, I fed it close to 10,000 pages of documentation. These documents were vectorized into a set of embeddings, which were made queryable via a basic text UI.
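Before embedding, the pages have to be split into overlapping chunks so that each embedding covers a coherent slice of text. A minimal sketch of that chunking step follows; the chunk size and overlap values are illustrative, not the ones actually used.

```python
# Fixed-size chunking with overlap, as done before embedding.
# In the real pipeline each chunk is then embedded with
# SentenceTransformers and stored in Chroma.

def chunk_text(text, size=200, overlap=50):
    # Slide a window of `size` characters, stepping by
    # (size - overlap) so adjacent chunks share context.
    chunks = []
    step = size - overlap
    for start in range(0, len(text), step):
        piece = text[start:start + size]
        if piece:
            chunks.append(piece)
        if start + size >= len(text):
            break
    return chunks
```

The overlap matters: without it, an answer that straddles a chunk boundary may never be retrieved intact.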
RESULTS:
First-pass results were definitely underwhelming.
While the model had an impressive ability to make sense of natural-language queries and to respond with answers that bore some relevance to the original query, it was only partially able to make sense of the documents it was trained on.
TEST CRITERIA:
You can view the test prompts and the corresponding output on this Google Sheet.
NEXT STEPS:
By performing some targeted optimizations, I believe performance can be radically improved. Finding and applying those optimizations is the subject of a future evaluation, but generally speaking there are a few specific areas I will likely explore.
It’s all so reminiscent of search, in a way. The difference is that the end result is not a URL but a collection of relevant, distilled text.
Performance is also a factor: every query takes a minimum of 27 seconds to run locally on a multi-core M1 Max.
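Numbers like that are worth tracking systematically rather than by feel. A trivial timing wrapper is enough; the query function here is an illustrative stand-in for the real local chain.

```python
import time

def timed_query(run_query, question):
    # Run a query function and report wall-clock latency in seconds.
    start = time.perf_counter()
    result = run_query(question)
    elapsed = time.perf_counter() - start
    return result, elapsed

# Stub in place of the real local GPT4All chain:
answer, secs = timed_query(lambda q: f"(answer to: {q})", "What is a CMDB?")
print(f"{secs:.1f}s -> {answer}")
```

Logging latency per query makes it easy to see whether a given optimization (smaller model, fewer retrieved chunks, different quantization) actually moves the needle.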
CONCLUSION:
I see tremendous promise for tools like GPT4All. However, it is going to take considerable work to customize and tune these models to boost predictability and accuracy so that they might become de facto “subject matter experts.” That will be the subject of future posts.