Building Prism: How We Developed Our Domain-Specific Language with Patrick Lee

Meet Patrick Lee, who developed Clockwise Query Language—our domain-specific language that powers our LLM experience—and taught himself the latest AI techniques as he helped build Prism.

Patrick Lee, an economics and math major turned self-taught engineer, loves building things people will use and find helpful. In his previous role as an engineering manager, he used Clockwise with his team. When he began looking for a new role, an open position at Clockwise aligned with what he wanted, and as he put it, “I jumped at the chance because I thought Clockwise was a cool product.”

I recently sat down with Patrick to learn more about how he and the team researched DSLs and how the development of our own—Clockwise Query Language—enabled us to deliver a fast, nearly magical, executive assistant-like experience for our users.

The Early State of Prism

When Patrick joined Clockwise in the spring of 2023, Prism was nascent. In the very early stages of our internal build, a user would submit a scheduling prompt, and we would make ten requests to GPT-4 to classify it. Then, we’d synthesize the results and feed them into our scheduling engine to provide Clockwise users with scheduling suggestions.
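As a rough sketch of that early fan-out approach (the `call_gpt4` stub and the classification labels here are invented for illustration, not Clockwise’s actual API or taxonomy), it looked something like this:

```python
import collections

def call_gpt4(prompt: str, seed: int) -> str:
    # Placeholder for a real GPT-4 request; returns a canned label so the
    # sketch runs without network access. Real calls would vary per request.
    canned = ["reschedule", "reschedule", "cancel", "reschedule", "create",
              "reschedule", "cancel", "reschedule", "reschedule", "reschedule"]
    return canned[seed % len(canned)]

def classify_request(prompt: str, n_calls: int = 10) -> str:
    # Fire n_calls classifications for the same prompt, then synthesize
    # the answers by majority vote.
    votes = collections.Counter(call_gpt4(prompt, i) for i in range(n_calls))
    label, _ = votes.most_common(1)[0]
    return label

print(classify_request("move my 1:1 with Sam to Friday"))
```

Ten round-trips per request is exactly why this design was costly and slow: latency is dominated by the slowest of the ten calls, and the cost scales linearly with the fan-out.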

It was the halcyon days of the naive “GPT wrapper,” and our spaghetti-on-the-wall approach didn’t provide the results we sought. It was costly, error-prone, and very slow. The process took around 30 seconds (and up to 60 seconds on the high side), a latency far too high for the type of user experience we intended to provide.

The Discovery of Domain-Specific Languages and Building Clockwise Query Language (CQL)

With an understanding of our current approach and the desire to find a better way, Patrick and the team began researching other options. They read whitepapers, engaged with the latest Large Language Model (LLM) research, and absorbed new ideas for us to experiment with.

While researching, they found a paper (Grammar Prompting for Domain-Specific Language Generation with Large Language Models) that unlocked a missing ingredient: translating a user’s intent into a plan that could then be executed in Clockwise’s systems by our scheduling engine.

Large Language Models generally perform well at generating programming-language output because they are trained on vast quantities of code and documentation. The idea was that if we could go from user utterances to something like programming-language output representing their intent, we could drastically improve our results. So our team began developing our own domain-specific language, which we named Clockwise Query Language (CQL). Based on what the paper suggested, we hoped we could prompt the LLM with the information needed to translate a request into CQL quickly.

Prompting GPT-4 on CQL

Since CQL does not exist on the open web, we provided GPT-4 with a giant, book-like prompt with all of the necessary information and CQL context to receive a CQL output we could then input into our scheduling engine.

Patrick described the prompting effort as extensive: “Essentially, you have to feed it the language definition, what each property and action does, and examples of how to use it in context to help it understand what kind of user utterances should map to specific CQL sequences.”
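As a purely hypothetical illustration of that “book-like” prompt structure — the grammar, example utterances, and CQL strings below are invented; the real language definition is internal to Clockwise — such a prompt might be assembled like this:

```python
# Invented mini-grammar standing in for the real CQL definition.
GRAMMAR = '''query    := action filter*
action   := "SCHEDULE" | "MOVE" | "CANCEL"
filter   := "WITH" person | "ON" day | "DURATION" minutes'''

# Invented worked examples mapping utterances to CQL, as the quote describes.
FEW_SHOT = [
    ("Set up 30 minutes with Dana tomorrow",
     'SCHEDULE WITH "Dana" ON tomorrow DURATION 30'),
    ("Cancel my Friday sync", "CANCEL ON friday"),
]

def build_prompt(utterance: str) -> str:
    # Concatenate language definition + examples + the user's request,
    # ending at "CQL:" so the model completes with a CQL sequence.
    parts = ["You translate scheduling requests into CQL.",
             "Language definition:", GRAMMAR,
             "Examples of utterances and their CQL:"]
    for user, cql in FEW_SHOT:
        parts.append(f"User: {user}\nCQL: {cql}")
    parts.append(f"User: {utterance}\nCQL:")
    return "\n\n".join(parts)

print(build_prompt("Move my 1:1 with Sam to Friday"))
```

With a real language the definition and property/action descriptions run much longer, which is what makes the prompt “book-like” — and expensive to send on every request.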

Despite the effort, the development of CQL allowed us to condense our ten-request approach into a single prompt while providing GPT-4 with the context needed to translate a request into CQL. This approach was much faster and more reliable than our previous strategy, eventually achieving an average response time of 8-10 seconds.

We inched closer to the results we were striving for, but we knew there had to be a way to reduce the latency of processing requests further.

Creating Synthetic Data and Fine-Tuning Our Own Model

While GPT-4 with a DSL gave us a solid start, it wasn’t fast enough, so we experimented with training our own model and evaluated a range of hosted and open-weight models in the ecosystem, including GPT-4 Turbo, GPT-4o, Mistral 7B, Mistral Large, and Claude 2. We ultimately landed on Meta’s Llama 3 8B.

To train our model, we needed to switch from the massive “book-like” prompting technique we used with GPT-4 and instead create a viable training dataset to fine-tune the model. In practice, this means thousands of examples of prompts and the resulting CQL. Previously, Patrick handwrote CQL examples, but unsurprisingly, writing thousands of permutations by hand wasn’t scalable.

Instead, he created a pipeline to generate synthetic mappings between user prompts and CQL. The pipeline worked as follows:

  • Sourcing hundreds of prompts internally from Clockwise employees
  • Using a model to translate each sourced prompt into many permutations and the resulting CQL for those permutations
  • Validating the output of the LLM
  • Using the validated output to fine-tune Llama 3 8B
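The pipeline steps above can be sketched as follows. This is a minimal, hypothetical sketch: the permutation generator, CQL translator, and validator here are stand-ins (in the real pipeline those are LLM calls and a check against the actual CQL grammar), and the CQL strings are invented.

```python
import json

def permute(seed_prompt: str) -> list[str]:
    # Stand-in for LLM-generated rephrasings of one sourced prompt.
    return [seed_prompt, seed_prompt.replace("30 minutes", "half an hour")]

def to_cql(prompt: str) -> str:
    # Stand-in for the LLM translation from utterance to CQL.
    return 'SCHEDULE WITH "Dana" DURATION 30'

def is_valid_cql(cql: str) -> bool:
    # Stand-in for validating model output, e.g. parsing it against
    # the CQL grammar and rejecting anything malformed.
    return cql.startswith(("SCHEDULE", "MOVE", "CANCEL"))

def build_dataset(seed_prompts: list[str]) -> list[dict]:
    # Expand every sourced prompt into permutations, translate each to
    # CQL, and keep only validated pairs as fine-tuning examples.
    return [
        {"prompt": p, "completion": to_cql(p)}
        for seed in seed_prompts
        for p in permute(seed)
        if is_valid_cql(to_cql(p))
    ]

for example in build_dataset(["Book 30 minutes with Dana"]):
    print(json.dumps(example))  # one JSON line per training example
```

The validation step is the crucial filter: because synthetic pairs come from a model, anything the grammar rejects is dropped before it can contaminate the fine-tuning set.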

The result of our fine-tuned model is that we can go from prompt utterance to CQL in under one second (the average latency is currently at 0.97 s).

One of the main benefits of this approach is that we never train on user data, because we’ve created synthetic data for training purposes. And Clockwise users don’t have to worry about privacy, as we train and run our model on our own infrastructure. The net result of this training approach is that users can treat us like their executive assistant without worrying about their privacy, and we can provide almost magically fast responses to a wide array of scheduling requests.

As more people use Prism, we are excited to continue driving improvements to our model and adding functionality to the experience. We hope this sheds light on the power of DSLs and how we’ve prioritized a fast and accurate user experience for Clockwise users’ calendaring needs.

In our next interview, Andy DuFrene will explain “multi-hop”—our powerful “fixable conflict” algorithm for scheduling the impossible.
