Selecting the right way to start working with LLMs in your business
Benjamin Weiss
Product Management Leadership. Helping companies grow and transform using Digital and AI solutions.
As the world has started to see the immense potential of large language models, and businesses in particular have started imagining putting them to use, I’ve seen a common mistake play out over and over again: assuming that the only way to leverage a large language model is to go out and train your own foundational model. This couldn’t be further from the truth, and I’ll make the argument that hardly any businesses should be investing their time in developing foundational models (and I’ll explain what those are, for those unfamiliar!).
There are at least five major ways to interface with and start working with LLMs today, each well suited for a variety of tasks and use cases. I’m going to step through each of those five, going from the most basic (and cheap) to the most advanced (and costly). Some of these interface methods do not require you to have even a single staff data scientist, and some don’t require even a single developer! In other words, there’s something for everyone here.
What I hope you’ll quickly see is that advanced methods are not necessarily “better” approaches — they’re just suited for specific use cases and require particular resources (people and dollars) to develop and use.
So let’s dive right in.
Method #1: ChatGPT / Bard / Bing (Easiest)
#1 is, by far, the most common way that people are interfacing with LLMs, like GPT-3.5, GPT-4, and PaLM 2 today. In fact, you’re probably doing it too right now. What I’m talking about is using a third party (and often publicly available) application to converse with a large language model. Some of these tools are free (like Google’s Bard), and some have paid tiers (like OpenAI’s ChatGPT Plus) with access to more features and models.
These tools are a powerful way to support use cases like authoring blog posts, memos, documents, contracts, and that’s only scratching the surface. A quick copy/paste operation from the model’s output may be all that you need to start publishing content in your favorite CMS or send an email in Outlook, for example (I’m sure you’ve done it already!). And sure, some CMSes and email clients will start to natively incorporate LLMs into their applications in the coming months and years, but there is absolutely nothing wrong with using ChatGPT, Bard, or Bing directly to start working with LLMs today.
Writing code is another case where this approach works splendidly. These tools will format code for easy copy/paste right into your preferred IDE. Tools like GitHub’s Copilot and Copilot X will make this more seamless, but if your IDE doesn’t have support just yet, this is still a great way to start working with LLMs.
Good use cases:
-Generating text documents
-Answering simple and complex questions
-Translating languages
-Writing code
What you’ll need to get started:
The beauty of this method is that all it takes is you, a computer, and an internet connection. Paid tiers for services like ChatGPT run about $20/month while others, like Google’s Bard, are completely free (for now, anyway). You don’t need to hire a data scientist or even a developer. Start learning how to write great LLM prompts to get the best possible inferences; there are tons of great resources to help you get started. The barrier to entry here is close to zero!
Method #2: Plug-ins / Tools
In recent weeks, we’ve seen OpenAI and Microsoft introduce a framework for building plugins for ChatGPT and Bing AI. Plug-ins are small tools that enable the model to perform tasks that extend beyond the knowledge and skills that were a part of the original training corpus. For example, ChatGPT’s web browsing plugin allows the model to run searches on the web to retrieve up-to-date information that’s much more recent and accurate than its original training data (which only goes up to September 2021).
Now that third parties can start writing their own LLM plugins, we have a powerful new interface available to start working with LLMs. The experience here is still going to be much like the first method above, where you’ll be writing prompts directly inside a tool like ChatGPT or Bing. But the difference here is that you’ll be able to feed the model more specific information and capabilities regarding your business that it wouldn’t otherwise have access to.
Instacart’s plugin for ChatGPT is a great example of just that. It gives the model the ability to interface with Instacart’s product catalog API and add/remove items to a shopping cart. For Instacart, this is effectively providing a net new selling channel, and one that’s entirely conversationally driven. There are risks, however. One, you need to look carefully at what data companies like OpenAI and Microsoft will have access and rights to. Even if it’s not going to be used for training future foundational models, you may be giving up more proprietary data than you’re comfortable with inside the model’s context window (this warrants a whole separate post around data privacy and security). There’s also the risk of disintermediation should more and more people start their online journeys with tools like ChatGPT… while there may be no commission to pay to companies like OpenAI today, you can bet that it’ll be coming should this behavior truly take off.
The way to think about plugins is… wherever you have an API available in your business, you might have a potential use case where you could be feeding additional unique data or supplying new skills to a large language model to perform new kinds of work.
Good use cases:
-Answering questions using resources the model wouldn’t otherwise have access to
-Providing a natural language interface to your business’ functional APIs
What you’ll need to get started:
To build your first plugin, it’ll help to have at least one developer on staff. You won’t necessarily need a data scientist, as you’re not actually performing any model training here per se. The process of building a plugin is centered around pointing the LLM to your API and giving it the right access to work with your endpoints. If you don’t have any publicly accessible APIs to start with, this is where a developer (or likely more) will be absolutely essential. Engineers are also best suited to craft the plugin, once those APIs are available. Once the plugin is published, almost any user, technical or non-technical, will be capable of leveraging the plugin as an end user (using a third party tool like ChatGPT).
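To make that concrete, here’s a rough sketch of a plugin manifest file, following the shape of OpenAI’s published ai-plugin.json spec at the time of writing. The business name, descriptions, and URLs are all hypothetical placeholders:

```json
{
  "schema_version": "v1",
  "name_for_human": "Acme Catalog",
  "name_for_model": "acme_catalog",
  "description_for_human": "Search Acme's product catalog and manage a shopping cart.",
  "description_for_model": "Plugin for searching Acme's product catalog and adding or removing items from the user's shopping cart.",
  "auth": { "type": "none" },
  "api": {
    "type": "openapi",
    "url": "https://example.com/openapi.yaml"
  },
  "logo_url": "https://example.com/logo.png",
  "contact_email": "support@example.com",
  "legal_info_url": "https://example.com/legal"
}
```

Notice that the real work lives in the OpenAPI spec your manifest points to; the `description_for_model` field is essentially a prompt telling the LLM when and how to use your endpoints.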
Method #3: API / App
The way ChatGPT interfaces with the GPT-3.5/GPT-4 models is by way of the model’s API. But these aren’t the only applications that are allowed to leverage the API directly - you can too. By signing up for a paid developer account with Microsoft Azure, for example, you’re able to get a set of keys to call these models, and embed those calls right within any application you’re developing. For example, let’s imagine you’re seeing tons of your employees searching for specific elements of your HR policies, and your intranet search is constantly letting them down (how long is that paternity leave policy again?). This is a case where a model can do a great job retrieving specific answers from documents, and make the experience vastly better for your employees. But, you don’t necessarily want your employees going to ChatGPT for this kind of internal information… you still want to keep them within the safe boundaries of your intranet. You should be looking to go the API route and write a small LLM-assisted search application on top of your intranet that makes use of an LLM’s API (and potentially a vector database if you have a large trove of documents, but that’s an article for another day!).
Writing applications that leverage LLM APIs looks a lot like regular old software engineering, but there are some notable differences. The model is effectively replacing a massive amount of logic you’d ordinarily develop in code (or that would be impossible to write in code), and that’s possible because of the intelligent behavior these models exhibit, like reasoning. When you call out to the model’s API, behind the scenes, it actually looks a lot more like the way you’ve probably become accustomed to interacting with ChatGPT, meaning that you’re expressing things in terms of natural language as opposed to software code with a particular syntax. Let’s explore that a bit further.
First, you’re going to be establishing what’s called the system prompt. Here, you’re initializing the model and giving it some context about what its core task is going to be, and how it should go about doing its work (tone of voice, personality, requisite skills, maybe even some examples to help it avoid hallucination). Then, you’ll be feeding the model dynamic prompts based on inputs coming from your application, which may just be relaying exactly what the user provided as an input, or some augmentation of that input that you perform in your software before feeding it to the model. The model’s output is returned to your application in the API response, and your application can do whatever you please with that output, whether that’s displaying text back to the end user on the UI layer, or perhaps writing something specific in that response to a database, or performing other work - it’ll all depend on your use case here. For our imagined HR search application, it’s likely that we’ll be printing the model’s text output right to the screen for the user to read (perhaps with a cited link to the exact HR document where that answer was found by the model).
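Here’s a minimal Python sketch of that flow for our imagined HR search application, assuming an OpenAI-style chat completions endpoint. The model name, prompt wording, and HR framing are illustrative, not prescriptive:

```python
import json
import urllib.request

# The system prompt initializes the model with its task and constraints.
SYSTEM_PROMPT = (
    "You are a helpful HR assistant. Answer questions using only the "
    "policy excerpts provided. If the answer isn't in them, say so "
    "rather than guessing."
)

def build_messages(user_question: str, policy_excerpts: str) -> list[dict]:
    """Combine the system prompt with an augmented user prompt."""
    augmented = (
        f"Policy excerpts:\n{policy_excerpts}\n\nQuestion: {user_question}"
    )
    return [
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": augmented},
    ]

def ask_model(messages: list[dict], api_key: str) -> str:
    """Call a chat-completions API (payload follows OpenAI's shape)."""
    payload = json.dumps(
        {"model": "gpt-3.5-turbo", "messages": messages}
    ).encode()
    req = urllib.request.Request(
        "https://api.openai.com/v1/chat/completions",
        data=payload,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    # The model's text output comes back inside the API response.
    return body["choices"][0]["message"]["content"]

# Your application augments the raw user input before calling the model.
messages = build_messages("How long is paternity leave?", "Leave policy: ...")
```

From here, your application decides what to do with the returned text: render it in the UI, log it to a database, or attach a citation link to the source document.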
There are lots of patterns for how you go about making these LLM API calls, and using a Python framework like LangChain can help make it easier to test different approaches, even different models, and see what works best for solving your use cases. There’s no exact science here, and it’s going to take experimentation, even for a seasoned data scientist or developer.
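To show what that experimentation loop might look like in plain Python (frameworks like LangChain add conveniences on top of this pattern), here’s a sketch that runs two candidate prompt templates over a small evaluation set. The templates, questions, and stubbed model call are all placeholders:

```python
# Candidate prompt templates to compare against each other.
TEMPLATES = {
    "terse": "Answer briefly: {question}",
    "cited": "Answer the question and cite the source document: {question}",
}

# A small, fixed evaluation set so runs are comparable.
EVAL_QUESTIONS = [
    "How long is paternity leave?",
    "What is the remote work policy?",
]

def call_model(prompt: str) -> str:
    # Stub standing in for a real LLM API call.
    return f"[model answer to: {prompt}]"

def run_experiment() -> dict[str, list[str]]:
    """Run every template over the eval set so outputs can be compared."""
    results: dict[str, list[str]] = {}
    for name, template in TEMPLATES.items():
        results[name] = [
            call_model(template.format(question=q)) for q in EVAL_QUESTIONS
        ]
    return results

results = run_experiment()
```

In practice you’d swap `call_model` for real API calls (possibly to several different models) and add a scoring step, whether human review or automated evaluation, to decide which template wins.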
The API integration path is likely to be the most common way businesses tap into the power of LLMs over time. And it’ll happen not only in first party applications, but also in plenty of third party SaaS and hosted apps, which means all the usual build vs buy considerations will come into play here for businesses. You may find that you can bypass all the cost of experimentation and trial and error by licensing someone else’s AI powered app, particularly for highly commoditized use cases. You’ll see this quickly start to happen for data analysis, for example, where software companies will tune their approaches in a way that common patterns of analysis are performed optimally, and the major difference from client to client is the client data.
You’ll also start to see simple JavaScript implementations where you can incorporate a third party provider’s software into your own applications with just a few lines of code on a webpage, much like the way most “live chat” services are embedded into websites today (they “overlay” the content as opposed to integrating fully inline, but they keep the user within the same domain without ever leaving). As appealing as this “overlay” approach may look as a fast path to AI for your business (especially for AI support agents), I tend to think of it as the “quick and dirty” path, and businesses who invest the time to truly embed models into the fabric of their applications will identify more significant use cases and build real points of differentiation into their software and their end user experiences.
Good use cases:
-situations when you want to keep users inside YOUR apps
-e.g. a content management system that wants to incorporate generative AI models to author content faster
-e.g. an email service provider that wants to use an AI model to translate emails with the push of a button inside the authoring interface
-e.g. an educational software provider that wants to incorporate an automated grader right into its native software
-e.g. an online retailer that wants to incorporate an AI chatbot into its website to avoid sending phone calls to its call center
-many, many more…
What you need to get started:
Here, software developers are absolutely essential, but even more than that, unless you’re licensing a third party tool, you should anticipate building a product team to manage the full lifecycle of that software. That means assembling a team with a Product Manager, at least one Product Designer, and then at least one Product Engineer and a Quality Assurance (QA) specialist (at a minimum). Most teams will likely want to have a Data Scientist here as well, particularly if they’re going to be fine-tuning any models using the business’ own proprietary data (more on that next!).
Method #4: Fine-Tuning a Foundational Model
I mentioned at the top that too many businesses have falsely assumed that they need to get in the business of developing their own foundational models. Fine-tuning is where they actually should be looking to focus their energy.
First off, you have to understand that developing a foundational model like GPT-3 or GPT-4 not only takes a team of PhD-level AI specialists (not to mention AI safety specialists, ethicists, etc.), but perhaps more importantly it also requires MASSIVE compute capability, and that comes with serious costs. It’s estimated that training runs for some of the biggest LLMs currently run north of $10-50MM each. That’s just the cost of compute infrastructure and energy. It’s closer to $100MM when factoring in the human labor to develop/license the datasets for training, build the model architecture, perform reinforcement learning, and so on… Foundational models are not cheap.
But you likely don’t need your own foundational model, anyway. Foundational models are trained on massive, widely varied datasets of everything from internet conversations, to books, to code, and more all in an effort to create a neural network with rich capabilities, like the ability to reason, to write code, to understand context and intent. Foundational models are generalists. They’re the “brain.” And while they can do some incredibly remarkable things, the more you try to get them to be an expert on any one particular thing, like your business, you’ll quickly see that they’ve never been trained on YOUR data (how could they, unless you’re making just about everything public!). They’ll often either tell you they don’t know what you’re asking them about, or they’ll hallucinate something entirely plausible sounding, but likely false. That’s where fine tuning comes in.
Rather than building a whole new model, what you can do with fine-tuning is train an existing foundational model on your own proprietary data. The model still retains all the smarts of the foundational model when it comes to completing tasks and being creative, but it now gains some experience working with your business’ data. And with techniques like LoRA, this fine-tuning exercise requires a tiny fraction of the compute required to produce a foundational model (and therefore, much lower costs too). This is work that’ll typically need to be performed by an experienced Data Scientist. There are safety considerations to make here too, but more and more, foundational model providers are offering tooling that helps simplify the safety checks needed in this work. I’m oversimplifying here, and the truth is that no one has AI safety solved at this point, but that too is another post, so we’ll leave it at that for right now.
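A large share of the day-to-day work here is preparing training examples from your proprietary data. As a hedged sketch, hosted fine-tuning APIs have commonly accepted prompt/completion pairs in JSONL form (the field names below follow OpenAI’s fine-tuning data format at the time; the Q&A records are invented):

```python
import json

# Hypothetical proprietary Q&A pairs drawn from internal documents.
RAW_EXAMPLES = [
    ("How long is paternity leave?",
     "Paternity leave is 12 weeks, fully paid."),
    ("Who approves travel expenses?",
     "Your direct manager approves travel expenses."),
]

def to_jsonl(examples: list[tuple[str, str]]) -> str:
    """Serialize (prompt, completion) pairs as one JSON object per line."""
    lines = [
        json.dumps({"prompt": prompt, "completion": completion})
        for prompt, completion in examples
    ]
    return "\n".join(lines)

jsonl = to_jsonl(RAW_EXAMPLES)
```

The curation step, deciding which examples represent your business well and cleaning them up, is usually far more time-consuming than the serialization itself, and it’s where an experienced Data Scientist earns their keep.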
Once you’ve fine-tuned this model on your data, you have a model that’s fully yours, and private (usually, anyway). You can call your model in your own applications using the method described directly above, and that’s likely how you’ll see most of these implementations go. I won’t get into hosting here, but assume that with any very large fine-tuned model, you probably won’t be hosting it yourself… it’ll likely be cloud hosted by Google, Microsoft, etc. and you’ll be interfacing with it via API.
Good use cases:
-situations where you need the model to understand YOUR business’ unique data, and make predictions around it
What you need to get started:
Fine-tuning will typically require at least one or more data scientists. You may not need a PhD-level researcher designing the latest model architectures; rather, someone experienced in working with existing foundational models, working in Python, running experiments, testing models, and so on. Assuming you’re going to be calling your newly fine-tuned model from a software application you’re developing, you’ll likely also need a product team consisting of at least one Product Manager, Product Designer, Product Engineer, and QA specialist (at a minimum) to bring the model into the right software environment for end users.
Method #5: Custom/Bespoke Foundational Model (Most Advanced)
Okay, onto the last. Very few organizations will need to go this route, but if you find yourself in a unique situation where you’re in the business of, say, protein folding, or earth simulation where language has little to no importance in the work you do, then a bespoke foundational model might be right for you.
But even at that, we’re seeing OpenAI, Google, and other players like Anthropic and Stability AI develop a whole range of foundational models well suited to unique business applications like drug discovery, climate research, healthcare, and more, so you may well find yourself just fine-tuning any one of these foundational models.
I won’t get too deep into what’s involved here, but I will note that you’re going to need to assemble a large team of data scientists, data engineers, developers, product managers, safety experts and ethicists (among other roles) to construct a massive training dataset, design a (likely transformer based) model architecture, build or lease a multi-million dollar GPU supercomputer, train a model, perform reinforcement learning to better align the model weights to drive optimal outcomes and safety, and so on… Expect to spend tens if not hundreds of millions of dollars, and keep these folks on staff for the foreseeable future. In other words, your use case’s expected return must be sufficiently high to overcome the high costs and achieve any reasonable ROI.
Good use cases:
-situations where natural language is completely unrelated to the task at hand
-e.g. a research university performing protein folding or climate simulation
-e.g. a pharmaceutical manufacturer that’s discovering new drugs and needs to simulate molecule interactions
What you need to get started:
As noted above, it’s going to take a sizable team of highly experienced (often PhD-level) data scientists, data engineers, high performance computing specialists, QA resources, AI alignment and safety specialists, Product Managers, and so on to develop a competitive foundational model. You’re effectively starting from scratch, and in most cases, that simply won’t be necessary.