Climbing The GenAI Learning Curve, Safely - Part I
I was at a conference recently where a presenter warned the audience, "Whatever you put on ChatGPT is out there. Gone for good. Out of your control."
We hear that dire warning a lot, and it raises serious concerns about business use of public tools like ChatGPT or Bard. But the warning could also be more cautious than it needs to be, and cost you more than it buys in protection. Let's see.
If you're like me, you like Bottom Lines Up Front (BLUF). When it comes to the safe use of generative AI (GenAI) there are two:
We'll dig into the first BLUF in this article, and the second BLUF in the next article.
What Is Generative AI, And How Does It Work?
Most software we use is deterministic. It produces the same output given the same inputs and conditions. And we want that predictability. We don't want to send an email to recipient A and have it go to recipient B, because our email program spotted a relationship. We don't want a spreadsheet to return different results using the same inputs and the same formulae, because the app thought it would be fun to see different sales or budget scenarios.
By contrast, GenAI is generative. It's designed to produce diverse and even creative outcomes from the same or similar inputs. It literally generates text, images, and music that didn't exist before a user and tool produced them. With most of the software we use, we rely on its predictability. But we use GenAI to introduce creativity and variation. We want it to brainstorm with us. To summarize a report for us. Or to change the tone of an email for us.
The way GenAI does this is by recognizing language patterns. GenAI tools recognize the relationships among the words, phrases, and sentences you use, and then use statistical probability to choose the best sequence of words, phrases, and sentences to return to you. When you hear talk of GenAI training, this is what's meant - training large language models to recognize and use language patterns, based on statistical probability.
For analysis and prediction to be useful, GenAI must be trained on enormous volumes of data. ChatGPT was trained on 300B words, including scoring and weighting them based on how they were used in sentences. Many proprietary, domain-specific GenAI tools connect to ChatGPT so users can benefit from its deep training and sophisticated pattern recognition. More on this below.
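To make "training on language patterns and statistical probability" concrete, here is a deliberately tiny sketch: a bigram model that counts which word follows which in a made-up corpus, then picks each next word in proportion to those counts. Real large language models are vastly more sophisticated (neural networks over billions of words, not word-pair counts), so treat this only as an illustration of the statistical idea; the corpus and names are invented.

```python
import random
from collections import defaultdict, Counter

# Toy "training corpus" (invented): a few sentences about tomato sauce.
corpus = (
    "some tomato sauce recipes use fresh basil . "
    "some tomato sauce recipes use a dash of soy sauce . "
    "many cooks simmer tomato sauce for hours ."
).split()

# "Training": count which word follows which (a bigram model) -
# the simplest possible version of learning language patterns.
follows = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    follows[prev][nxt] += 1

def next_word(prev):
    """Pick the next word in proportion to how often it followed prev."""
    counts = follows[prev]
    return random.choices(list(counts), weights=list(counts.values()))[0]

# "Generation": extend a prompt word using those learned statistics.
word, generated = "tomato", ["tomato"]
for _ in range(5):
    word = next_word(word)
    generated.append(word)
print(" ".join(generated))
```

Because "tomato" was always followed by "sauce" in the corpus, the model will always continue that way; where the corpus showed variety ("sauce recipes", "sauce for", "sauce ."), the continuation varies run to run. That is the generative behavior described above, in miniature.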
What Has GenAI Training To Do With Safe Use?
The way GenAI trains and works tends to limit what others can know about your use. While it's true that GenAI tools read your inputs (called prompts) and can store them and your entire interaction for future training, GenAI's focus on language patterns rather than whole entries tends to limit what others can learn and, thus, controls some risk for you.
Consider an example.
Let's say you cook and want to try a new tomato sauce. You search online for something you haven't tried or heard of. A search engine will use your search terms to return entire recipes to you. All the ingredients, quantities, steps, and times for you to read - as you would expect.
But what if you use a GenAI tool to find something interesting? And what if I had previously put my grandmother's secret tomato sauce recipe in the GenAI tool you use? Would you find it? The tool read my grandmother's recipe and might have stored it for future training. So why wouldn't it return the recipe to you?
Because it analyzes language patterns and returns language patterns to you. It's not programmed to find and return her recipe as an object. Rather than return my grandmother's recipe intact the way a search engine would, a GenAI tool could very well say to you, "Some tomato sauce recipes use a dash of soy sauce at the end," because it saw that in my grandmother's recipe. And because it's designed to be innovative and creative, it could offer that tip along with others - all based on its analysis of thousands (tens of thousands?) of tomato sauce recipes.
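The search-versus-generation contrast above can be sketched in a few lines. A keyword search stores documents whole and hands them back intact; a pattern-based model reduces the same documents to statistics about word sequences, so there is nothing whole to hand back. This is a toy illustration under invented names and recipes, not how any real search engine or LLM is implemented.

```python
from collections import Counter

# Two stored "recipes" (invented for illustration).
documents = {
    "grandmas_sauce": "simmer tomatoes with garlic then add a dash of soy sauce",
    "basic_marinara": "simmer tomatoes with garlic basil and olive oil",
}

def search(query):
    """Keyword search: return every stored document containing the query."""
    return [text for text in documents.values() if query in text]

# "Training": reduce the documents to word-pair counts. After this step
# the model holds patterns, not the recipes themselves.
pairs = Counter()
for text in documents.values():
    words = text.split()
    pairs.update(zip(words, words[1:]))

print(search("soy sauce"))      # search returns the whole recipe, intact
print(pairs[("soy", "sauce")])  # the model only knows "soy" precedes "sauce"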
Although our GenAI tool is highly unlikely to return my grandmother's entire recipe to you - because it's focused on language patterns within the recipe - her secret is out for other cooks to try. (I didn't just give it away, did I?) At this point we must ask ourselves about the risk of that, about the harm we might have produced. This gets into our second bottom line, our subjective BLUF, and we'll look at that in our next article.
What About Proprietary or Domain-Specific GenAI Tools?
I mentioned that proprietary, domain-specific GenAI tools connect to ChatGPT so users can benefit from its deep training and sophisticated pattern recognition. Examples include the GenAI-assisted capture and proposal tools DWPA is using as part of its GenAI Discovery Project. The nomenclature here is imprecise and deserves our attention for a moment.
Talk of GenAI uses the terms open, public, proprietary, licensed, and more. Boundaries between terms are not so clean that one or two terms always and only apply to ChatGPT or Bard, while other terms always and only apply to ACME Inc's AI-assisted proposal tool.
It's easy enough for a team or company to sort this out to be sure it's talking about the same platform or product. What's a little trickier, however, is to understand how safe use of ACME Inc's AI-assisted proposal tool compares with safe use of ChatGPT, Bard, or others of that kind. This is because there's a business and operational relationship between them that isn't well understood - but needs to be.
For the sake of easy reference, let's divide products this way:
I realize this might ignore distinctions between publicly and privately held companies, confuse architectures, fail to account for products with free and paid versions, and more. That's okay because making those distinctions won't change what we're saying about safe use.
One advantage of a private tool for capture and proposal support is the ability to build a document repository of past proposals, resumes, performance reviews, and other relevant corporate documents. This makes your interaction with the tool more specific to your company's content, solutions, writing style, and more. In fact, over time the tool should improve its analytic and predictive capabilities using your documents as source material. This is what we want.
One disadvantage, however, is volume. The smaller the body of source material, the weaker any tool's analytic and predictive capabilities will be. Vendors solve this problem by reaching back to the vast body of Internet information ChatGPT is built on. Those vendors ensure your private document repository is firewalled, and often ensure that none of your content - repository, prompts, or prompts and responses - is used to train ChatGPT.
You get the best of both worlds this way: the specificity and control of your own content, with the deep analytic and predictive power of public tools.
Where there's a business and operational relationship between a private tool you use and a public tool, everything said above about public GenAI tools generally applies.
So, What's The Bottom Line?
Recall my colleague's dire warning at the conference: "Whatever you put on ChatGPT is out there. Gone for good. Out of your control."
It's true that the content of your prompts is out there. But it's also true that the way GenAI uses what's out there reduces some risk for you. How safe that feels is a subjective judgment we'll talk about in the next article. But understanding how GenAI trains helps you understand how information you provide in prompts can show up for future users.
At DWPA, we use public tools knowing that for most uses there's zero chance we'll give competition any advantage - because there's no advantage at stake. There's no soy sauce in the prompts. For uses where there's some chance we'll give something away, we know it's a small chance, and we weigh the gain we want against the harm we don't want, and act accordingly.
We've not used private tools beyond trials, so we can't speak to our practices with them. But we know the same conditions apply as with public tools, except that private tools have additional safeguards built in. If you use or are considering a private tool, talk to your vendor about the source of content for training, and which part of the platform, product, or tool (there is no one right word) gets trained.
And whether using a public or private tool, read your tool's privacy policy or statement. Yes, they're not written for human consumption. But gut it out so you know what's happening to your data. You'll probably see a choice for opting your content out of tool training. DWPA has exercised that option.
Beyond understanding how GenAI tools train and work, safe use comes down to use cases and risk tolerance. We'll look at that in the next article but, for now, we'll leave you with the thought that you probably already engage in a practice which is like determining GenAI safe use: asking questions at an industry day, or in written Q&A during a solicitation process.
You can ask in ways which show your hand, or in ways which don't. You weigh the odds of gaining information to your advantage versus benefiting your competition and neutralizing your gain. You might have done this for years, and it's a risk-reward decision similar to deciding how to use GenAI, especially public tools.
Follow DWPA's company page for weekly discovery insights. To learn more or launch your own discovery project, contact [email protected].