Serious about Enterprise LLMs? 11 building blocks to keep in mind.
Sanjeev Somani
CEO @ Tribyl. Improve revenue conversion by eliminating guesswork and opinions.
Introduction
ChatGPT amassed 100 million users just two months after launch. A recent Gartner poll indicates that 75% of surveyed organizations are exploring generative AI and Large Language Models (LLMs), with 19% already in pilot or production mode. The buzz is real.
For enterprises, the real opportunity lies beyond content generation and helping workers save time. It lies in reimagining problems, making workflows smarter, and unlocking significant economic value.
Figuring out how to write a prompt is just the tip of the iceberg! Enterprise use cases require a purpose-built architecture that solves for trust, accuracy, relevance, consistency, security, manageability, adoption, and ROI.
Tribyl was founded soon after the first transformer-based language models came out in 2018. Based on practical experience building Enterprise NLP products (using models like BERT, XLNet, and GPT 3+), this article introduces the foundational capabilities required to make LLMs a real 'co-pilot' for Enterprises. I've previously referred to these capabilities as a "semantic intelligence layer".
If you're a Buyer, this checklist will inform your LLM strategy and vendor evaluation. If you're a Founder or investor, these insights will prove valuable for your product roadmap and investment theses. Either way, I'd love to hear your thoughts and feedback!
The Enterprise LLM iceberg, exposed.
The Gartner poll mentioned above goes on to say, "initial enthusiasm for a new technology can give way to more rigorous analysis of risks and implementation challenges." I agree. Here are 11 building blocks to ensure the current A.I. hype doesn't end in another A.I. winter.
1. Data lakehouse: Are you providing users with new and actionable intelligence to improve outcomes, or are you replacing dashboards with prompts? Your dashboard can tell you the current win rates, but can it come up with a plan to drive them higher?
LLMs have the power to answer the WHY behind metrics, and to recommend HOW to improve them. Doing so requires marrying unstructured (context) data with structured (operational) data; a minimal sketch of that marriage follows at the end of this section.
For example, to diagnose and improve revenue conversion, it’s important to combine 3 data sources:
Questions to consider:
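Here's a minimal sketch of what that marriage can look like. Everything below is illustrative: the table shapes, column names, and use-case labels are hypothetical stand-ins, not a prescribed schema.

```python
import pandas as pd

# Structured (operational) data: deal outcomes from the CRM.
opps = pd.DataFrame({
    "opp_id":  [1, 2, 3, 4],
    "outcome": ["won", "lost", "won", "lost"],
})

# Unstructured (context) data: use cases detected in call transcripts
# by an annotation pipeline (see building block #4 below).
annotations = pd.DataFrame({
    "opp_id":   [1, 1, 2, 3, 4],
    "use_case": ["Security and Compliance", "Customer 360", "Customer 360",
                 "Security and Compliance", "Customer 360"],
})

# Marry the two: win rate by use case -- the kind of WHY a dashboard alone can't answer.
merged = annotations.merge(opps, on="opp_id")
win_rate_by_use_case = (merged.assign(won=merged["outcome"].eq("won"))
                              .groupby("use_case")["won"].mean())
print(win_rate_by_use_case)
```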
2. Context and Relevance: Large Language is not the same as YOUR Language! Is 'Security and Compliance' a Use Case, or a step in your procurement process, or both? Is 'Customer 360' a Use Case, and 'Unified Customer Profile' the enabling product feature? Or is it the other way around?
Without context, LLMs will hallucinate and create noise, driving mistrust and a premature end to your vision.
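One hedge against this: ground every prompt in your own taxonomy before the model sees it. A minimal sketch, where the glossary entries and prompt wording are purely illustrative:

```python
# YOUR language, not just "large language": a company-specific glossary
# that disambiguates terms before they reach the model.
GLOSSARY = {
    "Security and Compliance": "A buyer Use Case, AND a step in our procurement process.",
    "Customer 360": "A buyer Use Case.",
    "Unified Customer Profile": "The product feature enabling the 'Customer 360' Use Case.",
}

def ground_prompt(user_prompt: str) -> str:
    """Prepend definitions for any glossary terms the prompt mentions."""
    hits = [f"- {term}: {definition}"
            for term, definition in GLOSSARY.items()
            if term.lower() in user_prompt.lower()]
    if not hits:
        return user_prompt
    return ("Use these company-specific definitions:\n"
            + "\n".join(hits) + "\n\n" + user_prompt)

print(ground_prompt("Summarize deals where Customer 360 came up."))
```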
Questions to consider:
3. Consistency and Accuracy: If you've spent any time writing prompts, you know that the slightest wording change can lead to big differences in output. Imagine showing up to a meeting where everyone's got their own version of the truth!
Planning to hire an army of expensive prompt engineers to ensure quality and consistency? Look no further than the fate of Business Intelligence and the infamous dashboard backlog. You can't throw people at the self-service problem.
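One alternative to the prompt-engineer army: centralize prompts as versioned templates with pinned model settings, so every user asks the same question the same way. A sketch -- the template registry and the commented-out `call_llm` function are hypothetical:

```python
from string import Template

# A centrally managed, versioned prompt template. Users fill in
# parameters; they don't freelance the wording.
PROMPT_TEMPLATES = {
    ("win_rate_analysis", "v3"): Template(
        "Using only the annotated deal records provided, explain why the "
        "win rate for the '$use_case' Use Case was $win_rate% last $period."
    ),
}

def build_prompt(name: str, version: str, **params) -> str:
    return PROMPT_TEMPLATES[(name, version)].substitute(**params)

prompt = build_prompt("win_rate_analysis", "v3",
                      use_case="Security and Compliance",
                      win_rate=34, period="quarter")
# Hypothetical call; temperature=0 trades creativity for repeatability.
# response = call_llm(prompt, temperature=0)
```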
Questions to consider:
4. Transparency and Explainability: 'Take my word for it' doesn't fly when an Executive asks for the reasoning behind your analysis. Unfortunately, LLMs are opaque models that can't be easily explained.
To get around this problem, it's important to separate your organization's knowledge model from generic language models. For example, a knowledge retrieval system (powered by ElasticSearch or a vector database) can accurately identify and annotate call transcripts involving a discussion of the 'Security and Compliance' Use Case. An LLM can then synthesize the discussion. Users can easily verify the summary by drilling into the source transcripts and skimming the (contextually relevant) annotations.
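A minimal sketch of that separation, keeping source references attached so every summary stays verifiable. The retrieval stub and the commented-out `call_llm`/`render` calls are stand-ins, not a specific vendor API:

```python
# Step 1: knowledge retrieval finds and annotates the relevant passages.
# (In practice: ElasticSearch or a vector database; stubbed here.)
def retrieve_passages(topic: str) -> list[dict]:
    return [
        {"call_id": "c-101", "ts": "12:40",
         "text": "Their CISO asked how we handle SOC 2 evidence..."},
        {"call_id": "c-214", "ts": "03:15",
         "text": "Compliance reviews were blocking the rollout..."},
    ]

passages = retrieve_passages("Security and Compliance")

# Step 2: the LLM only synthesizes what retrieval surfaced, and must cite it.
prompt = ("Summarize the 'Security and Compliance' discussion below. "
          "Cite passages by call_id.\n\n"
          + "\n".join(f"[{p['call_id']} @ {p['ts']}] {p['text']}" for p in passages))

# Step 3: show the summary WITH its sources, so users can drill in.
# summary = call_llm(prompt)           # hypothetical LLM call
# render(summary, sources=passages)    # hypothetical UI hook
```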
Questions to consider:
5. Predictions and metrics: What outcomes are you "hiring" the LLMs for? While saving time is a common benefit (e.g., summarizing / generating content), that alone won't get you promoted, nor will it get your CFO to fund LLM projects.
Take prospecting emails, for example. Soon, we'll be inundated by LLM-generated emails that you can't tell apart from the competition's. What if LLMs could instead personalize emails by rinse-and-repeating the messaging that drove the highest win rates last quarter?
To drive such outcomes, LLMs need to sift signal from noise. They need to distinguish between correlation and causation. Just because the 'Security and Compliance' Use Case got discussed in many deals doesn't mean it was pivotal to winning all of them!
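A first pass at sifting that signal, over a hypothetical deal table: compare win rates with and without the topic rather than counting mentions. Even a lift like this is still correlation; causal claims need controls for segment, deal size, and the like.

```python
import pandas as pd

# Hypothetical deal records: did the use case come up, and did we win?
deals = pd.DataFrame({
    "mentioned_sec_compliance": [True, True, True, False, False, False, True, False],
    "won":                      [True, True, False, False, True, False, True, False],
})

by_mention = deals.groupby("mentioned_sec_compliance")["won"].mean()
lift = by_mention[True] - by_mention[False]
print(f"Win rate with mention:    {by_mention[True]:.0%}")
print(f"Win rate without mention: {by_mention[False]:.0%}")
print(f"Lift (correlation only, not causation): {lift:+.0%}")
```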
Questions to consider:
6. Job-specific user experience: High-impact LLM use cases are going to manifest as new or reimagined workflows. To drive adoption, we'll need job-specific, 'intelligent UX' design.
For example, it's great that your call recording tool uses LLMs to summarize calls. That's still 1000s of summaries, though. Will Product Marketing have time to review them and produce a monthly win/loss report? Can Sales Enablement extract repeatable sales plays for landing and expanding customers? Can reps use the insights to personalize emails and calls in seconds?
As you think through UX, it's important to keep root cause analysis in mind. For example, if your sales enablement content is stale and generic to begin with, summarizing it with LLMs won't drive adoption and impact.
Questions to consider:
7. Training and feedback:
Your business is evolving rapidly. For LLMs to stay relevant and support daily processes, it's important that non-technical users be able to train the models easily and safely.
Examples from the go-to-market world that'll trigger retraining: new use cases, personas, product features, solutions, competitors…
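One safe pattern is to let business users edit a plain taxonomy (a shared file or admin UI) and have the pipeline detect the delta and queue re-annotation, with no engineer in the loop. A sketch; the taxonomy structure and category names are hypothetical:

```python
# A plain taxonomy that non-technical users can edit.
def new_entries(old: dict, new: dict) -> dict:
    """Entries added since the last run -- these trigger re-annotation."""
    return {category: sorted(set(new[category]) - set(old.get(category, [])))
            for category in new
            if set(new[category]) - set(old.get(category, []))}

yesterday = {"competitors": ["Acme"], "use_cases": ["Customer 360"]}
today = {"competitors": ["Acme", "Initech"],
         "use_cases": ["Customer 360", "Security and Compliance"]}

changes = new_entries(yesterday, today)
if changes:
    # Hand the delta to a re-annotation job queue.
    print("Queueing re-annotation for:", changes)
```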
Questions to consider:
8. Performance and latency:
ChatGPT is a single-user experience with usage and prompt-size limits. Enforcing similar limits in Enterprise applications will end badly for adoption.
Yet, applications can come to a standstill without performance optimization. Here's a scenario:
User 1: What revenue was generated by the 'Security and Compliance' use case last quarter?
User 2: -- 5 minutes later -- ditto, but only for new customers signed last month.
User 3: -- 10 minutes later -- ditto, but only for the financial services vertical last quarter.
As you can tell, the results from User 1's prompt are sufficient to answer User 2's and User 3's prompts. Enterprise LLM tasks can involve dozens of context filters. So how do we prevent 100s of similar prompts from hitting the data lakehouse over and over?
To ensure reasonable response times and performance, queries must be cached. The caching logic must be smart enough to parse prompts and serve up cached results when possible. The cache must update itself incrementally.
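A minimal sketch of such a cache: normalize each prompt into its structured filters, key the cache on those filters, and let narrower queries reuse broader cached results. The parser and filter names are illustrative stand-ins:

```python
# Cache keyed by the prompt's *parsed filters*, not its raw wording.
CACHE: dict[frozenset, list[dict]] = {}

def parse_filters(prompt: str) -> dict:
    """Toy stand-in for a parser (rule-based or LLM-assisted) that extracts
    structured filters from a free-form prompt."""
    filters = {"use_case": "Security and Compliance", "period": "last_quarter"}
    if "financial services" in prompt.lower():
        filters["vertical"] = "financial_services"
    return filters

def cached_query(prompt: str, run_query) -> list[dict]:
    key = frozenset(parse_filters(prompt).items())
    # Reuse an exact hit, or a same-or-broader cached result filtered in memory.
    for cached_key, rows in CACHE.items():
        if cached_key <= key:
            extra = key - cached_key
            return [row for row in rows
                    if all(row.get(field) == value for field, value in extra)]
    rows = run_query(dict(key))  # only now hit the data lakehouse
    CACHE[key] = rows
    return rows
```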
Questions to consider:
9. Cost of ownership:
New use cases, more data sources, growing user counts, faster performance… all of it can lead to higher costs. ChatGPT is funded by outside capital; cost to serve isn't its #1 priority. The same can't be said of Enterprise LLM applications. You need to forecast and budget for one-time and ongoing costs.
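Back-of-envelope forecasting is straightforward once you pin your assumptions. Every number below is a placeholder to replace with your own vendor pricing and usage estimates:

```python
# All values are illustrative assumptions -- substitute your own.
users = 500
prompts_per_user_per_day = 10
tokens_per_prompt = 2_000           # prompt + retrieved context
tokens_per_response = 500
price_per_1k_input_tokens = 0.003   # USD, hypothetical
price_per_1k_output_tokens = 0.006  # USD, hypothetical
workdays_per_month = 22

tokens_in = users * prompts_per_user_per_day * tokens_per_prompt * workdays_per_month
tokens_out = users * prompts_per_user_per_day * tokens_per_response * workdays_per_month
monthly_llm_cost = (tokens_in / 1_000 * price_per_1k_input_tokens
                    + tokens_out / 1_000 * price_per_1k_output_tokens)
print(f"Estimated monthly inference cost: ${monthly_llm_cost:,.0f}")
# Don't forget one-time costs (integration, annotation) and ongoing ones
# (vector DB hosting, re-indexing, evaluation, support).
```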
Questions to consider:
10. Security and compliance: Remember BYOD -- Bring Your Own Device to work? Eventually, the CIO clamped down, and it was a win-win: users got better support, while the company reduced its security risks.
The same's going to happen to the current BYOP movement -- Bring Your Own Prompt.
A consequence of using LLMs to make workflows smarter is that intelligence will be stitched together from a variety of data sources, including those that users didn’t have prior access to.
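That's why access control has to travel with the data into the retrieval layer, not just sit at the app login. A sketch; the ACL fields and user roles are hypothetical:

```python
# Each retrieved chunk carries the ACL of its source document.
chunks = [
    {"text": "Q3 board deck excerpt...", "allowed_roles": {"exec"}},
    {"text": "Discovery call snippet...", "allowed_roles": {"exec", "sales", "marketing"}},
]

def authorized_chunks(chunks: list[dict], user_roles: set[str]) -> list[dict]:
    """Filter retrieval results BEFORE they reach the prompt, so the LLM
    can never synthesize from sources the user couldn't open directly."""
    return [c for c in chunks if c["allowed_roles"] & user_roles]

context = authorized_chunks(chunks, user_roles={"sales"})
# Only the discovery-call snippet makes it into the prompt for a sales rep.
```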
Questions to consider:
11. Future-proofing:
If you're with me so far, the biggest question to ask is -- how do we solve for the above LLM building blocks in the Enterprise? Here's why it matters: over two-thirds of respondents in the same Gartner poll said they want to use LLMs to drive revenue growth and customer retention. That corresponds to 1000s of SaaS tools in the marketing, sales, and customer success space. Imagine dealing with as many LLM implementation approaches!
Questions to consider:
Conclusion
The real promise of large language models lies in surfacing new intelligence and reimagining Enterprise workflows. Adding shiny LLM features to legacy SaaS tools barely scratches the surface.
Implementing this vision requires the purpose-built foundation described above. We're calling it the "semantic intelligence layer."
Where will this layer sit? In each of the 100s of enterprise tools? Likely not, for reasons discussed earlier. We think this is a new category, as it requires a ground-up architecture that plays nice with all tools and data sources (current and future). That's the approach we've been taking in building Tribyl.
Who will be the winners and losers because of this disruption? Will the current hype cycle end in another A.I. winter? It's still too early to tell. A lot depends on whether -- and how fast -- customers and investors shift from the SaaS-first playbook they're used to, to an A.I.-first one. There are bound to be significant differences in GTM, product, pricing, adoption, fundraising, and exit strategies.
What do you think?