Mastering GenAI in Business: Productivity, Applications & Market Dynamics
Philipp Masefield X GenAI (Mistral Medium, GPT-4 Turbo, DALL-E 3) [^0]


TL;DR

The article explores the business implications and potential applications of large language models (LLMs) in three areas: market dynamics, the future of knowledge work, and working with LLMs. It highlights the need for productive, value-generating applications to drive returns in the AI ecosystem and identifies two competitive AI fronts: high-performance, general-purpose models and highly-specialized models tailored for specific use cases. The article also introduces a third position: good-enough models that offer sufficient performance for many use-cases at lower costs. The future of knowledge work section discusses the potential impact of generative AI on knowledge work and jobs, emphasizing the importance of focusing on automating tasks, not entire jobs. Finally, the working with LLMs section offers practical advice for leveraging large language models effectively, emphasizing the need to embrace the probabilistic nature of LLMs, experiment to discover capabilities and limitations, guide LLMs with practical wisdom, and craft prompts iteratively. [^1]

This is the second piece in my three-part series on my generative AI (GenAI) learning journey since ChatGPT's viral launch just over a year ago. In the first article, “Charting GenAI's Course with a Tech-First, Multidisciplinary Approach”, I had:

  1. Highlighted the importance of viewing GenAI through the lenses of Technology, Business, and Society.
  2. Placed current GenAI in the broader context of AI, as a new "fourth wave" resulting from decades of development.
  3. Discussed the key factors driving rapid advancements: algorithms, data, and compute power.
  4. Stressed the need to conceptually understand the technology to identify business opportunities.
  5. Introduced relevant concepts such as models, prompts, and hyperparameters.

The final article in this series will explore the Societal perspective.


If 2023 was the year of 'tech fascination' with the new Generative AI (GenAI) phenomenon, will 2024 be the year of finding productive, value-generating applications?

One driver for this shift towards value will be the economic pressure to generate returns. As Andrew Ng clearly points out, for the AI ecosystem to be sustainable, there needs to be real value-generating applications, whether consumer or business focused, that can generate revenue to support the companies developing these large language models. So far, the space has largely been funded by investors, but for it to move beyond fascination into true productivity, monetizable applications using this technology will likely have to emerge over the next year to close that revenue gap. [^Cent]

In the following sections, we'll explore the business implications of Large Language Models (LLMs) across three key aspects:

  1. Market Dynamics: The market for LLMs is characterized by a two-tiered competitive landscape, with a small number of high-performance, general-purpose models and a broader range of highly-specialized models. There might also be a third position.
  2. Future of Knowledge Work: GenAI has the potential to significantly impact knowledge work, particularly by automating tasks and leveling the playing field for certain types of work.
  3. Getting Hands-on: Effectively using LLMs requires understanding their probabilistic nature, experimenting to discover their capabilities and limitations, guiding them with practical wisdom, and crafting prompts iteratively.



Market Dynamics

Providers of LLMs

In November 2023, I shared my sketch of how I understood the market dynamics playing out around LLM capabilities, illustrating a two-dimensional landscape of performance and specificity. I concluded that two competitive AI fronts existed:

(1) A very small number of high-performance, general-purpose models that continually push the boundaries of what's possible. Dominated by a small number of players with substantial resources - OpenAI, Anthropic, Google, maybe one or two more players? - this space is as competitive as it is innovative.
(2) A broader area for highly-specialized and much smaller models tailored to their specific use cases. Because of their smaller size, they are far more economical to build and operate, and can even outperform the largest general-purpose models in quality within their specific niche.

The basic logic still applies, and I still see this as the overall market dynamics. Yet, recently through personal experimentation, I have come to appreciate that there might be a third position:

(3) Good-enough performance models, often open-source, offering sufficient performance for many use-cases at lower costs, enabling new business models (think freemium) or large-scale processing (for a viable cost).

This is a view that is also reflected in a recent analysis concluding that the high-end models will likely capture most of the long-term value. However, it's also evident that second-tier models, which strike a balance between quality and cost, will create a significant market niche worth billions of dollars, especially when optimized. Illustrating this, here are sample prices for different models (as of writing, available on OpenRouter, indicated per 1 million tokens input / output):

Category 1: Context >100k: [^2]

  • OpenAI GPT-4 Turbo (preview) $10.00 / $30.00
  • Anthropic Claude v2.1 $8.00 / $24.00

Category 3: Use-cases with a 4k context window:

  • GPT-3.5 Turbo $1.00 / $2.00 (note: this is what we had with ChatGPT in early 2023)
  • Nous Capybara 7B, Mistral 7B Instruct, Hugging Face Zephyr 7B: all currently free
  • Meta Llama v2 13B Chat $0.15 / $0.15
  • Mistral Mixtral 8x7B (beta) $0.30 / $0.30 (but for 33k context)
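To make these price differences concrete, here is a small sketch that computes the cost of a single request from the sample prices listed above (USD per 1 million tokens, input/output, as of writing). The model labels and token counts are illustrative, not live quotes:

```python
# Per-request cost comparison using the sample OpenRouter prices quoted
# above (USD per 1M tokens, as input/output pairs; prices as of writing).
PRICES_PER_M = {
    "gpt-4-turbo": (10.00, 30.00),
    "claude-2.1": (8.00, 24.00),
    "gpt-3.5-turbo": (1.00, 2.00),
    "llama-2-13b-chat": (0.15, 0.15),
}

def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Cost in USD for one request, given input and output token counts."""
    p_in, p_out = PRICES_PER_M[model]
    return (input_tokens * p_in + output_tokens * p_out) / 1_000_000

# A 2,000-token prompt with a 500-token answer across the tiers:
for model in PRICES_PER_M:
    print(f"{model}: ${request_cost(model, 2000, 500):.4f}")
```

For that single request, the spread runs from $0.035 on GPT-4 Turbo down to fractions of a cent on the small open models, which is exactly the economics behind the "good-enough" third position.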

Clearly, depending on the specific use case and business model, alternatives to the default OpenAI model should be explored. Generally, there's a perception that the “AI industry is experiencing a pivotal shift as inference costs plummet”, with highly competitive pricing for GPT-3.5 class models. This segment has become commoditized. In such a market, with a clear leader (and only a few close followers) and a large group of second-tier providers, only a few “can actually make money off these models”:

  • Firms with unique distribution due to direct customer access via full software as a service
  • Firms that offer fully supported service of training or fine-tuning on proprietary data
  • Firms that can provide high level peace of mind (data protection, legal shielding)


AI (-Inside) Products Landscape

The makers of these Large Language Models, first and foremost OpenAI, have ushered in a new era of products and services. Over this last year, we've seen an explosion in end-user products that leverage this technology. I would classify these products into two main categories: those that offer AI-enhanced features for existing products, and those where AI is the primary offering.

In the case of existing products, AI-enhanced features provide value when they introduce new capabilities that align with the core offering or enhance existing functionality. Notion serves as a good example of this. However, it is worth noting that many existing products simply add AI features to capitalize on the current hype surrounding AI.

The second category comprises the new end-user products that are primarily built on and around generative AI. This category represents the multitude of AI tools that have popped up for specialized use cases. Examples include the countless AI copywriters. These products(*) often merely abstract technical settings behind use-case-specific business logic that guides and restricts users, all while providing a user-friendly interface. Technologically, these products can be explained quite easily as a combination of:

  • typically a standard LLM (e.g., GPT-3.5, GPT-4 for pricier offerings, Claude, or increasingly open-source models),
  • standard API calls for model interaction,
  • system instructions guiding the model's behavior (this would be the ‘secret sauce’),
  • hyperparameters adjusting text generation,
  • and finally a user-friendly interface that hides all the above ‘magic’.
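The combination above can be sketched in a few lines. This is a minimal illustration, not any particular product's implementation: the model name, system instruction, and parameter values are my own placeholder assumptions, using the OpenAI-style chat message format:

```python
# Minimal sketch of how such a product's 'magic' reduces to an API payload:
# a system instruction plus hyperparameters wrapped around the user's input.
def build_chat_request(user_input: str) -> dict:
    system_instruction = (  # the use-case-specific 'secret sauce'
        "You are an experienced copywriter. Write punchy, benefit-led "
        "product descriptions of at most 80 words."
    )
    return {
        "model": "gpt-3.5-turbo",  # typically a standard LLM
        "messages": [
            {"role": "system", "content": system_instruction},
            {"role": "user", "content": user_input},
        ],
        "temperature": 0.7,   # hyperparameters adjusting text generation
        "max_tokens": 200,
    }

request = build_chat_request("A reusable steel water bottle, 0.75 l.")
```

The user-friendly interface then simply collects `user_input` and renders the response; everything else stays hidden.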

(*): There certainly are noteworthy exceptions: products built on GenAI that are far more than that. One such example that I use frequently is Perplexity, which combines an LLM in the role of a "reasoning engine" with a search index as the "knowledge engine" to create their "answer engine" (excellent interview on Exponential View).

Shifting our focus, what do the rapid advances in AI capabilities mean for the way we work?


Future of Knowledge Work

Early 2023 saw economy-level estimates predicting that generative AI could significantly impact work, particularly white-collar jobs. For instance, Goldman Sachs estimated an AI automation potential of 25-50% across nearly two-thirds of US occupations. In April 2023, initial studies highlighted GenAI's potential to boost productivity in office work. Specifically, LLMs demonstrated a skill-leveling effect for certain call center tasks, disproportionately benefiting lower-skilled workers and raising average skill levels overall (see my visualization attempt). Although many office jobs differ from call center work, and it was only a smaller study, this early evidence remains noteworthy.

So is AI really such a big deal for the future of work? In September 2023, Ethan Mollick answered: “We have a new paper that strongly suggests the answer is YES". As a co-author, he was referring to their very robust study with BCG consultants that confirmed early productivity impact indications, and provided some insights on integrating human and AI capabilities. Based on this study, I wrote about my personal takeaways for approaching and using LLMs.

Across all these studies, Andrew Ng's insight is key: focus on automating tasks, not entire jobs. As he highlights, the strategy is to discern which tasks within a job's spectrum are suitable for automation. While businesses consistently pursue improvement, GenAI introduces a significant leap in capabilities. It's not solely about core tasks: automating the supporting tasks can also lead to marked efficiency improvements.

Interestingly, a similar yet distinct leveling dynamic can be observed in the entertainment industry: Amateur creators benefit, while mid-tier artists struggle against the sheer volume enabled by AI. However, top stars leverage the technology to extend their reach even further. (The Economist provides a good read on this: Now AI can write, sing and act …).

Given the various studies (and several others not mentioned), I believe we should all be motivated to engage with generative AI and its potential. So let’s get more practical.


Working With LLMs

In working with Large Language Models, remember that perfection isn't attainable by merely handing off tasks. There are different ways to think about this - and the most important message might actually be that we need to actively and consciously think about it.

According to MIT's Thomas Malone (in a 2019 article, so in a time of Machine Learning before Generative AI), “it’s more useful to think of AI in terms of humans and computers complementing one another within the context of smart groups”. During these collaborations, AI can assume various roles: tools, assistants, peers, or managers. In an MIT course on AI, Professor Malone elaborates on these roles:

  • Tool: AI serves as a tool when it carries out tasks assigned by humans, who generally oversee its progress.
  • Assistant: As an assistant, AI demonstrates greater autonomy and initiative than a tool. It can help formulate and solve problems independently, often in real-time. This role involves AI taking on more responsibility, such as diagnosing issues or responding to inquiries with minimal human intervention.
  • Peer: In the peer role, AI performs tasks comparable to those of humans, sometimes even resolving entire cases independently. However, there are still situations where human intervention is necessary, particularly for complex or atypical scenarios. This role highlights AI's ability to complement human efforts in various domains.
  • Manager: In the managerial role, AI manages workflow and task assignment within organizations. This role encompasses evaluating employees and offering constructive feedback to enhance their performance. By providing real-time analysis and guidance, AI can help improve interactions and outcomes in various settings.

I consider this a reasonably useful way of thinking about working with LLMs, as it can guide the way we interact and what we expect to receive.

Microsoft’s analogy of the “Copilot” in everything is instantly appealing. But obviously, the analogy is imperfect at best, and at times even misleading - at least at the current state of technology. Yes, it is correct that the ultimate authority and accountability lies with the captain on a plane rather than the copilot (so the human user rather than the AI). But as the captain, I don’t think today’s AIs are ready to take “your controls” and handle the entire flight and landing on their own.

I am partial to Reid Hoffman’s notion of seeing GenAI as “human amplification” (e.g., on his “Possible” podcast series), with the key point that we as humans remain in the driving seat. A good example of this is the process of writing, where LLMs can play an equalizing part, allowing people who struggle with writing to overcome this. Think about it this way: the writing process can be broken down into input (or prompt formulation), text generation, and editing. While LLMs excel at rapidly generating text, humans still provide the creative spark through interesting inputs in the prompts and then apply judgment in editing. This allows more people to contribute their ideas by using AI for the text generation step. In that sense, LLMs can act as an equalizer, amplifying input from a broader range of thinkers.


LLMs Are Not Like Software

Unlike deterministic software, LLMs represent a paradigm shift that needs to be experienced firsthand. Tellingly, ongoing empirical research aims to understand LLMs' behaviors, capabilities, and limits (see, e.g., EmotionPrompt, or that just adding a single sentence makes Claude overcome its reluctance to answer). It is remarkable that LLMs exhibit emergent behaviors not readily apparent from analyzing their code.

So the software analogy does not help us understand LLMs. The most effective approach may be to anthropomorphize LLMs, treating their differing capabilities as distinct personalities rather than software functions. So to “treat AI as people [might be] pragmatically, the most effective way to use the AIs available to us today”, and sometimes encouragement can unlock an LLM's hidden potential (as this illustrative example shows). As such, I have found it useful to keep in mind what Ethan Mollick states in On-boarding your AI Intern:

They are weird, somewhat alien interns that work infinitely fast and sometimes lie to make you happy (..) Just like any new worker, you are going to have to learn its strengths and weaknesses; you are going to have to learn to train and work with it; and you are going to have to get a sense of where it is useful and where it is just annoying.

The key takeaway: firsthand experimentation - grounded in a conceptual understanding of the technology [^3] - is essential to develop an intuitive grasp of LLMs' capabilities, limitations, and therefore potential business applications.


Prompt Crafting As The Enabling Technique

I address prompt crafting from a business rather than technology perspective because, the way I see it, it is part of the business-driven interface to the technology. To create value, we need to approach the technology from a business-needs perspective - technology as the enabler rather than the driver (avoiding 'tech in search of a problem'). And if LLMs are to become our new paradigm for interacting with information, then by extension natural language is the user interface.

There was a time through much of the first half of 2023 when LinkedIn was full of listicles of 1-line prompts with headlines in the style of 'XX prompts to save you YYY hours' that received hundreds or even thousands of likes. Luckily, as early as March or April I had discovered Dave Birss’s CREATE prompting formula as a kind of antidote:

  • Character (what role to play)
  • Request (specific, clear instructions with enough context)
  • Examples (optional)
  • Adjustments? (e.g., don't use bullet points)
  • Type of output (exactly what you want, e.g., 300 words)
  • Extras (e.g., ask questions, work step by step)

This formula was instructive for me to quickly gain more useful ChatGPT responses than from those simplistic 1-liners. I'm glad these simplistic prompt listicles now seem to have all but disappeared.

Even though “AI prompt engineering isn’t the future”, it is, at least currently, still a skill that differentiates the value you can get from using GenAI tools. I’ve reflected before on the difference the quality of a prompt can make and have since continued to experiment with prompt crafting and to learn from others’ insights. Clearly, there are "strategies and tactics for getting better results from large language models", and this still applies to GPT-4. So how you approach and then craft your prompts does matter. From OpenAI's guide, I find that the strategy of “write clear instructions” in particular always applies. In a sense, you as the human actually need to make the effort of clearly articulating your thoughts before you can expect even an AI to understand. The tactics listed in the guide are a good way to achieve this:

  • “Include details in your query to get more relevant answers” -> This helps reduce the randomness of responses.
  • “Ask the model to adopt a persona” -> This is something I have long been employing and is potentially the most effective way of guiding the model towards your expectations.
  • “Use delimiters to clearly indicate distinct parts of the input” -> This is also a helpful way of structuring longer prompts for better readability.
  • “Specify the steps required to complete a task” -> This is also helpful to make your own thinking process explicit - and by having to make it explicit, it might also improve the quality of thinking.
  • “Provide examples” -> I personally don’t use examples too often but have found them helpful for certain tasks.
  • “Specify the desired length of the output” -> Additionally, specify any other expectations you have for the output, such as formatting.
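Several of these tactics can be combined in a single reusable prompt template. The sketch below illustrates the persona, delimiter, and output-length tactics together; the wording and the triple-quote delimiter are common conventions I chose for illustration, not something prescribed by the guide:

```python
# Combine three tactics from the guide: adopt a persona, delimit the input
# text, and specify the desired output length.
def summarize_prompt(text: str, persona: str = "an experienced editor",
                     max_words: int = 100) -> str:
    return (
        f"You are {persona}.\n"                         # adopt a persona
        f"Summarize the text delimited by triple quotes "
        f"in at most {max_words} words.\n"              # specify output length
        f'"""{text}"""'                                 # delimit the input
    )

p = summarize_prompt("Some long article text.", max_words=50)
```

Delimiting matters most with longer inputs, where the model might otherwise confuse your instructions with the text it is supposed to operate on.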

There are many more points that can be considered in crafting prompts. In the end, it is down to each one of us to experiment what works best for our own needs.


Getting Hands-On

Because GenAI was launched as a consumer service rather than first as an enterprise product, it remains first and foremost a personal productivity tool. With that, the 'burden' of learning about its use cases and practical application also falls foremost on each one of us.


Exploration and How I Think About Use Cases

In the rapidly evolving landscape of GenAI capabilities, hands-on experimentation is not just beneficial, it's imperative for understanding its potential use. As Nathan Warren points out, the best way to understand these models really is to personally spend time experimenting. Or, as Ethan Mollick emphasizes, only “using AI will teach you how to use AI” in your specific domain, and you need to be “using AI a lot until you figure out what it is good and bad at”.

I would argue that experimenting with and using GenAI needs to happen in two different modes:

  • A general, maybe more playful, explorative ‘trying out things’ to get an intuitive understanding of the jagged capabilities frontier. And as these capabilities are constantly evolving, our explorations should also be an ongoing practice.
  • A practical, use-case driven application with the intention of generating added value, either as efficiency gains or as improved abilities.

As an example of explorative learning, I noticed that while GPT-4 Turbo was overall good at helping me craft sophisticated prompts, it did get confused about the interaction of the different roles. For example, when asking GPT-4 to write a prompt for an interaction between an AI assistant and a user, the part on ‘who asks whom’ got rather confusing, ending up in the AI telling the human to perform all kinds of writing tasks. So based on this learning, I then adapted my approach to roles-based use cases.

Another practical takeaway is that I often reset chats. The main reason is that the full conversation is passed back to the LLM as part of its context window with every turn. This has two practical effects: first, it increases the cost of inference (by operating on far more tokens); and second, at least sometimes, the LLM may begin to lose focus on the task at hand.
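The cost effect of resending history is easy to quantify. The sketch below uses a crude per-turn token count purely for illustration, but the shape of the result holds: billed input tokens grow roughly quadratically with conversation length:

```python
# Why resetting chats helps: in API-based chat, the full history is resent
# with every turn, so each message is billed again on every later turn.
def cumulative_input_tokens(turn_lengths: list[int]) -> int:
    """Total input tokens billed across a conversation, if each new turn
    resends all previous messages as context."""
    total = 0
    history = 0
    for length in turn_lengths:
        history += length   # history grows with every message...
        total += history    # ...and the whole history is billed each turn
    return total

# Five turns of ~200 tokens each: 200+400+600+800+1000 = 3000 input tokens,
# versus 1000 tokens total if each turn had started a fresh chat.
print(cumulative_input_tokens([200] * 5))
```

A reset effectively sets `history` back to zero, which is exactly the saving (and the loss of context) you trade off.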

Based on the understanding gained through explorative learning, we can then approach practical use cases. When considering potential use cases, I ask myself a few key questions:

  • Does it offer efficiency gains, such as increased throughput or at least a reduction in boring tasks?
  • Can it fill a capability gap, such as mastering a complicated formula syntax, doing something at a higher quality like proofreading, or scaling tasks that I otherwise simply wouldn’t do? This aligns well with the notion of “human amplification”.
  • Is GenAI the appropriate tool, given its strengths and limitations? Regular exploration and learning are essential to stay aware of GenAI’s shifting jagged capability frontier - a point worth repeating.
  • Does GenAI offer value for the time and money invested? A positive response to the efficiency and capability amplification questions suggests yes, but I view this as a distinct consideration.

To address my use cases, I find it helpful to think in terms of Jobs to be Done, and then to think about what professionals I would like to have on my team to help me achieve the different parts or tasks of this job. Based on this, I view GenAI as empowering my multidisciplinary team, with each professional represented by an AI Persona who will be entrusted with certain tasks. And if a task is defined generically enough, a single persona can be hired to contribute in several jobs.


Use Cases Within Corporate Role

In corporate environments, playful exploration and practical applications face valid constraints. However, they remain possible and essential, with two primary considerations:

  • Limited freedom to choose and use tools or services based on interest or potential.
  • Necessity to avoid exposing internal data.

Recently, at least within my corporate setting, the second caveat has been mitigated: just short of the one-year anniversary of ChatGPT’s public debut, an internal SecureGPT (GPT-3.5 based, chat interface) was introduced. This development opens up many new use cases that I will consider exploring.

Earlier in 2023, I discovered a few practical use cases that were value-adding to me – here are two illustrative examples:

Bypass Steep Learning Curve of a New Software Syntax

In May 2023, I was setting up the data preprocessing in an ETL scenario (Extract, Transform, Load) and opted for Power Query in Excel for various reasons. Despite my familiarity with its potential, I had never used it before. Instead of digging through countless documentation pages and forum entries to learn the specific formula syntax of the Power Query Editor, I turned to Bing, which was powered by an early GPT-4 version with internet access. I described my goals conceptually and had Bing generate the syntax for me. Although not every formula worked perfectly at first, Bing proved to be a capable debugging assistant. This approach enabled me to accomplish in a few hours what would have otherwise taken me several days of learning and struggling.

The ability to simply write prompts detailing what you want to achieve, and then having AI provide the specifics, is a real enhancement and allows you to focus on the value-creating part of a job. The LLM (Bing, on my iPad, in this case), closed a capability gap for me.

Thinking Partner

GenAI can serve as a thinking partner to iteratively work through a problem. AI in this context takes on more the role of a peer rather than of a tool. An example for this usage was thinking about a potential collaboration contract. I had a few high-level ideas and some alternative approaches I wanted to think through. What worked for me was the following approach:

  1. Provide sufficient context to ground and constrain the AI, obviously being mindful of any confidential information.
  2. Articulate the idea you wish to explore and request the AI to outline its implications for the contract. Iterate on this until you've thoroughly explored this idea.
  3. Repeat the process with your other ideas.
  4. Select the two or three most promising fleshed-out ideas and ask the AI to synthesize them into a new solution.

In conclusion, the AI was not able to do anything I wasn’t capable of myself, and more importantly, it didn’t just hand me a perfect solution. The point here is that AI facilitated my own thinking, letting me efficiently explore different ideas while keeping me engaged with sometimes novel or unexpected contributions. This was similar to asking a group of colleagues to drop whatever they were doing and attend to my need of collaboratively thinking through my topic.


Personal Pursuit Use Cases

Unconstrained by a corporate setting in my personal pursuits, I have been using the power of LLMs in a value-adding manner for various use-cases throughout much of 2023.

The absence of constraints, however, implies the lack of common built-in safeguards found in enterprise settings. Consequently, as a user you must adopt a more informed approach to data protection and consent, considering your specific use cases. In practice, this actually means reading the terms of use for any service you subscribe to (you may be surprised by what you accept as legitimate data use) and determining if you trust the provider to adhere to these terms.

In short, the business realities still apply in the shiny new world of GenAI: there is ‘no free lunch’ and either you pay for the product or else you are the product.

My Multidisciplinary AI Team

Within the context of my personal pursuits, I have built a multidisciplinary team which allows us together to produce better results than any one of us on our own [^4]. As a practical illustration, here is a snapshot of my current team:

  • A Prompt Engineer who can help me develop prompts based on my objectives or from my input, or can help me improve and refine my prompts.
  • A Ghostwriter to write coherent text from rough notes (directly or via an outline first), rework draft text to achieve some intention, or to revise text - all while trying to emulate my writing style.
  • An Investment Analyst to make a first pass at extracting key information from startup pitch-decks into a standard structure.
  • A Software Utility - yes, a slightly more abstract persona - that can, for example, extract and structure appointment information for a calendar entry from a conversation, or convert a software’s raw JSON export into human-readable and presentable format.

These AI personas have proven to be valuable, time-saving team members in some of my personal pursuits.
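In practice, the 'persona library' idea boils down to reusable system instructions selected per task. The sketch below mirrors the team described above, though the instruction wording is my own paraphrase, not the prompts I actually use:

```python
# A persona library: each persona is a reusable system instruction,
# paired with a concrete task at call time.
PERSONAS = {
    "prompt_engineer": (
        "You are an expert prompt engineer. Help the user develop, "
        "improve and refine prompts based on their objectives."
    ),
    "ghostwriter": (
        "You are a ghostwriter. Turn rough notes into coherent text, "
        "emulating the user's writing style."
    ),
    "investment_analyst": (
        "You are an investment analyst. Extract key information from "
        "startup pitch decks into a standard structure."
    ),
    "software_utility": (
        "You are a software utility. Extract and structure data, e.g. "
        "appointment details or raw JSON, into human-readable formats."
    ),
}

def messages_for(persona: str, task: str) -> list[dict]:
    """Pair a persona's system instruction with a concrete task."""
    return [
        {"role": "system", "content": PERSONAS[persona]},
        {"role": "user", "content": task},
    ]

msgs = messages_for("ghostwriter",
                    "Rework these notes into a short intro: ...")
```

Tools like TypingMind (discussed below) essentially provide this library plus a chat interface, so you rarely need to write such code yourself.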

And then obviously there are many more prompts that I might use for singular tasks, less linked to a persona and used in more diverse settings. An example could be generating a quick summary of a text. Over time, though, I’ve noticed that I start to apply these prompts in more specialized settings, generally also with better results. At that point I have often created a persona that encapsulates many of the expectations towards the output, so that the prompt itself can stay simpler. In the example of the summary prompt, this might be used by a journalist or an academic, and result in a distinct output.

How I Use LLMs Practically

Over this past year, my approach to using large language models (LLMs) has evolved significantly. Initially, I, like probably almost everyone, started off simply using the ChatGPT chat interface. As my explorations expanded, I started experimenting with both ChatGPT and Claude, selecting the best tool for each specific task, or switching when hitting usage limits.

The next phase in my LLM journey involved transitioning to paid API-based usage. There are two primary options: the all-in-one, closed system, as offered by OpenAI, and more open systems that provide access to diverse models from various model makers.

OpenAI offers several alternatives beyond the simple chat interface. For instance, the ChatGPT Pro version grants access to the latest model and features like GPTs. Another useful method is accessing models through an API on a pre-paid, pay-per-use basis. Additionally, the OpenAI Playground provides a testing environment to experiment with different models, system instructions, and hyperparameters for conversations. This is an effective way to understand how your application would interact with the API, though it's worth noting that using the playground consumes your prepaid credit (ha!). The key takeaway is that, depending on your use case, this is an easy way to experiment and learn about the significant potential that lies behind steering model outputs using advanced methods like priming with system instructions and adjusting hyperparameters. This approach can also serve as an alternative way to access GPT-4 or GPT-4 Turbo without subscribing to Plus (though Bing would be a free alternative).

Personally, I've gravitated towards more open systems. During my exploration of applications, I've come to appreciate TypingMind, which offers three key benefits. First, it provides API-based access to a wide array of models, both proprietary and open source, without having to deal with the API from a technical perspective. Second, it offers a library for saving characters (essentially the system instructions for the AI persona) and prompts. Lastly, the interface allows for efficient model selection and persona utilization, along with the ability to choose prompts from the library or input new ones during interactions. This proves to be an invaluable interface for leveraging various LLMs in a chat mode while supporting efficient reusability of proven prompts.

On a side note, since OpenAI introduced GPTs last November, I regularly reconsider whether I should switch back to the OpenAI world and subscribe to ChatGPT Plus, primarily for the integration ecosystem it is becoming. Shifting towards a workflow-focused approach, as opposed to the current task-centric one, will likely be the next 'evolutionary' step in my GenAI journey. [^Cent]


Practical Advice for Leveraging LLMs

As we wrap up our exploration of the business implications and practical uses of large language models, I am convinced that now is the moment to adopt a pragmatic, hands-on approach to harnessing these high-potential tools.

To support your efforts, in case you have not yet spent the necessary hours of explorative learning, here is a bit of orientation that should not only serve as actionable advice but also demonstrate the practical use of LLMs in a real-world scenario, with explanations in the 'Behind the Scenes' section. And if you have spent the tens of hours necessary to gain this intuitive understanding, I would love to hear your thoughts or additions to this practical advice.


Practical Advice for Leveraging Large Language Models (LLMs)

  1. Embrace the probabilistic nature of LLMs: Understand that LLMs generate output based on patterns learned from vast amounts of text. While the results may sound convincing, they might not always be accurate or true. Recognize that LLMs are best suited for specific tasks and not a one-size-fits-all solution.
  2. Experiment to discover capabilities and limitations: Engage in regular experimentation with various LLM tools[^5] to stay updated on the rapidly evolving landscape. This hands-on approach will help you understand the current capabilities and limitations of LLMs, enabling you to make informed decisions about their application.
  3. Guide LLMs with practical wisdom, invest in working iteratively with them: Although LLMs possess extensive knowledge, they lack practical wisdom in applying it. Treat them as highly motivated interns who require explicit guidance and frequent feedback loops to achieve the desired results. Provide clear instructions and context to help LLMs generate useful and accurate output. It's an investment.
  4. Give LLMs a persona: Assign a persona to your LLM to provide a lens through which it can access its extensive knowledge more effectively.
  5. Provide sufficient context: Offer ample context, including reference text or examples, to ground and guide the LLM toward your expectations.
  6. Be clear and explicit: Make your requests unambiguous to avoid undesired results. LLMs should not have to guess your intentions.
  7. Specify output format: Indicate the desired output format, such as detailed explanations, comparison tables, bullet-point lists, or specific technical formats, to ensure the response aligns with your needs.

By following these practical tips, you can collaborate effectively with LLMs and harness their power to generate valuable insights and solutions. Remember, a hands-on approach is essential for success in this rapidly evolving field.
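To make tips 4 through 7 concrete, here is a minimal sketch of how persona, context, explicit instructions, and output format can be combined into a single chat-style request. The function name, persona text, and example content are my own illustrative choices, not taken from the article; the system/user message structure follows the common chat-API convention.

```python
# Sketch: structuring a prompt with persona, context, explicit task, and output format.
# Persona and task text below are illustrative placeholders.

def build_messages(persona: str, context: str, task: str, output_format: str) -> list[dict]:
    """Assemble a chat-style request that applies tips 4-7 above."""
    user = (
        f"Context:\n{context}\n\n"          # tip 5: ground with reference material
        f"Task: {task}\n\n"                 # tip 6: be clear and explicit
        f"Output format: {output_format}"   # tip 7: specify the desired format
    )
    return [
        {"role": "system", "content": persona},  # tip 4: give the LLM a persona
        {"role": "user", "content": user},
    ]

messages = build_messages(
    persona="You are a seasoned business editor who writes concise, plain-English prose.",
    context="Draft notes: LLM pricing is per token; input and output tokens are billed at different rates.",
    task="Rewrite the notes into two clear sentences for a non-technical executive.",
    output_format="Plain text, no bullet points, at most 40 words.",
)

for m in messages:
    print(m["role"].upper(), "->", m["content"][:60])
```

The resulting list can be passed to most chat-completion endpoints; the point is that each of the four tips maps to a distinct, explicit part of the request rather than being left for the model to guess.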


Behind the Scenes of Writing this Advice

How I (with AI-support) generated this practical advice section:

  1. I used an AI tool (the second product category in Market Dynamics) that both transcribed my slightly rambling thoughts (speech-to-text) and then rewrote the transcript into a coherent, succinct text. I then added a few points I wanted to include.
  2. Next, I fed the text to my Ghostwriter persona, employing a prompt to transform the content into a list of practical recommendations.
  3. Lastly, I conducted a final quality check (as should always be done!), this time finding no need for any further editing.
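The three-step workflow above can be sketched as a simple pipeline. The function names and their trivial string transformations are hypothetical placeholders standing in for the actual speech-to-text and persona-driven rewriting tools; only the shape of the flow, with the human quality check as the explicit final step, reflects the process described.

```python
# Hypothetical sketch of the three-step drafting pipeline described above.
# transcribe_and_clean() and rewrite_as_advice() stand in for real AI tools;
# here they are trivial placeholders so the flow itself is runnable.

def transcribe_and_clean(audio_notes: str) -> str:
    # Step 1: speech-to-text plus an automatic rewrite into coherent text (placeholder).
    return audio_notes.strip()

def rewrite_as_advice(text: str, persona: str = "Ghostwriter") -> str:
    # Step 2: a persona-guided LLM turns the text into practical recommendations (placeholder).
    return f"[{persona}] Practical advice: {text}"

def human_quality_check(draft: str) -> str:
    # Step 3: the final review always stays with the human author.
    return draft  # in this case, no edits were needed

draft = transcribe_and_clean("  embrace the probabilistic nature of LLMs  ")
advice = rewrite_as_advice(draft)
final = human_quality_check(advice)
print(final)
```

The design point is the separation of stages: each AI step can be swapped for a different tool, while the human check remains a fixed gate at the end.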

For the Ghostwriter task I used Mistral's Medium model, with a total cost of $0.00662 for 1,062 input and 460 output tokens.
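As a sanity check on such figures, per-token API cost is simply the token counts multiplied by the per-million-token rates. The rates below are illustrative assumptions close to Mistral's published Medium pricing at the time, not figures quoted in the article:

```python
# Sketch: computing LLM API cost from token counts and per-million-token rates.
# The rates used here are illustrative assumptions, not quoted from the article.

def api_cost(input_tokens: int, output_tokens: int,
             usd_per_m_input: float, usd_per_m_output: float) -> float:
    """Return the total cost in USD for one API call."""
    return (input_tokens * usd_per_m_input
            + output_tokens * usd_per_m_output) / 1_000_000

# The Ghostwriter call above: 1,062 input and 460 output tokens.
cost = api_cost(1_062, 460, usd_per_m_input=2.7, usd_per_m_output=8.1)
print(f"${cost:.5f}")  # on the order of two-thirds of a cent
```

Note how output tokens are typically billed at a higher rate than input tokens, which is why long generations dominate the bill even when the prompt is large.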


For the third and final article in my series, I'll be transitioning our focus from the practical applications we've discussed, to a broader, societal perspective on the implications of these developments.



Endnotes:

Note: Throughout the writing process, I have utilized LLMs to varying degrees, though any significant contributions are explicitly noted.

[^0]: I used GPT-4 Turbo to craft an image description through several iterations, using the summarized article (see [^1]) as the basis in the initial prompt. Also, I significantly increased the temperature setting compared to, for example, content editing tasks. Once satisfied with the prompt, I then instructed GPT-4 Turbo to generate the image, triggering the DALL-E 3 plugin responsible for image creation, with:

Generate a 16:9 landscape-oriented image depicting a chessboard with uniquely designed chess pieces of varying sizes to symbolize different AI models. The backdrop should feature a digital sunrise that signifies the rise of AI in knowledge work. In the foreground, a human hand should be fine-tuning a gear that seamlessly integrates into a translucent brain composed of network mesh, representing the strategic development and iterative process of enhancing large language models.        

Total cost for the entire process, using a series of models (Mistral’s Medium for summarizing the extensive article, then GPT-4 Turbo to craft an essence-capturing visual description, and DALL-E 3 to actually generate the image), amounted to $0.12 and took about 10 minutes of thinking and instructing.

[^1]: Article summarized by Mistral’s Medium; unedited.

[^Cent]: Written in a ‘Centaur’ mode: I provide my notes, then my AI ‘Ghostwriter’ persona drafts these rough notes into a coherent text, and finally I do the quality control and necessary edits myself.

[^2]: Google’s Gemini Pro (preview) with a 131k context at $0.25 / $0.50 per 1M tokens would be in this tier 1 class, but I have not included it for two reasons: first, it is offered as a loss-leader during this preview phase and not for production use; second, in my informal, non-systematic testing it does not perform at a tier 1 level, nor does it compare favorably with some lower-tier models.

[^3]: I tried to provide a primer on this conceptual understanding of technology that I advocate in my first article of this series.

[^4]: Yes, I admit that this might be my decades of managing and leading teams in a professional setting shining through ;-) - I firmly believe a team of strong contributors will achieve more than a single star.

[^5]: To get started, I would recommend exploring, beyond the obvious OpenAI ChatGPT, Microsoft's Bing for free access to GPT-4 (though with somewhat modified behavior), and Anthropic's Claude (if accessible) to evaluate its text generation capabilities. By experimenting with these tools, you not only discover the best fit for your needs but also develop a more intuitive understanding of each tool's unique features. This hands-on experience enables you to make informed decisions and leverage AI effectively in your content creation processes.

Piotr Malicki

NSV Mastermind | Enthusiast AI & ML | Architect AI & ML | Architect Solutions AI & ML | AIOps / MLOps / DataOps Dev | Innovator MLOps & DataOps | NLP Aficionado | Unlocking the Power of AI for a Brighter Future

8 months ago

Great insights on the future of AI and its applications in business! #generativeAI #Business savvy
Antti Ekström

Senior Marketing Automation Specialist | Marketing Consultant

8 months ago

Great insights into the business implications of large language models! The future of knowledge work looks promising with generative AI.
Jürg Stuker

START Global | Kickstart Innovation

8 months ago

Thank you for many pointers to be explored. What I particularly like is the view on cost in relation to size/mightiness. Very soon the suppliers could compete on cost efficiency to build and run models that are 'fit for the task'.

要查看或添加评论,请登录

社区洞察

其他会员也浏览了