Mastering ChatGPT & Large Language Models: Tips and Tricks for Using, Understanding and Engineering Your preferred conversational agent

Hi folks! I guess you’ve all heard about ChatGPT and generative models in the past few weeks.

But first, let me reassure you: this post does not make hyperbolic predictions or statements about generative AI, and it was not written with the help of ChatGPT to showcase its capabilities.

Rather, it is a pragmatic one, sharing some insights and resources on how to best use, understand and eventually customize so-called large language models (LLMs) and next-generation chatbots.

Indeed, I believe this technology will progressively spread through our digital landscape and will become important to master, just like spreadsheets, slide decks, social networks, etc. In this context, it is crucial to make sure that everybody gets basic training to use it correctly. Then, we also need to make sure that in every organization there are enough practitioners who understand the theory behind it. And finally, we will also need at least a small number of LLM experts who know how to customize or fine-tune LLMs for specific use cases.

While the best way to start learning ChatGPT is simply… using it, I also believe that to get better at it, you need a slightly deeper understanding of a few basic concepts, and a few resources about them: prompting, zero-shot learning, one-shot learning, few-shot learning and chain-of-thought:

  • Prompting is simply the process of providing an input or a phrase to the model, which is then used to generate a response or a continuation of the input. In other words, it’s providing the model with the relevant context or direction, so that it can generate a consistent, coherent, and relevant output. It can be a few words or a longer sentence. While the maximum prompt size for ChatGPT is not publicly documented, OpenAI documents that GPT-3.5 (on which ChatGPT is based) has a maximum request size of 4000 tokens (with a token roughly equivalent to 4 characters in English). I definitely recommend reading the OpenAI documentation for designing efficient prompts, the Awesome ChatGPT Prompts GitHub repository that provides very cool and creative prompt examples, and the accompanying free e-book The Art of ChatGPT Prompting: A Guide to Crafting Clear and Effective Prompts.
  • Zero-shot/one-shot/few-shot learning: sometimes, as with a human, giving a task to ChatGPT (or any LLM) without instructions or examples confuses it, and it may not provide relevant answers or outcomes. Zero-shot is when the model predicts the answer without any example, for instance with the following prompt: “Translate English to French: cheese =>”. One-shot is when, in addition to the task description, the model sees a single example of the task: “Translate English to French: sea otter => loutre de mer, cheese =>”. And few-shot is when multiple examples are provided. Of course, LLMs tend to provide better results when you give them some examples as input. It is important to note that, contrary to fine-tuning which will be discussed later, one-shot or few-shot learning involves no model parameter update. You might not realize it, but in the history of AI it is a fantastic breakthrough to have models able to perform multiple tasks without being fine-tuned!
  • Chain-of-Thought prompting (CoT): LLMs, and more specifically ChatGPT, sometimes tend to hallucinate; that is, they simply invent facts or provide wrong answers. Interestingly enough, it has been demonstrated that if you craft your prompt in a way that explicitly asks the model to explain its “reasoning” step by step instead of providing a plain answer, the result will be more accurate. So, instead of simply asking a question, you can put “Let’s think step by step.” as a prefix. For example, asking ChatGPT: “Let's think step by step. Is it possible to have two consecutive prime numbers beyond 2 and 3?” gives a much more convincing demonstration than if you simply ask “Is it possible to have two consecutive prime numbers beyond 2 and 3?”.
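To make these concepts concrete, here is a minimal sketch in plain Python (no API calls; the helper function, prompt strings and the 4-characters-per-token rule of thumb are illustrative, not an official format) showing how zero-shot, one-shot and chain-of-thought prompts can be assembled programmatically:

```python
def build_prompt(task, examples=None, query=None, chain_of_thought=False):
    """Assemble a prompt from a task description, optional worked
    examples (one-shot/few-shot) and the actual query (zero-shot
    when no examples are given)."""
    parts = []
    if chain_of_thought:
        # CoT: ask the model to reason step by step before answering.
        parts.append("Let's think step by step.")
    parts.append(task)
    for source, target in (examples or []):
        # Each example shows the input => output pattern once.
        parts.append(f"{source} => {target}")
    if query is not None:
        parts.append(f"{query} =>")
    return "\n".join(parts)

# Zero-shot: task description only, no examples.
zero_shot = build_prompt("Translate English to French:", query="cheese")

# One-shot: a single worked example before the query.
one_shot = build_prompt(
    "Translate English to French:",
    examples=[("sea otter", "loutre de mer")],
    query="cheese",
)

# Rough size check using the ~4 characters per token rule of thumb.
est_tokens = len(one_shot) // 4
```

Few-shot is simply the same call with more `(source, target)` pairs in `examples`, and `chain_of_thought=True` prepends the CoT prefix to any of these variants.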

By leveraging best practices in prompting, using one-shot or few-shot learning when it makes sense, and applying CoT prompting, you should become much more efficient at using ChatGPT and other LLMs to produce relevant outcomes.

From Using to Understanding large language models (LLMs)

Now, if you are an engineer and want to understand the theory a bit more and how these models work under the covers, you might be interested in the following resources:

Engineering your own LLM?

So far, you should have enough resources to efficiently use language models and understand the theory behind them. Now, you might be interested in deploying or fine-tuning these models for your specific use cases (for example, to incorporate knowledge bases or corpora of documents specific to your organization into a model).

While (as of today) OpenAI does not provide direct access to their models (you can only consume them through APIs and are limited to the interactions mentioned in the first section of this article), EleutherAI, a non-profit AI research lab, released a few open-source models based on the GPT architecture, such as GPT-J, GPT-Neo and GPT-NeoX. These models are available on HuggingFace, and the following GitHub repository explains how GPT-NeoX can be trained and fine-tuned with your specific datasets: EleutherAI/gpt-neox: An implementation of model parallel autoregressive transformers on GPUs, based on the DeepSpeed library. (github.com). While GPT-NeoX-20B was developed primarily for research purposes, you are allowed to further fine-tune and adapt GPT-NeoX-20B for deployment, as long as your use is in accordance with the Apache 2.0 license.
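As an illustration of what “your specific datasets” can look like in practice, the sketch below converts a list of raw documents into JSONL training records, one JSON object per line. This is a common input shape for LLM fine-tuning pipelines, but the exact field names and format are an assumption here; check the gpt-neox repository's documentation for the format it actually expects.

```python
import json

def to_jsonl(documents):
    """Serialize raw documents into JSONL: one {"text": ...} record
    per line. Empty or whitespace-only documents are dropped.
    The "text" field name is an assumption; adjust it to match your
    training framework's documented format."""
    lines = []
    for doc in documents:
        doc = doc.strip()
        if doc:  # skip empty documents
            lines.append(json.dumps({"text": doc}))
    return "\n".join(lines)

# A tiny hypothetical corpus of organization-specific documents.
corpus = [
    "Internal FAQ: how do I reset my password?",
    "   ",  # empty entries are dropped
    "Release notes for product X, version 1.2",
]
jsonl = to_jsonl(corpus)
```

The resulting file can then be pointed at by the training configuration, which is where dataset paths are declared in most fine-tuning setups.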

The BigScience initiative also introduced BLOOM, a multilingual LLM meant to be completely open source and customizable, published under a specific Responsible AI License (RAIL) that restricts inappropriate use cases. You should probably check whether your use cases fall into the restricted categories before using BLOOM; for example, biomedical, political, legal, and finance domains are considered out of scope. As with EleutherAI, the BLOOM model and its variants are available on HuggingFace: bigscience/bloom · Hugging Face. But, given the size of the model (176 billion parameters), you may not have the hardware or the budget necessary to fine-tune it.
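To put that hardware remark in perspective, here is a quick back-of-the-envelope calculation, assuming 2 bytes per parameter (fp16 precision):

```python
params = 176e9          # BLOOM: 176 billion parameters
bytes_per_param = 2     # fp16 precision: 2 bytes per parameter
weights_gb = params * bytes_per_param / 1e9

# Just storing the fp16 weights takes ~352 GB of memory, far beyond
# any single consumer GPU. Fine-tuning needs several times more, for
# gradients and optimizer states on top of the weights.
```

So even before counting gradients and optimizer states, serving or tuning a model of this size requires a multi-GPU cluster rather than workstation hardware.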


Conclusion

ChatGPT and the overall “LLM family” technologies will be increasingly used and deployed in the coming years. In this context it is of paramount importance to:

  1. Make sure users know how to best prompt and use language models.
  2. Make sure enough data science professionals know about the theory and the internal workings of these language models.
  3. Make sure organizations know how they can deploy and fine-tune language models for their specific needs if necessary.

While this article only scratched the surface of these areas, I hope you found it useful and that it will help you better use, understand and customize this exciting piece of technology!

Sébastien Brasseur