Understanding Prompt Engineering Hyperparameters for Enhanced Performance of LLMs

Last week, we explored how to create effective prompts to ensure the desired result from Large Language Models (LLMs). This week, as promised, we will dig deeper into understanding hyperparameters and how to adjust them to suit different scenarios.

In our previous discussion, I outlined several parameters: temperature, top p, top k, max tokens, stop sequence, frequency penalty, and presence penalty. We learned that temperature controls the degree of randomness in the language model's responses. In addition, temperature scales the logits inside the LLM's softmax function: a higher temperature (more entropy) produces a more uniform output distribution, while a lower temperature (less entropy) produces a sharper one, as shown in the GIF below.

Softmax Temperature | Harshit Sahrma
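To make this concrete, here is a minimal Python sketch (with made-up logit values, not real model outputs) of how dividing the logits by the temperature sharpens or flattens the softmax distribution:

```python
import numpy as np

def softmax_with_temperature(logits, temperature=1.0):
    """Apply softmax after dividing the logits by the temperature."""
    scaled = np.asarray(logits, dtype=float) / temperature
    scaled -= scaled.max()          # subtract max for numerical stability
    exp = np.exp(scaled)
    return exp / exp.sum()

logits = [2.0, 1.0, 0.5, 0.1]       # made-up logits for four candidate tokens
print(softmax_with_temperature(logits, temperature=0.2))  # sharper, near one-hot
print(softmax_with_temperature(logits, temperature=2.0))  # flatter, more uniform
```

At a temperature of 0.2, almost all the probability mass sits on the top token; at 2.0, it spreads out across the alternatives.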

Now that we have understood the concept of temperature, let's have a look at the other parameters.


1. Context Window - Managing the extent of input text the model can review:

The context window is like the working memory of the LLM, i.e., the input text it can analyze when generating a response. For example, if you provide an article to the LLM and want to summarize each paragraph (containing 60-70 tokens) into a single line, limiting the context window to around 60 tokens lets the model scan through the article paragraph by paragraph, without carrying over the context of the previous paragraph. Below are the context window capacities of two popular LLMs:

GPT-4: 128,000 tokens

Gemini: 1,000,000 tokens

Widening the context window comes with both advantages and disadvantages. On one hand, it improves the model's understanding and its ability to produce relevant text. On the other hand, it imposes a significant computational burden, requiring more processing power and time.
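Before sending a long input, it is worth checking how many tokens it occupies. Here is a minimal sketch using OpenAI's tiktoken tokenizer (assuming the package is installed; the article text is a placeholder):

```python
import tiktoken  # pip install tiktoken

def count_tokens(text, model="gpt-4"):
    """Count how many tokens the given model's tokenizer produces for the text."""
    encoding = tiktoken.encoding_for_model(model)
    return len(encoding.encode(text))

article = "Your article text goes here..."   # placeholder input
n = count_tokens(article)
print(f"{n} tokens; fits in a 128,000-token window: {n <= 128_000}")
```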


2. Max Tokens - Limiting the amount of output text generated:

The max tokens (or token limit) parameter regulates the number of output tokens the LLM produces. OpenAI estimates that one token equates to approximately four characters (token calculator: https://platform.openai.com/tokenizer).

For example, setting the max token limit to around 27-30 prompts the LLM to generate concise, one-line answers, which can be useful in some cases.
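As a rough sketch, this is how the limit could be passed through the OpenAI Chat Completions API (assuming the openai Python package v1+ is installed and OPENAI_API_KEY is set; the prompt is illustrative):

```python
from openai import OpenAI  # pip install openai; expects OPENAI_API_KEY in the environment

client = OpenAI()
response = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": "Summarize photosynthesis."}],  # illustrative prompt
    max_tokens=30,  # caps the reply at roughly one short sentence
)
print(response.choices[0].message.content)
```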


3. Top p, Top k - Controlling variation in token generation:

Temperature, top p, and top k all help control the randomness of the output, and it is crucial to adjust these values to achieve the desired result. While temperature reshapes the softmax distribution itself, top p and top k use different approaches to narrow down the pool of candidate tokens that the softmax function produces.

With top k, the LLM samples only from the k tokens with the highest probability. With top p (nucleus sampling), the LLM samples from the smallest set of top tokens whose cumulative probability is at least p.
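The following is a simplified Python sketch of both filters applied to a toy probability distribution (real LLMs apply this over tens of thousands of vocabulary tokens):

```python
import numpy as np

def sample_top_k_top_p(probs, k=None, p=None):
    """Sample one token index after top-k and/or top-p (nucleus) filtering."""
    probs = np.asarray(probs, dtype=float)
    order = np.argsort(probs)[::-1]              # token indices, most probable first
    sorted_probs = probs[order]

    keep = np.ones_like(sorted_probs, dtype=bool)
    if k is not None:
        keep[k:] = False                         # keep only the k most probable tokens
    if p is not None:
        cumulative = np.cumsum(sorted_probs)
        keep &= (cumulative - sorted_probs) < p  # keep tokens until cumulative prob reaches p

    filtered = np.where(keep, sorted_probs, 0.0)
    filtered /= filtered.sum()                   # renormalize the surviving tokens
    return np.random.choice(order, p=filtered)

probs = [0.5, 0.2, 0.15, 0.1, 0.05]              # toy next-token distribution
print(sample_top_k_top_p(probs, k=3))            # sample only from the top 3 tokens
print(sample_top_k_top_p(probs, p=0.9))          # sample from the smallest set covering 90%
```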


4. Stop sequence - Indicating when to stop:

The characters specified in the stop sequence tell the LLM where to stop generating text. For example, if you set the stop sequence to ".", the LLM will generate text only until it encounters the first "." character.
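A minimal sketch of passing a stop sequence through the OpenAI Chat Completions API (same assumptions as above; the prompt is illustrative):

```python
from openai import OpenAI

client = OpenAI()
response = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": "Explain what a token is."}],  # illustrative prompt
    stop=["."],  # generation halts as soon as the first "." would be produced
)
print(response.choices[0].message.content)
```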


5. Frequency and Presence penalty - Controlling token repetition:

The frequency penalty parameter helps the model avoid generating repetitive tokens, such as words or phrases. The more often a token has already appeared, the larger the penalty applied to it, making it progressively less likely to be generated again.

Unlike the frequency penalty, the presence penalty applies the same penalty to a repeated token regardless of how often it has appeared. This encourages the model to use tokens other than those already penalized.
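Both penalties are plain request parameters in the OpenAI Chat Completions API; the values below are illustrative, not recommendations:

```python
from openai import OpenAI

client = OpenAI()
response = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": "Write a short poem about the sea."}],  # illustrative prompt
    frequency_penalty=0.8,  # penalty grows with how often a token has already appeared
    presence_penalty=0.4,   # flat penalty on any token that has appeared at least once
)
print(response.choices[0].message.content)
```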


Personal Experience:

In my experience, if the goal is to generate responses in a scripted manner, similar to a classifier, setting the temperature to a lower value and constraining max tokens, top p, and top k can help mitigate the generation of irrelevant or fabricated data and reduce the cost of generating output tokens.

However, if the objective is to enhance the LLM's creativity and let it engage with customers more dynamically or with longer texts, increasing the temperature, along with tuning the frequency penalty and presence penalty, becomes equally valuable.

Example use cases with the gpt-3.5-turbo model:

For a customer service chatbot for a tennis club, I used the following parameters:

temperature=0.27,

max_tokens=70,

top_p=0.56,

frequency_penalty=0,

presence_penalty=0

For an AI that helps people through difficult times, I used the following parameters:

temperature=1,

max_tokens=105,

top_p=1,

frequency_penalty=0,

presence_penalty=0.15,
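Putting the first configuration together, here is a sketch of how these values would be passed in a single request (the system and user messages are placeholders, not the prompts actually used; the second use case only swaps in the other parameter values):

```python
from openai import OpenAI

client = OpenAI()
response = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[
        {"role": "system", "content": "You are a helpful assistant for a tennis club."},  # placeholder
        {"role": "user", "content": "What are your opening hours?"},                      # placeholder
    ],
    temperature=0.27,
    max_tokens=70,
    top_p=0.56,
    frequency_penalty=0,
    presence_penalty=0,
)
print(response.choices[0].message.content)
```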

You can adjust these values based on how the LLM responds to your inputs.


Conclusion:

In conclusion, understanding and adjusting these parameters is key to optimizing the performance of Large Language Models (LLMs) and unlocking their full potential. The parameters play critical roles in controlling the randomness of responses, managing input and output text lengths, and steering token variation toward specific objectives. Whether you are aiming for scripted responses or fostering creativity and dynamic engagement, selecting appropriate parameter values based on the model's responses is essential.

Try tweaking these parameters yourself:

ChatGPT playground: https://platform.openai.com/playground/

Gemini playground: https://huggingface.co/spaces/Roboflow/Gemini

Mistral AI playground: https://mistral-playground.azurewebsites.net/

In my upcoming article, we will discuss how to reduce the cost of using these LLMs in the best possible way. I hope you enjoyed the article. Feel free to leave a comment if it was useful to you or if you have anything to discuss.

Happy reading! For more such articles, subscribe to my newsletter: https://lnkd.in/guERC6Qw

I would love to connect with you on Twitter: @MahimaChhagani. Feel free to contact me via email at [email protected] for any inquiries or collaboration opportunities.
