DS Fortune Cookies: System Prompts

Scott McKean

Specialist Solution Architect - Data Science

Published: November 29, 2024

"Lucky numbers: 0, 1. Lucky words: Your system prompt."

One thing to understand about language models is that they work on plain text. I found this confusing when fine-tuning because most APIs now use a chat completion format. But under the hood, a chat template flattens all these messages and roles (e.g. system, user, assistant) into a single plain-text string, which the tokenizer then converts into token IDs. The model generates the next token based on that sequence of IDs. So once again, language models are just next-token prediction on plain text - we just dress them up using tools, chats, etc.

Here is how you would do this conversion with the Hugging Face transformers library.

from transformers import AutoTokenizer
tokenizer = AutoTokenizer.from_pretrained("mistralai/Mistral-7B-Instruct-v0.1")

chat = [
  {"role": "user", "content": "Hello, how are you?"},
  {"role": "assistant", "content": "I'm doing great. How can I help you today?"},
  {"role": "user", "content": "I'd like to show off how chat templating works!"},
]

tokenizer.apply_chat_template(chat, tokenize=False)

>>>

"<s>[INST] Hello, how are you? [/INST]I'm doing great. How can I help you today?</s> [INST] I'd like to show off how chat templating works! [/INST]"

Every model family has different special tokens (e.g. [INST]) and system prompt markers (e.g. <<SYS>>). This fortune cookie reviews these for three popular model families and how they can be used.
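To make the flattening in the Mistral example above concrete, here is a minimal pure-Python sketch that rebuilds the same string by hand. The function name render_mistral_style is hypothetical, and the rules are inferred from the output shown earlier rather than copied from the model's real chat template, so treat it as an illustration only.

```python
def render_mistral_style(messages, bos="<s>", eos="</s>"):
    """Flatten a list of chat messages into one Mistral-style prompt string."""
    parts = [bos]
    for message in messages:
        role, content = message["role"], message["content"]
        if role == "user":
            # User turns are wrapped in [INST] ... [/INST]
            parts.append(f"[INST] {content} [/INST]")
        elif role == "assistant":
            # Assistant turns are emitted as-is and closed with the end-of-sequence token
            parts.append(f"{content}{eos} ")
        else:
            raise ValueError(f"Unsupported role in this sketch: {role}")
    return "".join(parts)


chat = [
    {"role": "user", "content": "Hello, how are you?"},
    {"role": "assistant", "content": "I'm doing great. How can I help you today?"},
    {"role": "user", "content": "I'd like to show off how chat templating works!"},
]

prompt = render_mistral_style(chat)
print(prompt)
```

In practice you would still call tokenizer.apply_chat_template, since the real template handles edge cases this sketch ignores.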

GPT (OpenAI)

OpenAI uses the chat completion interface extensively and has a ‘system’ role for injecting system prompts. The rest of the conversation uses the user and assistant roles. GPT is proprietary and to my knowledge OpenAI hasn’t fully revealed its special tokens (correct me!) - some digging turns up names like <|bos|>, <|eos|>, <|unk|>, <|pad|>, <|sep|>, <|cls|>, <|mask|>, but treat these as unconfirmed. What OpenAI has made public is the open source tiktoken library, whose encodings include an <|endoftext|> special token. It is more important to know that you can prompt GPT models using the system role in the chat interface.

{"messages": [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello, who are you?"},
    {"role": "assistant", "content": "Hello! I'm an AI assistant created by OpenAI. How can I help you today?"}
]}
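As a rough sketch of how a system prompt ends up in a chat completion request, the snippet below only builds the request body and makes no network call. The helper name build_chat_payload and the "MODEL_NAME" placeholder are assumptions for illustration, not part of any OpenAI library.

```python
import json


def build_chat_payload(system_prompt, user_prompt, model="MODEL_NAME"):
    """Build a chat-completion style request body with a system prompt first."""
    return {
        "model": model,  # placeholder, swap in a real model name
        "messages": [
            # The system message comes first and steers the whole conversation
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": user_prompt},
        ],
    }


payload = build_chat_payload("You are a helpful assistant.", "Hello, who are you?")
print(json.dumps(payload, indent=2))
```

The assistant message is absent here because it is what the model returns, and it gets appended to the list on the next turn.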

Llama (Meta)

Llama models are trained on four roles - system, user, assistant, and ipython (as of 3.1). They use the <|start_header_id|>ROLE<|end_header_id|> special tokens to mark each role in the prompt. The tokenizer takes care of this with the chat template, e.g.

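# Assumes you have access to the gated Llama 3.1 repo on Hugging Face
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-3.1-8B-Instruct")

# apply_chat_template expects the list of messages, not a {"messages": ...} wrapper
chat = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What is the capital of France?"},
]

# add_generation_prompt=True appends the trailing assistant header
tokenizer.apply_chat_template(chat, tokenize=False, add_generation_prompt=True)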

>>>

<|begin_of_text|><|start_header_id|>system<|end_header_id|>

Cutting Knowledge Date: December 2023
Today Date: 23 July 2024

You are a helpful assistant.<|eot_id|><|start_header_id|>user<|end_header_id|>

What is the capital of France?<|eot_id|><|start_header_id|>assistant<|end_header_id|>
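The header layout above can be approximated by hand, which helps when debugging fine-tuning data. This is a simplified sketch: render_llama3_style is a hypothetical name, and it deliberately leaves out the "Cutting Knowledge Date" and "Today Date" lines that the real template adds to the system message.

```python
def render_llama3_style(messages, add_generation_prompt=True):
    """Flatten chat messages into a simplified Llama 3 style prompt string."""
    parts = ["<|begin_of_text|>"]
    for message in messages:
        # Each turn starts with a role header followed by a blank line
        parts.append(f"<|start_header_id|>{message['role']}<|end_header_id|>\n\n")
        # The content is closed with the end-of-turn token
        parts.append(f"{message['content'].strip()}<|eot_id|>")
    if add_generation_prompt:
        # An open assistant header tells the model it is its turn to write
        parts.append("<|start_header_id|>assistant<|end_header_id|>\n\n")
    return "".join(parts)


llama_chat = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What is the capital of France?"},
]

llama_prompt = render_llama3_style(llama_chat)
print(llama_prompt)
```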

You can reveal the special tokens for any open source model using this snippet with the transformers library, or by looking at the special_tokens_map.json in the files for each model on Hugging Face.

tokenizer = AutoTokenizer.from_pretrained("mistralai/Mistral-7B-Instruct-v0.1")
special_tokens = tokenizer.all_special_tokens
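If you would rather read special_tokens_map.json directly, a small parser is enough. The sketch below assumes each value in the file is a plain string, an object with a "content" key, or a list of either. The sample JSON is made up for illustration and is not copied from any real model repo, and special_tokens_from_map is a hypothetical helper name.

```python
import json


def special_tokens_from_map(raw_json):
    """Collect the unique token strings from a special_tokens_map.json payload."""
    data = json.loads(raw_json)
    tokens = []
    for value in data.values():
        # Normalize single entries and lists to one code path
        entries = value if isinstance(value, list) else [value]
        for entry in entries:
            # Entries are either plain strings or objects holding the text under "content"
            text = entry["content"] if isinstance(entry, dict) else entry
            if text not in tokens:
                tokens.append(text)
    return tokens


# Illustrative sample only, not taken from a real model
sample_map = """{
  "bos_token": "<s>",
  "eos_token": {"content": "</s>", "lstrip": false},
  "unk_token": "<unk>",
  "additional_special_tokens": ["<|example|>"]
}"""

tokens = special_tokens_from_map(sample_map)
print(tokens)
```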

Claude (Anthropic)

Anthropic uses a chat interface similar to OpenAI’s, but the system prompt is passed as a top-level system parameter rather than as a message with a system role. While Anthropic models are also proprietary, Anthropic is more open about special tokens and prompts. Claude responds well to XML tags like <function></function>, etc. It may use <claude_info> as a system prompt, but I haven’t been able to find anything definitive here. Anthropic also publishes a nice prompt library.
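Here is a hedged sketch of how XML tags and a system prompt might be combined into a Claude request body. As far as I know, Anthropic's Messages API takes the system prompt as a top-level system field, so that is what this assumes. The helper names, the tag names (document, question), and the "MODEL_NAME" placeholder are all hypothetical, and no request is actually sent.

```python
def wrap_in_tag(tag, text):
    """Wrap text in a matching pair of XML-style tags."""
    return f"<{tag}>\n{text}\n</{tag}>"


def build_claude_payload(system_prompt, document, question,
                         model="MODEL_NAME", max_tokens=512):
    """Build a request body with a top-level system prompt and a tagged user turn."""
    # Separate the reference material from the actual question with tags
    user_content = (
        wrap_in_tag("document", document) + "\n\n" + wrap_in_tag("question", question)
    )
    return {
        "model": model,  # placeholder, swap in a real model name
        "max_tokens": max_tokens,
        "system": system_prompt,  # top-level field, not a message role
        "messages": [{"role": "user", "content": user_content}],
    }


claude_payload = build_claude_payload(
    "You are a helpful assistant.",
    "Paris is the capital of France.",
    "What is the capital of France?",
)
print(claude_payload["messages"][0]["content"])
```

The tag names themselves are arbitrary, and what matters is that they are used consistently and closed properly.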
