Let's talk about "prompt engineering"
Prompt Engineering is all the rage these days. Hustling job seekers are scrambling to add it to their resumes, while companies look anxiously over their shoulders lest their competitors leapfrog them. This article explains in simple words what prompt engineering is about, and offers some suggestions on how to learn it well or hire for it.
Last time, we demystified the usage of Large Language Models (LLMs), which power the new crop of generative AI services. In a nutshell, an interaction starts with a user providing some text (a prompt), and a magic box then generating some more text (a completion) that looks likely to follow the prompt, according to some statistical parameters.
The secret sauce is those statistical parameters - we call them a model, and they are extracted from millions of pages that have been fed to a special training algorithm (GPT-4's training corpus weighs in at 45 GB, which amounts to 30 million pages and change).
As you may have guessed, Prompt Engineering is the discipline studying how to create good prompts for getting useful completions from a model. It also extends to understanding the portfolio of services available to you from one or more vendors, as well as selecting the best backing model implementation and parameters to optimize for cost, time and accuracy.
The prerequisite skills you need are good old experimental design, knowing how to evaluate model performance using ROC curves, cross-validation, and other basic skills from The Science Formerly Known As "Statistics". If these sound foreign to you, do yourself a favor - put off prompt engineering for a week, and work on the foundations first.
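Those foundations are cheap to practice. As a minimal sketch (using scikit-learn on synthetic data; the dataset and model here are placeholders, not anything from a real prompt-evaluation pipeline), cross-validated ROC AUC looks like this:

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# A synthetic binary classification task standing in for a real evaluation set.
X, y = make_classification(n_samples=500, n_features=10, random_state=0)

# Five-fold cross-validated area under the ROC curve.
scores = cross_val_score(LogisticRegression(max_iter=1000), X, y,
                         cv=5, scoring="roc_auc")
print(f"ROC AUC: {scores.mean():.2f} +/- {scores.std():.2f}")
```

If you can read and critique numbers like these, you are ready to evaluate prompting strategies systematically.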
Then you need to learn a bag of tricks - all of them amount to arranging a proper context that will make it more likely that the model spits out a useful completion.
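Most of those tricks boil down to prompt construction. As one illustration, here is the classic few-shot arrangement (the helper name `few_shot_prompt` is mine, not a library function):

```python
def few_shot_prompt(examples, query):
    """Arrange worked examples as context, so that another worked answer
    becomes the statistically likely continuation - the essence of
    few-shot prompting."""
    lines = [f"Q: {q}\nA: {a}" for q, a in examples]
    lines.append(f"Q: {query}\nA:")
    return "\n\n".join(lines)

prompt = few_shot_prompt(
    [("Solve y'=2x", "y = x^2 + C"), ("Solve y'=3x^2", "y = x^3 + C")],
    "Solve y'=cos(x)",
)
print(prompt)
```

The worked examples make the desired format - not just the desired answer - the likely continuation.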
After that, there is practice, practice, practice. Try to fail cheaply, because you will, preferably at your employer's expense. Doing it as part of a job has the advantage that you get to apply your knowledge early and get some real feedback on how well you are doing. Joining a community like Kaggle, is a great way to grow, but it also requires significant investment, and I find many people in the industry don't have the time for that.
Looking up some math
Let's try to get the OpenAI completion service to find us a solution to a basic differential equation.
Example: when I prompt with "Solve a'=2b", the GPT-3 model returns "a=2b", which may be the most plausible completion, but it is not what I asked for - that "prime" was there for a reason.
Knowing that most differential equation textbooks would use x, y and z as variables, I changed my prompt to "Solve y'=2x" which yielded the correct answer "y = x^2 + C, where C is an arbitrary constant." Bingo! I am a prompt engineer!
Let's try reversing the variables with "Solve x'=2y" - the answer is weird:
Completion: "Let y=c.
Then, x' = 2c
Integrating both sides with respect to x, x = c x + k, where k is an arbitrary constant."
I am not a mathematician, so it is possible that this has profound meaning, but it sounds like math-babble to me, and it is definitely not what I am looking for.
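For reference, the experiments above amount to a few lines of code. This is a sketch assuming the legacy `openai` Python client and the `text-davinci-003` completion model (the client API has changed since; the network call is guarded so it only runs when a key and the package are available):

```python
import importlib.util
import os

def solve_prompt(equation: str) -> str:
    # The entirety of the "engineering" so far: phrasing the ask as a prompt.
    return f"Solve {equation}"

prompt = solve_prompt("x'=2y")
print(prompt)

# Guarded call: requires the `openai` package and an OPENAI_API_KEY.
if os.environ.get("OPENAI_API_KEY") and importlib.util.find_spec("openai"):
    import openai
    response = openai.Completion.create(
        model="text-davinci-003",
        prompt=prompt,
        max_tokens=100,
        temperature=0,  # keep completions as deterministic as possible
    )
    print(response.choices[0].text.strip())
```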
"Explain why"
I tried to provide more human instructions, such as "Solve the differential equation x'=2y for x" and variations, but it did not have much of an effect. We are not talking to someone who can think and solve equations, and the problem is not that it misunderstands the question - we are dealing with a sophisticated guessing machine, and the problem is that, according to its parameters, this prompt is most likely completed by text that happens to be false.
One trick to improve the guessing effectiveness is to nudge the model into adding extra context for itself by appending "explain why" to the prompt. This makes our prompt look more like reference material, and in many cases it improves the quality of the result - here we go:
Prompt: "Solve the differential equation x'=2y for x.
Explain why."
Completion: "The solution to this differential equation is x = 2y^2 + c, where c is an arbitrary constant of integration.
This solution is obtained by finding the antiderivative of both sides of the equation. Since we are given the equation x' = 2y, the antiderivative of the right side is 2y^2. Therefore, the antiderivative of the left side (x') is x, which results in x = 2y^2 + c."
This is what I refer to as a "trick" - it is useful, it works, and it is clearly a compensation for the lack of sophistication of the model, which is itself a moving target.
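As a sanity check (my addition, using sympy; the article's point stands either way), a symbolic solver disagrees with the completion above: reading x' as dx/dy, as the model's own explanation does, the antiderivative of 2y is y^2, not 2y^2. "Explain why" improves plausibility, not correctness, so keep verifying:

```python
import sympy as sp

y = sp.symbols("y")
x = sp.Function("x")

# Interpret x' as dx/dy, following the model's own explanation.
sol = sp.dsolve(sp.Eq(x(y).diff(y), 2 * y), x(y))
print(sol)  # x(y) = C1 + y**2, not 2y**2 + c as the model claimed
```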
Is it worth it?
Today's prompt engineering skills are bound to become obsolete tomorrow, as models become more ergonomic and the value-add of knowing how to turn the ask into a prompt diminishes.
Long ago, in a galaxy not so far away, assembly developers who would occasionally drop to writing machine code were commonly sought and worth their weight in gold. Today, they probably still are - if they can find a job at all...
Even if next-generation models still benefit from prompts constructed in a specific way, it is unlikely that the practices of today will be universally applicable, as they are likely overfitted to the current crop of models. A practitioner is bound to play catch-up, and - dare I say - that is a young person's game.
That said, I urge everybody to get their feet wet - promptingguide.ai is an excellent resource for familiarizing yourself with the concepts and considerations. Microsoft's Guidance DSL is also a very approachable way to experiment with all the free variables and construct sophisticated pipelines, mixing AI-assisted queries, Python data wrangling, machine learning, and free text.
How to hire good prompters?
Designing a good prompting strategy may be a young discipline, but evaluating whether the result of a prompt is good, and systematically improving it is a very established science. A good prompt engineer needs to be a good data scientist/statistician.
I would be wary of people who try to dazzle me in an interview with their knowledge of shibboleths such as CoT or Zero/Single/Multi-Shot prompting, but struggle to explain what "lift" is, or how different cross-validation strategies work and why.
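If "lift" is unfamiliar: it is the concentration of positives among your top-scored cases relative to the base rate. A toy computation (the helper name `lift_at_k` is mine):

```python
import numpy as np

def lift_at_k(y_true, scores, k=0.1):
    """Lift at the top-k fraction: positive rate among the k highest-scored
    cases, divided by the overall positive rate."""
    y_true = np.asarray(y_true)
    scores = np.asarray(scores)
    n_top = max(1, int(round(k * len(y_true))))
    top = np.argsort(scores)[::-1][:n_top]
    return y_true[top].mean() / y_true.mean()

# Toy data: a scorer that ranks both positives first achieves maximal lift.
y = np.array([1, 1, 0, 0, 0, 0, 0, 0, 0, 0])
s = np.array([0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05])
print(lift_at_k(y, s, k=0.2))  # top 20% are all positive vs a 20% base rate
```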
Furthermore, a good candidate should show pragmatism, balancing precision, robustness, and time to develop. Awareness of the business and of the total cost of ownership, normalized by the scale of usage, is very important. Security consciousness is also a big plus - you don't want your customer support chat-bot wasting precious CPU minutes because it was tricked into computing blockchain hashes.
When you hire a candidate with these qualities, even after the fad of slapping generative AI on everything blows over, you will still have a perfectly good data analyst with solid foundations and broad interests, who can contribute value to your company for years to come.