Steering Generative AI: From Small Models to Supercharged Prompt Engineering
A brave explorer entering the wild seas of Generative AI

The recent explosion of generative models has unleashed unprecedented interest in the field of AI. Without a doubt, their range of applications is so extensive that practically no area is left uncovered. According to a McKinsey study, most of the value attributed to generative AI is concentrated in customer service, marketing and sales, software engineering, and R&D. We'd be hard-pressed to find any reader who is not working in one of these sectors.

Interestingly, the jobs that typically enjoy the highest salaries and demand a costly education, and that until recently were considered safer from automation, may be the first to undergo profound transformations. Let's take the case of artistic creation: some professionals embrace this new situation, while others not so much.

Be that as it may, and despite the dazzling forecasts, this technology is not without possible roadblocks, from legal and environmental to even existential concerns. It seems clear that we are going to witness a movement towards greater transparency around training data, as already announced in the new EU AI Act, while we strive to make more responsible use of costly computing resources.

One path towards reducing the carbon footprint (while also trimming the bill for cloud services) may be the fine-tuning of medium-sized or small open-source models for specific purposes, a rising trend judging by Hugging Face's Open LLM Leaderboard. That leaderboard, by the way, constantly shows seismic swings as new foundation models improve the state of the art, the latest being Llama 2, which dethroned the recent Falcon.

Even more interesting are the small, custom models made for a very specific purpose. Recent work by researchers affiliated with Microsoft demonstrates that, with a carefully selected training dataset, we can obtain coherent results from very small models, even smaller than previously thought possible. After all, what we thought we knew about the scaling laws of large language models may be in question. Increasingly, the generation of synthetic data could prove key to achieving great results in different areas.

Regardless of model size, large or small, it remains crucial to invoke the model skillfully. We refer to the well-known "prompt engineering", which is the new programming, since traditional programming could be obsolete soon (although maybe not). In this sense, another group of researchers, also affiliated with Microsoft, is pioneering the fine handling of prompts, allowing more efficient steering of the output of a generative language model.
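To make the "prompt engineering as programming" idea concrete, here is a minimal sketch of a reusable, parameterized few-shot prompt template. The template layout, task description, and examples are illustrative assumptions of my own, not the API of any particular prompting library.

```python
# A few-shot prompt assembled programmatically: instruction, worked
# examples, then the query we actually want answered.

def build_prompt(task: str, examples: list[tuple[str, str]], query: str) -> str:
    """Assemble a few-shot prompt from a task description and examples."""
    lines = [f"Task: {task}", ""]
    for inp, out in examples:
        lines.append(f"Input: {inp}")
        lines.append(f"Output: {out}")
        lines.append("")
    # The final "Output:" is left open for the model to complete.
    lines.append(f"Input: {query}")
    lines.append("Output:")
    return "\n".join(lines)

prompt = build_prompt(
    task="Classify the sentiment of the sentence as positive or negative.",
    examples=[
        ("I loved this film.", "positive"),
        ("The service was terrible.", "negative"),
    ],
    query="The hotel exceeded my expectations.",
)
print(prompt)
```

Treating prompts as parameterized functions like this makes them versionable and testable, much like ordinary code.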

With a little effort and planning, a lot can be done with this technique, or similar ones, to decrease the dreaded "hallucinations" (we still don't know which psychedelics current language models prefer; maybe this will end up being a new line of research). As an aside, these techniques work better on open-source models, thanks to direct access to their logits. And if all else fails, we can always intervene within the neural network and turn a "hater" into a "lover" through direct manipulation of the learned weights (we take no responsibility for the side effects of such experiments; fortunately we have not yet granted consciousness and rights to neural networks).

In any case, the possibilities are tremendous: you just have to arm yourself with plenty of prompts, some computing credits, and the daring to venture into the wild seas of generative AI. May the wind be at your back!


-- This article was first published in the July newsletter of Spain AI. The above text has been translated from the Spanish original with Claude 2.0 and slightly revised for style.

-- The hero image has been generated with Kandinsky 2.2 with the prompt "a small ship navigating through wild seas, chaotic sea setting, ocean, The Great Wave off Kanagawa, ukiyo-e, night setting, full moon, purple and blue hues, sci-fi, futuristic, cyberpunk, neon, dramatic lighting"
