A test harness for #genAI generated code: Using the automated output refinement pattern to enhance and evaluate models

Ajit Jaokar

Published: April 8, 2024

I am using this approach both in my teaching and at Erdos Research Labs to help low-code developers evaluate models.

Background

To test AI-generated code, I am creating a test harness using the following strategy, which is an adaptation of the automated output refinement pattern from Yi Zhou's comprehensive book on prompt design patterns.

The automated output refinement pattern, as discussed in the book, is itself a variant of Self-Refine: Improving Model Quality Under Distribution Shift via Self-Refinement.

Notes:

1) While the approach is prompt-based, it needs an understanding of the machine learning process and workflow.

2) I know this is not exactly a test harness, but you could adapt it into one by introducing a more structured approach and testing code in stubs. Besides, I liked the pic of huskies running in the harness :)
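
As a rough illustration of the "more structured approach" in note 2, here is a minimal sketch that assumes the generated code defines a plain Python function. The names `run_cases`, `GENERATED`, `accuracy`, `predict`, `STUBS` and `CASES` are hypothetical and do not come from the original conversation. `predict` is stubbed so the generated function can be exercised without a trained model.

```python
from typing import Any, Dict, List, Optional, Tuple


def run_cases(source: str, func_name: str,
              cases: List[Tuple[tuple, Any]],
              stubs: Optional[Dict[str, Any]] = None) -> List[bool]:
    """Exec generated source with stubbed dependencies and check each case."""
    # Caution: exec runs arbitrary code; use a sandbox for untrusted output.
    namespace: Dict[str, Any] = dict(stubs or {})
    exec(source, namespace)
    func = namespace[func_name]
    results = []
    for args, expected in cases:
        try:
            results.append(func(*args) == expected)
        except Exception:
            # A crash counts as a failed case rather than stopping the run.
            results.append(False)
    return results


# Hypothetical AI-generated snippet that depends on an external `predict`.
GENERATED = '''
def accuracy(y_true, scores):
    preds = predict(scores)
    hits = sum(1 for t, p in zip(y_true, preds) if t == p)
    return hits / len(y_true)
'''

# Stub: threshold the scores instead of calling a real model.
STUBS = {"predict": lambda scores: [int(s >= 0.5) for s in scores]}

CASES = [
    (([1, 0], [0.9, 0.2]), 1.0),
    (([1, 1], [0.9, 0.2]), 0.5),
    (([], []), 0.0),  # empty input: the snippet divides by zero
]

results = run_cases(GENERATED, "accuracy", CASES, STUBS)
print(results)
```

Here the first two cases pass and the empty-input case fails. That is the kind of gap a critique round could then be asked to fix.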

Objective

There are many approaches to validating AI-generated text, but not many for AI-generated code.

My objective is to create a general test harness for AI-generated code.

I am using this for ML code, but I think it should work for anything.

Process

It's best to explain it in code.

The basic prompt (after running MNIST) is:

Now, I want you to act as an expert AI developer/engineer to iteratively refine the code. At each stage provide a focused, constructive critique, explain the critique, implement it in the code and then provide the output. Run this refinement iteration process three times.
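
The prompt above asks the model to run all three rounds in a single reply. The same idea can also be driven from a script, one round per call, which makes each critique and revision easy to inspect. The sketch below is an adaptation, not the method used in the conversation: `ROUND_PROMPT`, `refine`, `Round` and `fake_llm` are hypothetical names, and `llm` stands for any function that takes a prompt string and returns the model's reply.

```python
from dataclasses import dataclass
from typing import Callable, List

# Hypothetical per-round prompt, adapted from the single prompt quoted above.
ROUND_PROMPT = (
    "Act as an expert AI developer/engineer. Give one focused, constructive "
    "critique of the code below, explain it, then return the full revised code.\n"
    "Separate the critique from the code with a line containing only '---'.\n\n"
    "{code}"
)


@dataclass
class Round:
    critique: str
    code: str


def refine(code: str, llm: Callable[[str], str], rounds: int = 3) -> List[Round]:
    """Run the critique-and-revise step `rounds` times, feeding each revision back in."""
    history: List[Round] = []
    for _ in range(rounds):
        reply = llm(ROUND_PROMPT.format(code=code))
        critique, sep, revised = reply.partition("\n---\n")
        if not sep:
            # Reply ignored the format: keep the old code, record the raw reply.
            critique, revised = reply, code
        code = revised.strip()
        history.append(Round(critique.strip(), code))
    return history


def fake_llm(prompt: str) -> str:
    """Stand-in model for a dry run: appends a comment instead of really refining."""
    code = prompt.split("\n\n", 1)[1]
    return "Add a comment.\n---\n" + code + "\n# reviewed"


history = refine("x = 1", fake_llm)
for i, step in enumerate(history, start=1):
    print(f"Round {i}: {step.critique}")
```

Swapping `fake_llm` for a real model call is left open on purpose, since the article does not tie the pattern to a particular API.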

The only problem with running the prompt this way is that the model's environment does not have Keras, so it cannot run the code. It still provides indicative output.
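
When the refined code is run locally instead, a harness can make that limitation explicit: check the syntax everywhere, and only execute when the required libraries are installed. This is a minimal sketch under that assumption. `check_generated_code` is a hypothetical helper, and the default `required=("keras",)` simply mirrors the missing dependency mentioned above.

```python
import importlib.util
from typing import Sequence


def check_generated_code(source: str, required: Sequence[str] = ("keras",)) -> str:
    """Report whether generated code parses, and run it only if its deps exist."""
    # The syntax check works even when the libraries are not installed.
    try:
        compiled = compile(source, "<generated>", "exec")
    except SyntaxError as exc:
        return f"syntax error: {exc.msg} (line {exc.lineno})"
    # find_spec returns None for a top-level module that cannot be found.
    missing = [name for name in required if importlib.util.find_spec(name) is None]
    if missing:
        return "syntax ok, not run (missing: " + ", ".join(missing) + ")"
    # Caution: exec runs arbitrary code; use a sandbox for untrusted output.
    exec(compiled, {"__name__": "__generated__"})
    return "ran ok"


print(check_generated_code("import keras\nprint(keras.__version__)"))
```

The returned string can be logged next to each refinement round, so it is clear which outputs were actually executed and which were only indicative.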

See the full conversation and output here.

If you found this useful, you can sign up for my book

If you are a non-developer and want to learn AI with me, please see Erdos Research Labs

You can meet me and our team at our Oxford AI summit

If you would like to study with me, see our courses

Low code AI course at the University of Oxford for non-developers

AI and digital twins

Thanks to Anjali Jain and Aishwarya Naresh Reganti for their feedback

Image source: huskies

Yi Zhou

7 months ago

Ajit Jaokar, I'm so glad that the prompt design patterns in my book are being used in your work.

Mani Sarkar

4X Kaggle Expert, Senior Engineer helping startups with their Data, Data Science, Machine Learning, & Software endeavours

7 months ago

Thanks Ajit https://chat.openai.com/share/c6ec6567-db85-48ca-9f3d-a449385c6378 this is still pretty good, and there's plenty of opportunity to add more rigour to get more out of the existing code/model. It would be great if it could show and compare the results, track them using W&B, and show them for each iteration.
