Synthetic data generation for email summarization tool testing

I am currently working with some students on a project (let's call it tool-1). It is being designed to have the following features:

  1. Pick a set of emails based on sender, timestamp, etc.
  2. Let the user select important themes/projects for that week
  3. Remove toxic content from emails
  4. Summarize the emails based on themes/projects
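
Steps 1 and 2 of the list above can be sketched in a few lines of Python. This is only a minimal illustration, not the actual tool-1 code: the `Email` class, the function names, and the keyword-based theme matching are all assumptions made for the example. Steps 3 and 4 (toxicity removal and summarization) would need a model and are left out.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Email:
    # Hypothetical minimal email record; real emails carry many more fields.
    sender: str
    sent_at: datetime
    body: str


def select_emails(emails, senders, start, end):
    """Step 1: keep emails from the chosen senders sent within [start, end)."""
    return [e for e in emails if e.sender in senders and start <= e.sent_at < end]


def group_by_theme(emails, themes):
    """Step 2: bucket emails under each user-selected theme.

    Assumption: a theme matches when its name appears in the body
    (case-insensitive). A real tool would likely need something smarter.
    """
    grouped = {theme: [] for theme in themes}
    for email in emails:
        for theme in themes:
            if theme.lower() in email.body.lower():
                grouped[theme].append(email)
    return grouped
```

The grouped emails would then be the input to the toxicity filter and the per-theme summarizer.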

If done right, it would be a very attractive tool for senior executives.

The problem we faced was how to test the solution. A friend then suggested an idea: develop a second tool (let's call it tool-2) that takes summarized emails with the above qualities (especially the themes/projects part) and splits them into multiple emails.

We will then use tool-2 to generate lots of emails and test tool-1 on them. If tool-1 arrives at the same or a similar summary, we have a reasonable way to check its quality.
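
The round-trip check described above could look roughly like the sketch below. Everything here is an assumption for illustration: `split_into_emails` stands in for tool-2, `summarize` stands in for tool-1, and the unigram-overlap F1 score is just one simple way to measure whether two summaries are "similar" (an embedding-based or human review could be used instead).

```python
import re
from collections import Counter


def tokens(text):
    """Lowercase word tokens; punctuation is ignored."""
    return re.findall(r"[a-z0-9]+", text.lower())


def unigram_f1(reference, candidate):
    """Word-overlap F1 between two texts (0.0 = no overlap, 1.0 = same words)."""
    ref, cand = Counter(tokens(reference)), Counter(tokens(candidate))
    overlap = sum((ref & cand).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)


def round_trip_score(seed_summary, split_into_emails, summarize):
    """Seed summary -> synthetic emails (tool-2) -> summary (tool-1) -> score."""
    emails = split_into_emails(seed_summary)   # hypothetical tool-2
    recovered = summarize(emails)              # hypothetical tool-1
    return unigram_f1(seed_summary, recovered)
```

Running `round_trip_score` over many seed summaries and looking at the distribution of scores gives a rough quality signal for tool-1; low-scoring cases are the ones worth reading by hand.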

This shows the power of synthetic data. In the same way, the AlphaGo team had their machine learning algorithm play against a different version of the same algorithm. They were able to play these versions against each other millions of times, coming up with moves that even grandmasters did not know about.

Do reach out if you want to discuss more ideas around synthetic data generation.




Mehrab Momin

Your Startup's Fractional AI CTO | Generative AI & LLM | Computer Vision | Machine Learning

4 months ago

Are you going to use two different LLMs for each tool to avoid bias?
