Generating the 2024 coalition agreement with AI
? ANP

Generating the 2024 coalition agreement with AI

OpenMaze in the media

Last week, we had the privilege of being brought up in the media by RTL and more (even landing a spot in the national news ??). Where Max van Hattum and Ruben Fricke were seen presenting our newest tool!

This cool piece of tech generates hypothetical coalition agreements based on selected parties and their respective manifestos/programmes.

We made this work by leveraging the power of generative AI (Large Language Models specifically). How exactly? That's what we will be discussing today in a few concrete high-level steps.


1. Identify main topics

Our tool works by using LLMs and the manifestos of parties. But that doesn't tell us much. We are (sadly enough) unable to plainly dump all selected manifestos into a language model and expect it to instantly generate a cohesive and accurate coalition agreement. There are several bottlenecks that make this impossible.

Instead of generating an entire agreement at once, we break it up into steps. This way the model does not have to generate 10 paragraphs about different topics at once. We identified a list of topics that were present in most (if not all) party manifestos. We then used these topics as a baseline to generate a coalition agreement.

A list of all topics that we identified in most party manifestos

2. Process party manifestos

Once we have a list of topics that we want to base our coalition agreement on, the next step is to find relevant information from each party manifesto. Based on the selected parties, we will look through their manifesto PDF, and find the most relevant pieces of text.

In order to properly search through a large body of text, we use a technique called similarity search (also known as vector search / cosine similarity). In order to use this technique, we first have to process the manifesto PDF. The processing is as follows

  1. Slice the PDF into separate paragraphs.
  2. Use an embedding model to turn each paragraph into a vector
  3. Store the vector in a database

Thanks to these 3 steps, we now have our entire manifesto stored in a database. Additionally, stored in such a way that a computer can understand what each paragraph is roughly about.

Embeddings, stored in

Confused?

The end result is a database filled with vector embeddings. We turn the text into vectors, as computers are far more efficient at working with vectors than text. These vectors are effectively points in space, and points that are close to each other are more likely related. You can play around with this concept on this website. Just click on a point and see all the related points on the right-hand side

3. Phew, we can search at last

After properly processing the party manifestos we can use similarity search per topic that we identified at step 1. We use this to find the most relevant paragraphs in each party program, let's say we retrieve the top 3 based on relevancy.

Per party we can now retrieve the top 3 relevant paragraphs about a given topic. A language model is then used to answer the question: "What does [party x] think about [topic], based on [retrieved paragraphs]?". Which the language model neatly does.

Instead of using the entire manifesto as input to answer this question, we can use the top 3 relevant paragraphs from the manifesto instead. Neat!

4. Bring it all together

Thanks to step 3, we can now get a cohesive paragraph for each topic, for each party manifesto. Now it's just a matter of combining them with other parties, in order to generate new coalition agreements. Luckily for us, language models excel at this too!

Time to combine!

For every combination, we can ask a language model to combine these cohesive paragraphs of the selected parties into one. We combine these paragraphs per topic, to make sure that the model stays on topic. Sadly enough just asking a language model to combine these paragraphs does not always suffice, which is where prompt engineering joins the fun.

AI models have a habit of being somewhat unpredictable now and then, and language models are no exception. In order to ensure that our model stays on path we give it some carefully crafted instructions along the lines of:

You are a political expert and have been given a list of the opinions of political parties which they seek to combine. Use your political knowledge to ensure that every party is represented in your result.

In the end, we tie it all together in a nice document, which you, the reader, can then see at the bottom of the page.

More data?

This setup has the potential to include way more data than just the manifestos. One of the ideas that came to mind is to weigh the summary relative to the party size. Another was to use the recent news as an added factor of leverage for specific parties / topics.

Have any other suggestions? Let us know in the comments!


Our vision on AI

We believe that one of the amazing use cases of language models is to enhance the amount of (textual) information that we can process at once. Seemingly large and important documents are all around us, yet almost never do we actually take (or have) the time to read and analyze them in depth if it's not your job to do so (party programmes, terms and conditions??, policies, laws etc.). AI can make these types of information more accessible to the general public, removing the need for specialized knowledge.

We do not believe that AI (with its current capabilities) should be used to generate entire coalition agreements, or replace entire functions. Instead using it for a framework/baseline/brainstorm currently look like the most beneficial ways to apply techniques like these.

Special thanks

We had a blast during this project, which provided an exceptional result (or so we think). For that, we want to thank Frans Mouws in particular for connecting us to the right people, and giving us the freedom and trust to work on this how we thought was best.

Frans Mouws

Kwartiermaker Fontys ICT Start up incubator programma

11 个月

well done, looking forward to our next projects!

要查看或添加评论,请登录

OpenMaze的更多文章

社区洞察