Using LLMs to translate optimisation problems between natural and problem-specific languages
Thomas Moran and N. Petrov-Vodkin (via Wikimedia)

Using LLMs to translate optimisation problems between natural and problem-specific languages

TL;DR

Recent dialogues with Operations Research (OR) experts revealed the underutilization of this applied mathematics field in practical Operations applications, despite its potential. The author experimented with Large Language Models, such as ChatGPT-4, to generate Mathematical Programming code for a Facility Allocation problem, aiming to bridge the gap between OR theory and practical application for non-experts. Experiment demonstrates that LLMs can translate plain language formulation of optimisation problems into syntactically correct code that simplifies both OR learning and OR application process, though complete accuracy and fine-tuning still require technical knowledge.

***

In the past few weeks, I had the opportunity to engage in several enlightening conversations with the experts in Operations Research (OR) field. This management discipline is remarkably low-key even in the 21st century and is seldom discussed among Operations practitioners. This is partly because the scientific field is relatively young. OR emerged as a branch of applied mathematics in the late 1930s, driven by an increasing demand for better use of limited resources in complex industrial systems.

One of the pioneers of OR was Leonid Kantorovich who was appointed as a professor at Leningrad State University (USSR) in 1934 when he was just 22 years old. Kantorovich made many contributions to the OR field and his work was honoured with a Nobel prize in 1975. One of the applications of Kantorovich's work in OR is especially dear to me. In 1941-1943 during the Siege of Leningrad a team of scientists including Leonid Kantorovich developed an optimization model to estimate the load-bearing capacity of the ice on the Road of Life across Ladoga Lake. This model allowed determining the optimal distance between vehicles crossing the lake to minimize truck losses.

I learned OR by myself 20 years ago in a traditional way, mostly using manuals like Hamdy Taha’s excellent Operations Research introduction book, and I am still very interested in how people learn and apply OR in practical fields such as Supply Chain and Manufacturing Operations. One of the major shifts in OR field since the “old days” was the emergence of open-source tools including MiniZinc (https://www.minizinc.org/), which greatly improved the Constraint Programming and Linear Programming learning experience. Specifically, helping students to overcome initial frustration related to debugging of the “infeasible” models. Online learning platforms such as Coursera also offer many options to get familiar with Operations Research. Yet, it is difficult to find non-mathematical users of OR methods who are confident to use Optimisation techniques to solve real-life problems.

A well-recognised obstacle for using Mathematical Optimisation methods in Industry is a lack of Lingua Franca, a common language between Operations experts who have little exposure to Mathematical Programming and applied mathematicians who are unaware of the issues faced by Operations professionals. Overcoming this barrier can offer huge benefits for the industry. At least on paper…

I had some spare hobby time over the weekend, so I decided to try if an Operations person who doesn't know much about the OR can create a useful Mathematical Program using Large Language Model as a translator between the language of business and maths. I used Facility Allocation problem as a good example of a task that is both practical and relatively complex to play with. Facility Allocation problem is aimed to address a question of finding an optimal supply chain design where decison-maker can open warehouses and can allocate customers to the warehouses to satisfy the demand. Customers have a demand to be satisfied from a single source, while warehouses have limited capacity, have one-off opening costs and there are transportation costs associated with goods movements between warehouses and customers.

Facility Allocation problem

I compared 3 different LLM engines for the test: ChatGPT-4, Gemeni and Perplexity. MiniZinc was chosen as the target Mathematical Programming language since it is easy to learn for students and free to use.

Key observations:

  • All 3 LLMs were able to produce MiniZinc code of different quality requiring certain number of corrections such as:

  1. Adjusting data types
  2. Using proper comments syntax
  3. Assigning values to multi-dimensional arrays
  4. Formulating constraints and objective functions
  5. Producing outputs

  • Interestingly, ChatGPT-4 on several occasions produced syntactically correct programs with test datasets that ran on MiniZinc IDE without any editing. I chose this LLM for further experiments. However, ChatGPT-4 was the only LLM that was "refusing" to perform some calculations because they worsened the MiniZinc model performance.
  • I experimented with framing the problem with a series of partial prompts and one big prompt, and then giving corrective prompts when necessary. It appears that the outcomes are more reliable with a "big prompt first" method.
  • The best structure of the prompt I was able to come up with was as follows:

  1. Expectations about the code and coding style covering comments, use of arrays - not a problem specific part and can be reused among models
  2. Clear and concise problem statement
  3. Description of the parameters
  4. Decision variables
  5. Constraints
  6. Objective function
  7. Expected outputs

Write a fully compatible and syntactically correct MiniZinc 2.8.x program. Code should be self-documented and human-readable. Separate each block of the code with a comment line filled with 78 dash symbols. Use % as a comment symbol. Use simple data types. The syntax of the 2D arrays should be as follows: 

var_arr = 
[|a11, a12
|a21, a22 
|a31, a32|]
		   
Provide result with test data ready to be copy-pasted into MiniZinc.

Problem statement:
As a logistics manager, your task is to decide on the opening of warehouses and allocating customers to those newly opened warehouses. Your objective is to minimize sum of opening costs and transportation costs. You should respect maximum capacity of the warehouses to serve customer demand. Each customer should be linked to a single opened warehouse. All customer's demand should be satisfied.

Problem parameters:
1. Set of warehouses
2. Set of customers
3. Warehouse opening costs
4. Warehouse maximum capacity
5. Customer demands
6. Cost of transporting a unit of demand from each warehouse to each customer

Decision variables:
1. Warehouses to be opened
2. Feeder links between customers and warehouses to be established after opening warehouses

Constraints:
1. Each customer must be allocated to exactly one open warehouse
2. Customers can only be allocated to open warehouses
3. Warehouse capacity must not be exceeded
4. All the customer demand must be satisfied

Objective function:
Minimize sum of warehouse opening costs and transportation costs (multiplication of unit cost by transported demand)

Outputs:
1. Value of the objective function
2. List of opened warehouses
3. List of customers allocated to each of the warehouses
4. Warehouse capacity utilisation        

  • The language to define a problem is very important. Simplicity is key. The hardest part was to state the constraints clearly. Everyday speech has many hidden assumptions that LLMs don't understand. For example, saying “allocate customers to warehouses” means that warehouses have to be opened to serve customers. LLM doesn't know this. But when you give LLM detailed and somewhat redundant constraints, LLM can combine them into one correct math expression (not every time, though).
  • The way the outputs section description is written matters a lot because all LLMs seem to struggle with output code driven possibly by limited training data for LLMs.

Conclusions:

  • LLMs have reached a level of maturity where they became a valuable tool for industry to support efficiency improvement initiatives.
  • Getting 100% syntactically correct code is still a challenge, so some knowledge of the mathematical programming is required.
  • Training Operations professional in using LLM models for Mathematical Optimization and supporting them with part-time OR experts may be a way to address limited OR expertise available.

Example of the syntactically correct, yet logically flawed, ChatGPT-4 MiniZinc output for Facility Allocation problem:

%-------------------------------------------------------------------------
% Problem Parameters
%-------------------------------------------------------------------------

% Set of warehouses
set of int: WAREHOUSES = 1..3;

% Set of customers
set of int: CUSTOMERS = 1..4;

% Warehouse opening costs
array[WAREHOUSES] of int: openingCosts = [100, 150, 200];

% Warehouse maximum capacity
array[WAREHOUSES] of int: maxCapacity = [15, 20, 10];

% Customer demands
array[CUSTOMERS] of int: demand = [5, 10, 7, 8];

% Cost of transporting a unit of demand from each warehouse to each customer
array[WAREHOUSES, CUSTOMERS] of int: transportCosts = [|
  2, 3, 1, 4 |
  3, 1, 2, 3 |
  4, 2, 3, 2 |];

%-------------------------------------------------------------------------
% Decision Variables
%-------------------------------------------------------------------------

% Indicator variable for warehouses to be opened (1 if open, 0 if closed)
array[WAREHOUSES] of var 0..1: open;

% Feeder links between customers and warehouses (warehouse index for each customer)
array[CUSTOMERS] of var WAREHOUSES: allocatedWarehouse;

%-------------------------------------------------------------------------
% Constraints
%-------------------------------------------------------------------------

% Each customer must be allocated to exactly one open warehouse
constraint forall(c in CUSTOMERS)(
  open[allocatedWarehouse[c]] = 1
);

% Customers can only be allocated to open warehouses
constraint forall(c in CUSTOMERS)(
  open[allocatedWarehouse[c]] = 1
);

%% [Ilya]: ^^^ duplication of the constraints

% Warehouse capacity must not be exceeded
constraint forall(w in WAREHOUSES)(
  sum(c in CUSTOMERS where allocatedWarehouse[c] = w)(demand[c]) <= maxCapacity[w]
);

% All the customer demand must be satisfied
constraint sum(c in CUSTOMERS)(demand[c]) = sum(w in WAREHOUSES)(maxCapacity[w] * open[w]);

%% [Ilya] incorrectly formulated constraint. What should be instead:
%% constraint sum(c in CUSTOMERS)(demand[c]) <= sum(w in WAREHOUSES)(maxCapacity[w] * open[w]);

%-------------------------------------------------------------------------
% Objective Function
%-------------------------------------------------------------------------

% Minimize sum of warehouse opening costs and transportation costs
var int: totalCost = 
  sum(w in WAREHOUSES)(openingCosts[w] * open[w]) +
  sum(c in CUSTOMERS)(transportCosts[allocatedWarehouse[c], c] * demand[c]);

% Define the objective
solve minimize totalCost;

%-------------------------------------------------------------------------
% Output
%-------------------------------------------------------------------------

% Value of the objective function
output ["Total Cost: ", show(totalCost), "\n"];

% List of opened warehouses
output ["Opened Warehouses: ", show([w | w in WAREHOUSES where open[w] = 1]), "\n"];

% List of customers allocated to each of the warehouses
output ["Customer Allocation: "] ++
  [show(c) ++ "->" ++ show(allocatedWarehouse[c]) ++ " " | c in CUSTOMERS] ++
  ["\n"];

% Warehouse capacity utilisation
output ["Warehouse Capacity Utilisation: "] ++
  [show(w) ++ ": " ++ show(sum(c in CUSTOMERS where allocatedWarehouse[c] = w)(demand[c])) ++ "/" ++ show(maxCapacity[w]) ++ " " | w in WAREHOUSES] ++
  ["\n"];

%-------------------------------------------------------------------------
% Example of output (with incorrect constraint):
%-------------------------------------------------------------------------
% Total Cost: 423
% Opened Warehouses: [2, 3]
% Customer Allocation: 1->2 2->3 3->2 4->2 
% Warehouse Capacity Utilisation: 1: 0/15 2: 20/20 3: 10/10 
%-------------------------------------------------------------------------
% Example of output (with corrected constraint):
%-------------------------------------------------------------------------
%Total Cost: 301
%Opened Warehouses: [1, 2]
%Customer Allocation: 1->1 2->2 3->1 4->2 
%Warehouse Capacity Utilisation: 1: 12/15 2: 18/20 3: 0/10 
        

要查看或添加评论,请登录

社区洞察

其他会员也浏览了