My two cents on LLMs and Optimization

I’d like to weigh in briefly on a Harvard Business Review (HBR) article by my former colleague, Prof. David Simchi-Levi, and on some criticism it received on social media from Joannes Vermorel (and from others who piled on in the comments section).

Before I get into specifics, I’d like to state that the insulting tone of Joannes's piece is uncalled for. There is no need for phrases like “drivel,” “lazy,” and “bullsh*t.” They don’t inform anyone and only lower the quality of our discussion.

I’m going to focus this article specifically on the use of large language models (LLMs) for optimization modeling. This is an area that David and I collaborated on for many years, and thus I can guarantee Joannes that we do indeed have experience.

I agree that the HBR article is optimistic. To the extent that it implies non-technical people can use LLMs to modify Mixed Integer Programs, it’s just wrong. That said, I’m not entirely sure that’s a fair implication. The HBR article acknowledges that LLMs make mistakes. Specifically regarding math modeling, the article states that automated verification of correctness is an open question. I took this to mean the proposed use of LLMs, at least in this area, would be under the supervision of a technical expert (i.e., a math programmer) who would bear ultimate responsibility for correctness.

It certainly seems reasonable to me that an LLM would be able to study a codebase dedicated to mathematical optimization and create the proposed changes for adding a new constraint or objective function component. For example, a few years ago I created a short Python course for building strategic network design models. Recently, I’ve fed the exercises from the course into ChatGPT (both o1 Pro and 4o) and the results were impressive. 4o created a few bugs and o1 Pro was nearly perfect. So the idea that an optimization engineer would use an LLM to more quickly address such a task strikes me as realistic, not science fiction.
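To make the task concrete, here is a minimal sketch of the sort of exercise I have in mind. The data is invented, and SciPy’s `linprog` stands in for whatever full modeling library a real codebase would use; the point is that “add a new constraint” often amounts to a small, localized edit of exactly the kind the LLMs handled well.

```python
# A toy network-flow LP: two plants shipping to two markets.
# Variables: x = [x11, x12, x21, x22], where x_ij ships plant i -> market j.
from scipy.optimize import linprog

cost = [4, 6, 5, 3]          # per-unit shipping cost for each lane
# Supply limits (<=): plant 1 ships at most 80, plant 2 at most 70.
A_ub = [[1, 1, 0, 0],
        [0, 0, 1, 1]]
b_ub = [80, 70]
# Demand (==): market 1 needs 60 units, market 2 needs 50.
A_eq = [[1, 0, 1, 0],
        [0, 1, 0, 1]]
b_eq = [60, 50]

res = linprog(cost, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq)
print(round(res.fun))   # minimum total shipping cost: 390

# The flavor of edit an exercise asks for: cap plant 1's total
# shipments at 55. That is one appended row in the constraint matrix.
A_ub.append([1, 1, 0, 0])
b_ub.append(55)
res2 = linprog(cost, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq)
print(round(res2.fun))  # cost rises to 395 under the new cap
```

In a real codebase the edit lands in a model-building function rather than a bare script, but the structure of the change is the same, which is why it is a plausible target for LLM assistance under expert review.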

I’ll add a caveat here: this holds so long as the optimization engineer is using a very popular programming language. An LLM is not going to be equally skilled with every language. The more online examples and discussions exist on the internet, the more training data is available to the LLM. This inevitably results in the LLM’s preference for languages that rank high on the TIOBE index. For my specific exercises, the LLM was programming in Python, the most popular language in the world.

This brings me back to the part of Joannes's article that strikes me as less than forthright. Joannes touts the benefit of his domain-specific language which is allegedly 10x more concise than Python. Considering that Python often has the brevity of pseudo-code, I’m not sure such a claim, even if true, is a net benefit to the maintainability of a codebase. In the words of Einstein, something should be “as simple as possible, but not simpler.” My assessment is that Python itself has about as much brevity as anyone wants in a programming language.

Of course, maintaining fewer lines of code is a net benefit, and Python programmers typically achieve this goal through libraries. The Python open-source ecosystem is famously vast, and for domain-specific tasks like supply chain it can be enhanced by proprietary libraries maintained in private repos. I expect Lokad boasts a rich set of such libraries, or equivalent functionality built into their language proper. So I’ll grant Joannes his claim that for certain tasks he can achieve a 10x reduction in codebase relative to a Python solution that is supplemented only by PyPI libraries. I don’t think it's fair to say such a reduction is guaranteed against any Python solution. The history of Python, after all, is rife with successful reverse engineering of libraries that originated in other languages.
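As a trivial illustration of that library-driven brevity (the shipment data here is invented), even the standard library alone collapses what would be boilerplate in many languages into a few lines. Aggregating shipment quantities per lane, for instance:

```python
# Sum shipped quantity per (origin, destination) lane.
from collections import defaultdict

shipments = [("PlantA", "DC1", 10), ("PlantA", "DC1", 5),
             ("PlantB", "DC2", 7), ("PlantA", "DC2", 3)]

totals = defaultdict(int)
for origin, dest, qty in shipments:
    totals[(origin, dest)] += qty

print(totals[("PlantA", "DC1")])  # 15
```

The brevity comes from the ecosystem, not from terseness in the language's syntax, which is precisely why I doubt a 10x-more-concise syntax is where the maintainability battle is won.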

This is all a bit of a digression from the larger point at hand, which is the role of LLMs. I think it's a fair assumption that David is proposing that LLMs be used to address programming tasks for codebases written in languages very high on the TIOBE index. As I mentioned earlier, I have no expectation that ChatGPT is as fluent in Lokad’s language (called Envision) as it is with Python. The size of the training data here simply cannot compare.

This brings me to my final point. Regardless of whether you are an LLM optimist or a skeptic, I think it’s safe to say that LLMs are bad news for niche proprietary programming languages. As companies move from spreadsheet logic to codebases, tools like ChatGPT will contribute to the same network effect that websites like Stack Overflow did ten years ago. If you are using a popular language, you get better support, which increases usage of that language, and thus improves the support for those who follow.

I should conclude by putting my own cards on the table. I, too, work for a software company in the supply chain space that helps clients move toward programmatic solutions. Our libraries and our apps are nearly all in Python (with some Rust scattered about for performance). Our clients often run those solutions on our web-based platform (Foresta). As needed, these solutions are run in their own playground, without our supervision. Our clients are comforted by the knowledge their business logic isn’t trapped in a programming language owned by anyone, nor is it caged by a platform that guards against portability.

And now, with the rise of LLMs, they are further comforted that their business logic can be explained to them by an affordable automated tool. Five years ago, such an idea was science fiction. Today, it is real.

Chad Smith

Demand Driven Thought Leader

1 month

People quoting themselves to reinforce their argument for or against something? That's what I call LEGIT.

Joannes Vermorel

CEO & Founder at Lokad, Quantitative Supply Chain Software

2 months

Peter Cacioppi This response is missing the point. Quoting my own writing (sic): "To this date, the percentage of companies enjoying a “unified monolithic codebase” to derive their supply chain decisions is virtual nil. While LLMs could, conceptually, edit and improve a messy pile of half-obsolete spreadsheets, until proven otherwise, this is pure speculation." Yes, an LLM can compose an OR script. But the OR script is nowhere close to the bottleneck of a supply chain initiative. The "rest" (non-OR) is arguably between 99% and 99.9% of the work. Show me that an LLM can do this part... until then, I remain skeptical. Furthermore, exactly as I point out multiple times in my critique, LLMs work *if and only if* the human has coding skills, which is exactly *your* case, and the way you are leveraging LLMs in your workflow. The original paper was claiming that LLMs would let people with *zero coding* skills achieve the feat (this part is very clear in the original article). Thus, show me a working large-scale implementation of successful supply chain optimization carried out by a team that has no coding skills... until then, I will remain skeptical.

Arvind Singh

Supply Chain Strategy & Analytics | Product Management | IIM Indore

2 months

Really enjoyed reading this. I also tested Copilot on a CVRP problem using Python and PuLP, just adding other business constraints to my existing CVRP. For the most part it does produce output: not 100% correct, but it gives you a base to modify and test. I also came across a trained model, OptiMUS, which uses an LLM to build MILP problems.

Keivan Tafakkori

Ph.D. Candidate @ UT | Industrial Engineering

2 months

"The more online examples and discussions exist on the internet, the more training data is available to the LLM." Well said!
