Real World Modeling: Joint Estimation & Optimization

As described in the previous post, one of Onera’s core product offerings involves a problem that, at a 10,000-foot level, can be modeled as a knapsack problem. Each item is a SKU that a retailer has the option of selling. Associated with each SKU is the revenue (or margin) the retailer may generate by offering it, along with some possible risk in offering the item for sale. This risk most typically arises from operational problems in selling the item, which can translate into negative customer satisfaction. The problem, of course, is to achieve maximum revenue while ensuring that customer satisfaction stays above a threshold.

In reality, these problems have several flavors, with models associated with them at each level. As you dig deeper into the problem, each layer reveals an interesting modeling challenge and problem-solving approach, much like peeling the layers of an onion. So this post dives a little deeper into the problem formulation above and the models underneath it, which are typical of algorithmic challenges in the real world. Specifically, we will talk about what we think of as Separate Estimation & Optimization (SEO) methods and Joint Estimation & Optimization (JEO) methods.

At a high level, there are at least three sub-problems:

  1. Given a set of items, a function that maps items to expected revenue, and a function that maps items to customer satisfaction, find a subset of items that maximizes overall revenue while ensuring that overall customer satisfaction stays above a defined threshold (one way to write this down is sketched after this list)
  2. Given historical data on items, the revenue generated by those items, and their customer satisfaction scores, learn functions that map items to expected revenue and items to expected customer satisfaction
  3. If there are items that are ‘new’, ‘unlearnable’ or lacking recent data, devise a scheme that ensures these items are still offered by the retailer, so that opportunities are not missed just because of historical behavior
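For concreteness, one way to write down Problem 1 (a sketch under assumptions, not necessarily the exact production formulation: $r_i$ stands for the expected revenue of SKU $i$, $c_i$ for its customer-dissatisfaction risk, and $B$ for the overall risk budget implied by the satisfaction threshold) is the classic 0-1 knapsack integer program:

$$\max_{x} \; \sum_i r_i x_i \quad \text{subject to} \quad \sum_i c_i x_i \le B, \qquad x_i \in \{0,1\},$$

where $x_i = 1$ means the retailer offers SKU $i$. The LP relaxation replaces $x_i \in \{0,1\}$ with $0 \le x_i \le 1$.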

Each problem above can be modeled as a classical problem in discrete algorithms or machine learning.

  • Modeling Problem 1: This can be solved as a classical mathematical programming problem, for example as an integer linear program (ILP). The history of the knapsack problem is as old as NP-completeness itself, and there is a wide array of literature on solving it in theory and in practice. As illustrated by the previous post, there are ways to solve the problem that provide theoretical approximation guarantees and are remarkably good in practice as well. Waving hands a little: although the problem is NP-complete, it is about as easy to solve as problems in the NP-complete class get. (A minimal sketch of the LP relaxation appears after this list.)
  • Modeling Problem 2: This is a standard problem in machine learning. Pick your favorite learning algorithm; in vogue at the moment are neural networks. You pick a bunch of features for each item, such as price, price status, brand, size, weight, and product ontology, and you use historical data to obtain demand (and revenue) predictions as well as customer satisfaction predictions.
  • Modeling Problem 3: This can be modeled as a multi-armed bandit problem, which was quite popular in the days of ads optimization: very little information is available about a new ad to determine its click-through rate (CTR), so you need to give it some air time to figure out crucial metrics. Each choice comes with a known objective-function cost and a variance in the estimate, and you trade off exploiting known good items against exploring unknown ones.
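To make Modeling Problem 1 concrete, here is a minimal sketch (an illustration, not Onera's production solver) of the classic greedy algorithm for the LP relaxation of the knapsack problem. The item values and the `budget` parameter are hypothetical, standing in for predicted revenue, predicted risk, and the budget implied by the satisfaction threshold:

```python
from typing import List

def fractional_knapsack(revenue: List[float], risk: List[float],
                        budget: float) -> List[float]:
    """Greedy solution to the LP relaxation of the knapsack problem.

    Items are taken in decreasing revenue-to-risk order (risk assumed > 0);
    at most one item ends up fractional, matching the structure noted in
    the previous post.
    """
    order = sorted(range(len(revenue)),
                   key=lambda i: revenue[i] / risk[i], reverse=True)
    fractions = [0.0] * len(revenue)
    remaining = budget
    for i in order:
        if remaining <= 0:
            break
        take = min(1.0, remaining / risk[i])
        fractions[i] = take
        remaining -= take * risk[i]
    return fractions

# Hypothetical example: three SKUs with predicted revenue and risk.
print(fractional_knapsack([5.0, 3.0, 8.0], [4.0, 2.0, 6.0], budget=7.0))
```

Rounding the single fractional variable down (or handling it separately) yields the usual approximation for the integral problem.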

The typical approach to solving the overall problem in practice would be as follows. You solve Problem 2 first: again, pick your favorite learning algorithm and historical data, and obtain the functions you believe fit best. Then you use those functions to solve Problem 1 with an Integer Linear Program (ILP) solver. Finally, you layer on a solution to Problem 3 by trading off between the ILP solution and the unoptimized, unknown items in a manner that gives you a good balance between exploitation and exploration (a small sketch of such an exploration layer follows).
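That exploration layer can be as simple as an epsilon-greedy rule that occasionally forces under-observed items into the assortment so that future estimates improve. This is a hedged illustration of that last step, not the actual scheme; `epsilon`, `min_obs`, and the variable names are hypothetical:

```python
import random

def add_exploration(selected: set, observation_counts: dict,
                    epsilon: float = 0.1, min_obs: int = 30) -> set:
    """Epsilon-greedy layer on top of the optimized assortment.

    With probability `epsilon`, an item lacking enough historical
    observations is forced into the assortment so that the next round
    of estimation has data for it.
    """
    explored = set(selected)
    for item, count in observation_counts.items():
        if count < min_obs and random.random() < epsilon:
            explored.add(item)
    return explored

# Hypothetical usage: the ILP picked SKUs 1 and 3; SKU 7 is new.
print(add_exploration({1, 3}, {1: 120, 3: 95, 7: 2}))
```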

We think of these as SEO solutions: Separate Estimation and Optimization. In these solutions, the parameter estimation is agnostic of the optimization. We can also think of solving these problems with what we call JEO solutions: Joint Estimation and Optimization. This requires re-thinking how you would model these problems.

Notice that a more statistical approach to solving Problem 3 would be to use semi-supervised learning, where the labels from Problem 2 are also used to generate data points and labels for Problem 3. So in practice it is possible to combine Problems 2 and 3 into a single estimation if you do not want to take the multi-armed bandit approach, but that would still leave you with exactly the same SEO-type overall solution. (A sketch of this pseudo-labeling step follows.)
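As a sketch of that combination (an illustration under assumptions, not the exact procedure): fit the Problem 2 model on items with observed outcomes, use its predictions as pseudo-labels for the unobserved items, and refit. The estimator choice and the names `X_observed`, `y_observed`, and `X_unobserved` are hypothetical:

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

def pseudo_label(X_observed: np.ndarray, y_observed: np.ndarray,
                 X_unobserved: np.ndarray) -> GradientBoostingRegressor:
    """One step of pseudo-labeling: predict labels for unobserved items,
    then refit on the combined (real + synthetic) data set."""
    model = GradientBoostingRegressor()
    model.fit(X_observed, y_observed)
    y_synthetic = model.predict(X_unobserved)   # synthetic labels
    X_all = np.vstack([X_observed, X_unobserved])
    y_all = np.concatenate([y_observed, y_synthetic])
    model.fit(X_all, y_all)                     # refit on everything
    return model

# Hypothetical usage with random features for 50 observed and 10 new items.
rng = np.random.default_rng(0)
model = pseudo_label(rng.normal(size=(50, 4)), rng.normal(size=50),
                     rng.normal(size=(10, 4)))
```

As discussed below, the synthetic labels carry larger errors than the real ones, which is exactly the cost that JEO tries to manage.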

To come up with a JEO solution, we do the following. We begin as before with a solution to Problem 2, using neural-network layers to estimate the revenue and customer satisfaction functions. Now we add a new layer into the neural network to perform the optimization part. Instead of an out-of-the-box activation function such as sigmoid or tanh, which just does a quick evaluation of its input, this next layer solves a constrained optimization problem. It provides estimates of which items a retailer should carry, given the estimates from the previous layers of the demand (revenue) and customer satisfaction functions. To be specific, the forward pass through the neural network finds exactly the same solution as the LP relaxation of the knapsack problem, and, thanks to the previous article, that solution has at most one fractional variable.

In parallel, another neural-network layer estimates functions that can provide synthetic data points and labels for observations that have not been made; in essence, this layer solves one step of semi-supervised learning for Problem 3. The loss function finally combines the outputs of the previous layers to estimate total revenue and to ensure that total customer satisfaction is above the threshold. If it turns out that customer satisfaction is below the threshold, we re-apply the forward pass and continue to do so until we find a solution that is both feasible and optimized at least for a single step. We then, as always, compute the gradient of the loss function: the change in revenue from carrying infinitesimally more (or less) of each item. These gradients are back-propagated in the backward pass to learn updated coefficients. Needless to say, the loss function and the functions used in the forward pass can be shown to be continuous and differentiable. (A minimal sketch of such an optimization layer follows.)
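Below is a minimal sketch of what an optimization layer of this flavor could look like, assuming PyTorch; it is an illustration only, not the actual implementation. The forward pass solves the LP relaxation of the knapsack greedily, and the loss back-propagates through the selected fractions into the upstream revenue estimates. Treating the greedy ordering as constant during backpropagation, and the specific budget and item values, are assumptions of this sketch:

```python
import torch
import torch.nn as nn

class KnapsackLPLayer(nn.Module):
    """Forward pass = greedy solution of the knapsack LP relaxation.

    Gradients flow through the revenue-times-fraction products in the loss;
    the greedy sort order itself is treated as a constant (a simplifying
    assumption of this sketch).
    """
    def __init__(self, budget: float):
        super().__init__()
        self.budget = budget

    def forward(self, revenue: torch.Tensor, risk: torch.Tensor) -> torch.Tensor:
        ratio = revenue / risk.clamp_min(1e-9)
        order = torch.argsort(ratio, descending=True)
        fractions = torch.zeros_like(revenue)
        remaining = torch.tensor(self.budget)
        for i in order:
            take = torch.clamp(remaining / risk[i], max=1.0)
            fractions[i] = take                  # at most one fractional item
            remaining = remaining - take * risk[i]
            if remaining <= 0:
                break
        return fractions

# Hypothetical usage: upstream layers would produce these estimates.
revenue = torch.tensor([5.0, 3.0, 8.0], requires_grad=True)
risk = torch.tensor([4.0, 2.0, 6.0])
fractions = KnapsackLPLayer(budget=7.0)(revenue, risk)
loss = -(fractions * revenue).sum()   # maximize estimated total revenue
loss.backward()                       # gradients reach the revenue estimates
print(revenue.grad)
```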

We can now carry out a second epoch through the neural network and continue running it to our stopping criterion. Now, why would someone want to re-architect a neural network to do this? The issue with standard neural networks is that simple functions can be defined and evaluated very well, but once you throw in constrained optimization, out-of-the-box neural networks are practically out the window. This is not a completely new observation; a paper by Amos and Kolter shows something similar. If you think the solution to Problem 2 is perfect, then JEO methods may not perform much better than SEO methods. However, semi-supervised learning is, needless to say, not magic. Adding extra data points and labels comes with a cost: while the hope is that the method converges to a true underlying distribution that may have been truncated (as is the case with us), the more labels that are added on the synthetic side, the larger the errors compared to supervised learning. These synthetic points can then hurt the distribution of the labelled points. The whole idea of semi-supervised learning is that overall it is more helpful than harmful, but JEO methods try to exploit more of the helpful side while sidestepping as much of the noise as possible.

The idea with JEO is that you only add synthetic points on a need-to-know basis. As the optimization explores areas of interest, you add synthetic points there and let the estimates converge to new distributions. If there are areas that do not need to be simulated, then you do not generate synthetic data there. The optimization therefore guides the parameter estimation, and, as always, the parameter estimation guides the optimization; hence: joint estimation and optimization.

This is joint work with Jeremy Karp and R. Ravi of Carnegie Mellon.

Hey Srinath! Are you describing an actor critic architecture, where the actor has e.g. an OptNet layer?
