Computation - Mathematical Programming Model Builder
Minh Nguy?n
A combination of Human, Husband, Father, Son, Brother, Logistics Enthusiast, Supply Chain Problem Solver and Finance Operation Manager I Ex PwC I Ex UPS
Model Builder
The Mathematical Programming Model Builder add-in builds models for linear and integer programming. They can be solved by either the Excel Solver add-in or the Jensen LP/IP solver that is part of this collection. The add-in provides four different model formats that are described in the pages of this section. One of these formats is similar to the model created by the Linear/Integer command of the Math Programming add-in. The Math Programming add-in has so many features that it has grown very large in memory requirements. Since the MP Model Builder add-in is limited to only LP and IP forms, it is much smaller than the Math Programming add-in. It also has new features that may be useful.
Two of the new formats use lists rather than matrices to store the constraint coefficients. For cases with sparse coefficient matrices (many zeros and only a few non-zero entries), storing the list of nonzero coefficients is more efficient than storing the matrix. Another justification for the list model is the difficulty of constructing and correcting a matrix model that has more than just a few rows and columns. I envision creating an add-in that builds models for specific problem categories. This automatic model construction will be simpler with list formats.
When the MP Model Builder add-in installed, the new control commands are placed on the ORMM menu. Note that the LP/IP Solver is also installed. That program is used to solve the models for this section. The Excel Solver can also be used to obtain solutions.
The principal command on the menu is Linear/Integer. All models begin with this command as illustrated on the next page.
A model worksheet includes buttons that change the model or call the solution algorithms. An error dialog concerning missing links will be presented when a file is opened on a computer different than the computer that created the model. It is best to cancel this dialog. Then choose the Add Buttons command to replace all the buttons on math programming models in the workbook. Before saving a workbook that will be transferred to another computer, choose the Remove Buttons command. This removes all the buttons in the workbook so a link message does not appear when the model is opened on a different computer. The buttons can subsequently be added with the Add Buttons command. The demonstration workbook for this add-in has its buttons removed, so the first thing to do is to select the add buttons command.
To solve models, one of the solver add-ins must be installed. The MP Model Builder add-in builds models, but it does not solve them. The solver add-ins can solve models, but cannot build them. For these examples we have installed the LP/IP Solver add-in. Be sure to use the latest version of the add-in.
Formats
The general linear programming model has a linear objective function, linear constraints and lower and upper bounds for the variables. The model has n variables and m constraints. Since the expressions for the objective function and constraints are linear, the model is completely described by the coefficients of the objective, the coefficients of the constraints, bounds for constraints, and lower and upper bounds for the variables. The formats available with the MP model builder differ on how this information is stored.
This page shows the model formats available with this add-in. The add-in only creates math Programming models for LP and IP. The tableau format is similar to the LP model created by the Math Programming add-in. The others are new. The following pages expand on the different formats.
Building a Model
To create a new model choose the Linear/Integer command from the ORMM menu.
The menu has three additional commands. The worksheet holds buttons to call various add-in procedures. These buttons are linked to the computer that creates a model, and a link error is generated when the worksheet is opened on a different computer. Choose the Add Buttons command to erase the buttons and create new ones. When you expect the worksheet to be used on someone else computer, choose the Remove Buttons command before saving. The About Add-in gives the release date of the version of the add-in that you are using. Make sure you are using the latest release. The latest release date is on the Excel page for this add-in.
The Choose Format dialog lists the formats are available and provides a field for specifying the problem name. The name entry is used to provide Excel names to many ranges on the worksheet. The name must satisfy Excel's restrictions for naming ranges, that is: the names must not contain spaces, they must start with a letter and they may not include punctuation marks. If an error occurs when this dialog closes, try a different name for the problem. The program automatically suggests names like LP_1, LP_2 and so on. The user may prefer a more descriptive name. Once it is specified the name cannot be changed.
We choose the Tableau option with the name LP_1.
The LP/IP Model dialog appears after the Choose Format dialog is dismissed. The dialog accepts the dimensions of the model as well as data regarding the direction and measure for the objective, the integer variables, the Solver selection, and the choice for the Sensitivity Analysis when the problem is solved.
The Random Data button fills the model with random values. The Random Number Seed determines the values assigned, so the same problem can be assigned to different formats. The Mac and Windows versions of Excel give different models for the same seed.
The entry in the Coefficients section indicates the number of coefficients to be entered. For the tableau format, coefficients are specified for all rows and columns, so the number of coefficients is automatically specified as the product of the number of variables and the number of constraints.
The checkboxes for upper and lower limits specify whether vectors are provided to hold these limits. If unspecified, lower limits are zero and upper limits are large numbers. Checking the Variable Type box includes an array that specifies whether each variable is Real or Integer. If integer variables are included in the model definition, this array is included automatically. When checked the Dual Solution checkbox creates arrays for the dual solution in both the variables and constraints regions of the worksheet.
All of these entries except the name and the feature choices can be changed with the Change button after the worksheet is constructed.
In the examples of this section the figures show the dual variables associated with the solutions. These results are interesting from a theoretical viewpoint and often have relevance in practical problems. For those interested in only the solution of the linear or integer model, it is not necessary to view or understand the dual variables.
Parameters and Buttons
Dismiss the dialog with an OK indication and the program builds a worksheet holding one of the formats described on this page. The formats all place buttons and parameters at the top of the worksheet. None of the values in this section are under direct control of the user and they should not be changed. The upper left corner holds the model name prefixed with the label "List Model". The yellow range in column A shows data used by the Jensen LP/IP Solver when that solver is active. It can be changed by selecting the LP/IP Solver item on the ORMM menu. When the Excel Solver add-in is active, the Excel Solver model is stored in this range. The range in column F describes model parameters. Column I holds information controlling the solver. Column L has solver output results. Column O is constructed when the the dual solution is displayed on the worksheet.
Cell F4 is important because it holds the objective function for the model while F3 holds the direction of optimization (Max or Min). Some of these entries in columns F and I can be changed by clicking the Change button.
The Change button opens the model dialog to change the size and other features of the model. The Solve button calls the selected solver. The Equation button rewrites the equations of the model. This is generally unnecessary but can be used to fix errors. The Change Relation button changes constraint relations (<=, >=, =). It is only effective if the selected range is on one or more relation cells.
The Tableau Format
The formats differ on how the components of the model are arranged on the worksheet. For the tableau format the variable information is arranged in rows at the top of the worksheet. The constraint information is arranged by columns at the left side of the worksheet. The constraint coefficients are stored in a matrix lying below the variables and to the right of the constraints.
The variable data is arranged by row. Row 10 holds variable names and rows 12 through 15 hold the type, lower bound, upper bound and objective coefficient terms. Row 11 holds the primal solution to the problem and is filled by the Jensen LP/IP Solver add-in or the Excel Solver add-in. We indicate with the green outlines the cells that are filled in by the solver. The numbers in rows 16 and 17 are the dual variables for the lower and upper bounds. In LP theory discussions they are called the Reduced Costs for the variables. For maximization problem the reduced costs are negative for variables at their lower bounds and positive for variables at their upper bounds. The dual values are computed by the add-in and surrounded by red outlines.
The constraint information: name, dual values, value, relation and bound are described in columns E, F, G, H and I respectively. The dual variables for constraints are usually called the shadow prices. It is filled by the Jensen LP/IP Solver add-in. The constraint data is arranged by columns.
The constraint coefficients are stored in a matrix with rows representing constraints and columns representing variables. This is an efficient representation when the constraint matrix is dense (most coefficients are not equal to zero). In many practical instances, constraint matrices are sparse where most coefficients are zero. The sample problem is small, but practical problems may have many hundreds (or thousands) of variables and many hundreds of constraints. Although the solution of very large problems is beyond our add-ins, it is not unlikely that problems may have a few hundreds of variables and several dozens of constraints. Then the matrix format is very inconvenient. The matrix is a vast expanse on the worksheet and the user may have a tough time entering and checking the data for even moderately sized problems.
Older versions of Excel are limited to 256 columns. Since the variables are arranged by columns, this automatically limits the model size to about 250 variables. This limitation is not so great for Excel 2007 and later versions because these allow many more columns.
The Dual Tableau Format
The dual tableau format is the transpose of the tableau format. Here constraints have the row orientation while variables have the column orientation. The constraint coefficient matrix as also transposed. For most problems, the number of variables is much greater than the number of constraints. When models are limited to 256 columns, the dual tableau representation is preferred for problems with more than 250 variables.
It is important to observe that this format represents the primal LP model and not the dual LP model. It is simply a reorganization of the primal data.
The Column List Format
The list format replaces the constraint coefficient matrix with a list. This format has the variables and constraints in arranged in columns. The constraint coefficients are stored in columns U through AA. Each coefficient has: index, column number, row number, name, coefficient value, the value of the variable associated with the column index, and the product of the coefficient variable and the coefficient value.
Several columns are yellow indicating that they contain formulas. The coefficient name, column X, is computed using a formula that combines column and row names. The coefficient variable, column Z, is computed by referencing the appropriate cell in the solution, column J. The coefficient value, column AA, is the product of the values in column Y and column Z. The numbers in the constraint value, column Q, are computed by summing the appropriate entries in column AA. The Excel function SUMIF is used for this calculation.
The advantage of this format is that it uses a fixed number of columns, independent of the problem size. The problem size is reflected in the number of rows used. The number of rows is very large for the current versions of Excel (more than 1,000,000).
The list format is much more efficient than the matrix for storing sparse constraint sets. It is unnecessary to store 0's in the list. This example doesn't illustrate this efficiency since the number of coefficients is the product of the numbers of variables and constraints. I expect that programs that automatically generate the data for the model will take advantage of the list structure. I plan to create an add-in that will automatically generate LP data in a manner to the DP Data add-in.
The current version of the LP/IP solver requires that the coefficient list be sorted by column order. The sorting is automatically accomplished when the Solve button is pressed.
The Row List Format
The row list format is the transpose of the column list format. All data is described by rows. The figure below shows part of the coefficient list since the full list of 50 columns would be difficult to display. For some problems the row list format may be convenient. The number of columns allowed by Excel may limit the use of this format since the number of nonzero coefficients will always exceed the number of constraints.
Tableau Format
Textbooks often use the tableau presentation because it is relatively compact. This format is illustrated below.
In the following we review each component of the tableau format.
Variables
The data and results for the variables appears at the top of the worksheet in a row format. We call it the row format because each row provides a vector of values of a particular type. For example row 10 holds the names of the variables and row 11 holds the values of the variables. The solution shown is the optimal primal solution.
The variables section provides all the data for each variable except the constraint coefficients. A variable has all its data in a column of the worksheet. For example column M holds the information for the first variable. Row 11 is outlined in green to indicate that the algorithms of the add-in fill this array. The names in row 10 and the data in rows 12 through 15 are filled by the user to represent the situation under consideration.
The Type data has two recognized values, Real and Integer. When all types are Real, the model is a linear program. If all the types are Integer, the model is an integer program. Problems with some real and some integer types are often called mixed linear integer programs. Only the first letter is used to identify the type, so i and r can replace integer and real.
Rows 16 and 17 are also filled by the computer. They show the dual solution associated with the variables. These numbers are sometimes called the reduced costs. The dual variables are included only when the Jensen LP/IP solver is used. The same results are found on the sensitivity worksheet created by either the Jensen LP/IP solver and the Excel solver. At optimality, the dual values for variables with a restrictive lower bound are negative (or zero). Variables with a restricted upper bound are positive (or zero).
Constraints
Information regarding the constraints start at row 22 and column D. We say the constraint data is column oriented because the data and results are shown by the columns of the display. The user can identify particular constraints with names in column E. The dual values (or shadow prices) are in column F. The left side of the constraints are evaluated using the solution variables and constraint coefficients and presented in column G. This range is yellow indicating that the cells hold formulas. The relationships (<=, =, >=) are in column H and the constraint bounds are in column I.
Coefficient Matrix
For the tableau format, the constraint coefficients are stored in a matrix. The figure below shows the constraint matrix for a randomly generated problem. The two dimensional matrix is convenient for problems small enough to display the entire matrix in a worksheet window. It becomes difficult to use when the problem has more than 50 variables and 20 constraints. For larger problems only a portion of the matrix appears in the worksheet window and it is sometimes difficult to find individual cells.
The matrix is not convenient if the number of nonzero coefficients is small compared to the number of cells in the matrix, n*m. This is called a sparse matrix. For older Excel versions, the number of columns on the worksheet, 256, restrict the size of the problem to around 220 variables. Excel 2010 for Windows and Excel 2011for Mac OS allow many more columns on the worksheet.
Dual Tableau Format
The dual tableau format is the transpose of the tableau format. It is useful because the number of constraints is usually much fewer than the number of variables. This is convenient for earlier versions of Excel where the number of columns is limiting. For later versions of Excel this is not really a problem because the available number of columns is very large.
Variables
Variable data is arranged in columns for this format. The number of rows is equal to the number of variables. The primal solution of the optimization problem is in column F and the dual solution is in columns K and L. The other columns hold the parameters.
Constraints
Constraint information is presented by rows.
Coefficient Matrix
The constraint coefficient matrix is stored under the constraints and to the right of the variables. The matrix is the transpose of the similar matrix for the tableau format.
Column List Format
The column list format shows all the model components as lists. For this format the lists are arranged as columns of worksheet cells. Rather than using a matrix, the column list for the constraint coefficients stores each coefficient with a row of a table. The advantage of this list format is that zero value coefficients need not be included in the list. The much reduced figure below illustrates the arrangements of the components on the worksheet. All components begin at the same row. Green outlined cells hold the primal solution of the model and are filled by add-in programs. Red outlined cells hold the dual solution of the model. Yellow cells hold formulas provided by the program. These should not be changed.
Variables
The variable columns start in column D. Since this arrangement is described earlier it is not repeated here.
Constraints
The constraint columns start in column N.
Coefficient List
The entries in the coefficient list refer to cells in the coefficient matrix of the tableau representation. For convenience that matrix is repeated below.
The coefficients are stored in a table starting in column U and continuing until column AA. The coefficient list for the example is below. Each coefficient occupies a row of the list. Columns V and W refer to the coefficient matrix of the tableau representation. Column V holds the index of a variable (column), and column W holds the index of a constraint (row). Together they define a cell of the coefficient matrix. The first column entry (row 13) indicates that the coefficient for this row is the coefficient for variable 1 (X_01) in constraint 1 (A_1). The names of the variable and constraint are concatenated in column X (X_01-A_1) to help identify the combination. The value (10) of the coefficient is in column Y. This particular model has 50 constraint coefficients, so the list has 50 entries. Reading down the table, you can see that the first column of the coefficient matrix is the first five entries in the table. During the solution process this list is always sorted with column V the primary sorting index and column W the secondary sorting index.
The last two columns, Z and AA, are computed with formulas. Column Z uses an INDEX function to recover the value of the variable for each row of the table. Since the first five entries are for X_01, column Z reports 0 for the value of that variable. This is obtained from column D of the variable table shown at the top of this page. Column AA holds the product of column Y and column Z. For each row, this is the contribution of the term to the constraint equation.
The constraint value shown as column Q in the constraint display is the sum of all cells with a specified constraint index. For example the value of P13 is the sum of cells AA13, AA18, AA23, and so on through the coefficient list. Column Q uses the SUMIF Excel function to compute the constraint values.
The constraint list display is certainly not the most efficient for a dense coefficient matrix as the one used for this random example. Its benefit is realized for sparse coefficient matrices where the number of nonzero coefficients is much fewer than the number of cells of the matrix.
This list will also be useful for automatically generated models that use computer programs to specify the coefficients of the model. I hope to create an add-in that will automatically generate large linear/integer programs much like the dynamic programming model generator described elsewhere on this site.
Row List Format
The row list format has the same components as the column list format except information is stored in rows rather than columns. The example is shown below. It is not clear that this format has value not provided by the column list format, however, it does seem to be easier to review problems in the row format than in the column format.
I provide the row list format to complete the collection of possible formats.
Transfer/Change
The transfer feature changes a model with one format into another. When you select the Linear/Integer Model command when a worksheet holding an LP/IP model is active, the Transfer Data checkbox is enabled on the format dialog. Click that box and choose one of the format options. Then the data from the active worksheet is transferred to a new model with the requested format.
The example shows model LP_1 active. It has the Tableau format. Choosing the Linear/Integer Model command and clicking the tranfer button will create a new model, LP_5, with the Dual Tableau format. The data for LP_5 will be transferred from the LP_1 model.
Change
Clicking on the Change button presents the model dialog. The features of the model that cannot change are grayed. For the example the modification prescribes one additional variable and one additional constraint. We also require a mixed model with the first five variables typed as integer.
The changed model is below. New variables are added to the right of the variable list, and new constraints are added to the bottom of the constraint list. The constraint matrix is expanded by one row and column. The problem has been solved with Excel Solver. The new features are not effective since the default coefficients are zero.
Source: Paul A. Jensen, Texas University