The Case for Transparency: How to Open the Black Box of Financial Models

This is a typical formula you find in Excel models. Obviously, it’s not easy to find out what’s going on here, right?


But what is it that makes it so hard to decipher?

Well, there’s a couple of things.

Let’s start with the variables. Every formula contains variables which are connected by mathematical operations or logical functions. The problem here is that these variables are represented as coordinates on a grid—like “Ops!J13” for column J and row 13 on the sheet “Ops.” These coordinates have no meaning as long as you haven’t figured out what variable resides in that spreadsheet cell. To do so, you first need to find that place. (There are some tools in Excel that make it easier for you to get there, but as a matter of fact you first need to go there to see what kind of number that cell contains.)

Once you're there, you’re not yet done. The cell itself contains just a number or formula but no information about its nature. In a well-organized model like this one, which I “borrowed” from F1F9, you might find the name and meaning of the variable somewhere to its left, but that’s far from being a universal convention. In this case the variable is “Domestic charter landings.” Anyway, after finding out what the first of your variables means, you have to repeat this endeavor multiple times until you’ve deciphered each individual variable mentioned in your formula.

To keep track of your findings you either have to store the meaning of each cell address in your head, or you make notes somewhere, maybe on a sheet of paper like I did for myself.
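The note-taking step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the formula string is invented, every label except the one for Ops!J13 is hypothetical, and the regular expression handles only simple Sheet!A1 references.

```python
import re

# Hand-made "notes": cell address -> meaning.
# Only the Ops!J13 entry comes from the article; the others are hypothetical.
CELL_NOTES = {
    "Ops!J13": "Domestic_charter_landings",
    "Ops!J14": "Pax_per_domestic_charter_landing",  # hypothetical
    "Inputs!F20": "Tax_per_passenger",              # hypothetical
}

# Matches simple references such as Ops!J13:
# sheet name, "!", one to three column letters, row digits.
CELL_REF = re.compile(r"[A-Za-z_]\w*![A-Z]{1,3}\d+")


def annotate(formula: str, notes: dict[str, str]) -> str:
    """Replace every known cell reference with its name; leave unknown ones as-is."""
    return CELL_REF.sub(lambda m: notes.get(m.group(0), m.group(0)), formula)


# A hypothetical formula in the style of the one discussed above.
excel_formula = "=Ops!J13*Ops!J14*Inputs!F20+Ops!K99"
readable = annotate(excel_formula, CELL_NOTES)
print(readable)
```

References that are missing from the notes (Ops!K99 here) stay untouched, which mirrors the manual process: every address still has to be looked up once before it can be given a name.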

Admittedly, this is not quite an elegant solution, is it? Now, wouldn’t it be great if you could use the names of the variables right in your formula? With Analytica, you do exactly this. And here is what the exact same formula would look like if we simply replicated it:

It doesn’t quite cut through the complexity and reveal any structure yet, but at least you can tell the meaning of each variable when you read the formula. Let’s see if we can increase the transparency even more…

First, let’s introduce somethingthatmakesreadingeasieringeneral: spaces! In written text we quite naturally expect spaces between words, but that has not always been the case.

This is already an improvement but there’s still more we can do. We can use line breaks to organize and group parts of the formula that belong together. (By the way, this can also be done in Excel to make formulas easier to read.)

Obviously, the first section of the formula calculates the revenue from passenger landing tax by counting the passengers for the various Regions and Flight_types and multiplying them by the tax per passenger and an indexation factor. If so, why not separate this part of the calculation into a dedicated variable, say Pass_land_tax_rev1?

In Analytica this is extremely easy as we will see further down. But you can also break down complex formulas into a set of simple ones in Excel. (This is what best practice modeling standards like FAST or SMART suggest in order to enhance transparency of Excel models.)

We can basically replace every related part of the formula with a separate variable.

And, voilà!, what at first appeared as an impenetrable jungle of cryptic cell addresses now lies naked right in front of us: a fairly simple calculation previously disguised in complexity. As we can see now, there are three different types of revenue that are added up: Passenger landing tax (Pass_land_tax_rev1) as well as Domestic and International retail revenue (Domestic_retail_rev1 and Intern_retail_rev1). They are then multiplied by either 1 or 0, depending on whether the model period (quarters in this case) ends between the acquisition date and the last forecast date. This is to eliminate numbers outside the forecast period of the model.

So, maybe we should even make a separate variable for the Forecast_period1 as well as the Display_units, things that shouldn’t be hard-coded in formulas anyway...

With this, the formula is reeeaaaaly easy to understand even if we don’t know any number.
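To make the decomposition concrete, here is a minimal Python sketch of the same structure. All numbers are invented, quarters are represented as plain integers, the forecast window is assumed to be inclusive at both ends, and Display_units is assumed to be a simple divisor. None of these details are given in the original model.

```python
# Hypothetical inputs; only the structure follows the article.
acquisition_date = 2      # assumed: index of the quarter of acquisition
last_forecast_date = 5    # assumed: index of the last forecast quarter
display_units = 1000      # assumed: results are shown in thousands

period_end = [1, 2, 3, 4, 5, 6]  # one entry per model period (quarter)

# Intermediate variables, one value per model period (made-up numbers).
pass_land_tax_rev = [100_000, 110_000, 120_000, 130_000, 140_000, 150_000]
domestic_retail_rev = [50_000] * 6
intern_retail_rev = [30_000] * 6

# 1 inside the forecast window, 0 outside (inclusive bounds are an assumption).
forecast_period = [
    1 if acquisition_date <= q <= last_forecast_date else 0
    for q in period_end
]

# The cleaned-up formula: add the three revenue types, mask, scale.
revenues = [
    (tax + dom + intl) * flag / display_units
    for tax, dom, intl, flag in zip(
        pass_land_tax_rev, domestic_retail_rev, intern_retail_rev, forecast_period
    )
]

print(forecast_period)
print(revenues)
```

Each intermediate list can be printed and checked on its own, which is the practical payoff of splitting one long formula into several short ones.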

But where has all the complexity gone?!? Did we really reduce the complex relationships between the various variables we had before? Nope, unfortunately you can’t reduce complexity. (And, by the way, complexity is not equal to “complicated”—but that’s another story…)

But what we can do, and where Analytica can help to a great extent, is make complex models much easier to comprehend.

So, here is what that looks like:

This is a so-called Influence Diagram. It shows the relationships between variables of different types (these are the different shapes and colors). As you can see, Revenues, the variable whose formula we’ve just cleaned up, is the red-ish hexagon to the right. Arrows show that Revenues is influenced by (or: calculated from) various other variables. And you don’t just see the immediate “neighborhood” but also what’s behind its direct ancestors, and all the way down the rabbit hole…
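The idea behind such a diagram can be illustrated with a small dependency graph in Python. This is a sketch of the concept, not of Analytica itself: the graph covers only part of the model, and the names Tax_per_passenger and Indexation_factor are hypothetical labels for inputs the article mentions only in prose.

```python
# Each variable maps to the variables it is calculated from (its direct inputs).
# Partial and partly hypothetical: the real model contains more nodes.
INPUTS = {
    "Revenues": [
        "Pass_land_tax_rev1",
        "Domestic_retail_rev1",
        "Intern_retail_rev1",
        "Forecast_period1",
        "Display_units",
    ],
    "Pass_land_tax_rev1": ["Passengers", "Tax_per_passenger", "Indexation_factor"],
    "Passengers": ["Landings", "Pax_per_landing"],
}


def ancestors(node: str, inputs: dict[str, list[str]]) -> set[str]:
    """Return every variable that directly or indirectly feeds into `node`."""
    seen: set[str] = set()
    stack = list(inputs.get(node, []))
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        stack.extend(inputs.get(current, []))
    return seen


print(sorted(INPUTS["Revenues"]))             # the immediate "neighborhood"
print(sorted(ancestors("Revenues", INPUTS)))  # everything down the rabbit hole
```

The first print corresponds to the arrows pointing directly at Revenues; the second follows them all the way back, for example to Landings.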

Instead of several lines of Excel code you see the whole picture. I mean, a real picture that can be shown to and discussed with others instead of the one that's only in your head. This is how Analytica makes the actual model structure transparent—something that’s simply impossible with spreadsheets.

But it’s not just a picture. When you double-click on any of the objects in the diagram, you open a separate window where you can look at all the details behind it, like Units, Description, Definition (the formula), the Values (when you hit “Calc”), and even a list of all the Inputs.

Or, when you stay in the diagram window, you can select the object you’re interested in to reveal information in a panel at the bottom:

This way, you can easily navigate the model. And you can check every single intermediate result that feeds Revenues. This is hard to do when multiple calculation steps are jumbled into one cell or variable. It is almost impossible to track errors in the Excel formula we started with without purposely dissecting it. But then, why not lay out complex calculations as a series of simple ones in the first place?

Another benefit of an influence diagram is that you can easily see what the assumptions (inputs) or “drivers” of results are. A common notation in Analytica is to use green rectangles for this type of variable. As you can see, in our simple revenue model there are lots of inputs that drive the result. This kind of clarity would simply not be possible in an Excel spreadsheet.

But there is still more in an Analytica model. In Excel you have to put formulas into every cell for a variable that has one or more dimensions. In our example, the dimension is the time periods. Consequently, the Excel formula is repeated in each cell along the time axis. This makes it laborious and error-prone to write, update, or check formulas in large spreadsheets.

In Analytica, though, formulas apply not to “cells” but to entire tables—which are, by the way, not restricted to two dimensions. Thus, any object in a model can be a multidimensional table or “cube.” In our example, the number of passengers is broken down by Region and Flight type. That’s why, in Excel’s logic, we have to calculate the values for every instance separately:


This Analytica formula is deliberately clumsy in order to reflect how Excel handles multidimensional calculations. Obviously, there is one “line” in the spreadsheet for Landings where Region is “Domestic” and Flight type is “Charter” (Landings[ Region = ‘Domestic’, Flight_type = ‘Charter’ ]). And there is also a corresponding line for Passengers per landing. Multiplying these two gives the number of Passengers for “Domestic” and “Charter.” Then you have to repeat this for every other combination before you can calculate the sum for Passengers.
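The cell-by-cell style described above looks roughly like this when written out in Python for a single model period. The numbers and the “Scheduled” flight type are hypothetical; only “Domestic” and “Charter” come from the article.

```python
# One value per (Region, Flight_type) combination, for a single model period.
# All numbers are made up; "Scheduled" is a hypothetical flight type.
landings = {
    ("Domestic", "Charter"): 10,
    ("Domestic", "Scheduled"): 40,
    ("International", "Charter"): 5,
    ("International", "Scheduled"): 20,
}
pax_per_landing = {
    ("Domestic", "Charter"): 150,
    ("Domestic", "Scheduled"): 120,
    ("International", "Charter"): 200,
    ("International", "Scheduled"): 180,
}

# Spreadsheet logic: one explicit product per "line", then a sum over all lines.
pax_domestic_charter = (
    landings[("Domestic", "Charter")] * pax_per_landing[("Domestic", "Charter")]
)
pax_domestic_scheduled = (
    landings[("Domestic", "Scheduled")] * pax_per_landing[("Domestic", "Scheduled")]
)
pax_intern_charter = (
    landings[("International", "Charter")]
    * pax_per_landing[("International", "Charter")]
)
pax_intern_scheduled = (
    landings[("International", "Scheduled")]
    * pax_per_landing[("International", "Scheduled")]
)

passengers_total = (
    pax_domestic_charter
    + pax_domestic_scheduled
    + pax_intern_charter
    + pax_intern_scheduled
)
print(passengers_total)
```

Adding a third region or flight type means adding more hand-written lines, and the whole block has to be repeated for every model period.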

In Analytica, “Landings” and “Passengers per landing” are in fact 3-dimensional tables, indexed by Model_period, Region, and Flight_type:

Since Analytica is aware of the dimensionality of each variable, you—as the modeler—don’t need to take care of it and write over-complicated formulas. Instead, when multiplying Landings by Passengers per landing, the formula is as simple as: Landings * Pax_per_landing.

Notice the Inputs listed here are displayed with their dimensions—Model_period, Region, and Flight_type. The result for Passengers is therefore also indexed by these dimensions. But the formula (the “Definition”) doesn’t need to mention any dimension at all—which makes it much easier to write, verify, and debug formulas. This feature of Analytica is called “Intelligent Arrays,” and it is—together with the graphical depiction of models as diagrams—what makes the software so powerful, flexible, and transparent.
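The principle of letting the data carry its own dimensions can be imitated with a toy Python class. This is a conceptual sketch only and says nothing about how Analytica implements Intelligent Arrays: the Table class, the period labels Q1 and Q2, the “Scheduled” flight type, the tax table, and all numbers are assumptions made for illustration.

```python
from itertools import product


class Table:
    """A toy array with named indexes (not Analytica's implementation)."""

    def __init__(self, index_names, index_labels, values):
        self.index_names = tuple(index_names)   # e.g. ("Region", "Flight_type")
        self.index_labels = dict(index_labels)  # index name -> list of labels
        self.values = dict(values)              # tuple of labels -> number

    def __mul__(self, other):
        # The result carries the union of both operands' indexes.
        names = self.index_names + tuple(
            n for n in other.index_names if n not in self.index_names
        )
        labels = {**other.index_labels, **self.index_labels}
        result = {}
        for combo in product(*(labels[n] for n in names)):
            coord = dict(zip(names, combo))
            left = self.values[tuple(coord[n] for n in self.index_names)]
            right = other.values[tuple(coord[n] for n in other.index_names)]
            result[combo] = left * right
        return Table(names, labels, result)


names = ("Model_period", "Region", "Flight_type")
labels = {
    "Model_period": ["Q1", "Q2"],
    "Region": ["Domestic", "International"],
    "Flight_type": ["Charter", "Scheduled"],
}
coords = list(product(*(labels[n] for n in names)))  # 8 combinations

landings = Table(names, labels, dict(zip(coords, [10, 40, 5, 20, 12, 42, 6, 22])))
pax_per_landing = Table(
    names, labels, dict(zip(coords, [150, 120, 200, 180, 150, 120, 200, 180]))
)

# The "Definition" mentions no dimension at all.
passengers = landings * pax_per_landing

# A table with fewer indexes still combines correctly (hypothetical tax rates).
tax_per_passenger = Table(
    ("Region",),
    {"Region": labels["Region"]},
    {("Domestic",): 5, ("International",): 8},
)
tax_revenue = passengers * tax_per_passenger

print(passengers.values[("Q1", "Domestic", "Charter")])
print(tax_revenue.values[("Q1", "Domestic", "Charter")])
```

The formula `landings * pax_per_landing` stays the same no matter how many regions, flight types, or periods the tables contain; only the data changes.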

Now, back to where we started: What is it that makes it so hard to decipher Excel models?

  1. Excel formulas refer to cell addresses rather than named variables. To find out the meaning, you must hunt around the spreadsheet.
  2. The model structure is not visible. Sure, you can turn on dependency arrows using Excel’s formula auditing tools (Trace Precedents and Trace Dependents), but you never see the whole picture.
  3. Formulas deal with cells, not entire tables or multidimensional variables. You have to repeat formulas many times, which makes things unnecessarily complicated, hard to change, and error-prone.

In the end it all boils down to this: The interface of the spreadsheet is focused on displaying numbers, not showing the structure of the model. The structure is hidden in individual cells. Analytica literally breaks up the cell and lets you display the relationships between variables in a calculation using a graphical, interactive user interface. In doing so, it doesn’t reduce the complexity of a model but makes it more approachable and easier to understand.

Scott Evans

Finance & Technology Integration | Wealth Management | Unlock Tech-Driven Growth

9 years ago

This is a great insight into the power of Analytica... thanks for the effort to pull this together... :)

Torsten Röhner

Financial Modeling, Driver-based Planning, Decision & Risk Analysis

9 years ago

I just updated the post with some links to our new Analytica website https://beyondexcelmodeling.com

Patrick O'Beirne

Available for data analysis for decision insight, accelerate data processing, data quality, professional Excel development, VBA automation, Survey APIs, Power Query, testing spreadsheets, review & controls audit.

9 years ago

Thanks Torsten, it's a field that is hard to keep going in, but people still invent new languages. Is Quantrix Modeller still used? Paul's point is very relevant - having an appropriate domain language makes the models so much more expressive and readable than spreadsheet formulas.

Patrick O'Beirne

Available for data analysis for decision insight, accelerate data processing, data quality, professional Excel development, VBA automation, Survey APIs, Power Query, testing spreadsheets, review & controls audit.

9 years ago

It would be great if all models were sufficiently regular to allow that kind of restructuring :-) I remember the Spreadsheet Detective addin also created annotations which documented formulas using labels it found to the left or above cells. Again, that assumes that such labels exist. I'm not clear - does Analytica annotate Excel formulas like SD does, or do you have to rewrite and re-create everything, like the other modelling languages we used to know like Javelin, ModelMaster, Brixx, etc?

Paul Buckland

Partner, Deep Analytics Resources and Tools at The Lantau Group

9 years ago

Analytica is a joy to work with. We replaced a huge, intractable Excel model with it. The model went from countless Excel files to a single Analytica file, but the interesting part is that it allowed the model to become more complicated than before, since the structure was easier to understand due to the influence diagrams, and the code was easier than before due to the flexibility of the language and intelligent arrays. Highly recommended!
