Fun with Graphing in Power BI - Part 3i
Update: For the next article in this series, click here .
Update: Download the PBIX File
At the suggestion of Imke Feldmann , who is an unmitigated M genius and you should most definitely read her blog , I am posting a link to download the PBIX file(s) used in this series. Here is the link to download Fractal.pbix .
Introduction
Perhaps you thought there were no more fun graphing topics with Power BI after Part 1 and Part SQRT(4) . Well, you would be wrong. You see, because they hate you, mathematicians treacherously strive to invent new ways to confound and befuddle actual human beings nearly every day. In 1975, one particularly vicious mathematician, Benoit B Mandelbrot used hundreds of years of research to create perhaps one of the most devious, spiteful and sadistic mathematical constructs...the fractal.
Incidentally, the B in Benoit B Mandelbrot stands for Benoit B Mandelbrot. That's fractal humor. You might get that joke by the end of this article.
Background
The fractal is a particularly cunning and malicious trick by mathematicians. The reason is that when mathematicians present the fractal, they generally present the deceptively simple equation from the canonical Mandelbrot Set:
This formula would then be followed by the fateful words "The answer is trivial and is left as an exercise for the reader". Many a trusting and innocent soul has been fooled into thinking "Ah, these fractal, how hard can they be?" Little do they know...
You see, the first thing about the above equation is that the letter c represents what is known as a complex number. Complex numbers have a "real" component and an "imaginary" component denoted by i. Truly, I am not making this up. I swear. So, a complex number takes the form of something like:
5 + 3i
So, a real five and an imaginary 3. Remember, imaginary numbers are a real thing...not like real, real because then they wouldn't be imaginary but rather, you know, real, so...you know... Incidentally, just to pile it on a little, zero is actually considered both a real and an imaginary number. Booooom! Mind blown!
The second thing that is often left out is that what you are computing essentially ends up being a color. You see, the complex number designates something that you can plot on the imaginary plane, but when we do this in the real world, we just use x and y. y becomes the number for the "imaginary" axis. So, we already know the coordinates we are plotting, to distinguish between different "types" of coordinates, we use color.
Finally, while we ultimately represent the result of the equation as a color, what we are really computing is the number of iterations or recursive passes performed before the specified coordinates on the imaginary plane "escape" the Mandelbrot Set. You see, when using the above formula to compute z, some coordinates tend to have z increase towards infinity as you iterate while others never do. Those that increase towards infinity "escape" the Mandelbrot Set while the others stay within the Mandelbrot Set.
See, all of this missing information essentially adds up to lying, which mathematicians are want to do. It's kind of like when you tell a girl that you drive a 'vette, but leave out the "Che" prefix and just let her assume the prefix was "Cor". Kind of like that. Not that I've ever done that. Well, OK, not that particular lie...I mean, I lie to women all the time...let's just be clear.
The Data Model
So anyway, did I just hear me correctly? Was I was talking about needing to iterate or do recursion in order to calculate the Mandelbrot Set? That can't be good. We all know how that crazy recursion thing worked out with the whole Runge-Kutta debacle ... I Guess DAX is out of the question. Thus, we will need to solve this in "M", the Power Query coding language.
But first, let's setup the model. And we will start really simple. Let's just imagine a tic tac toe board. Nine squares. We will assign our squares with real and imaginary coordinates this way:
x y -2 2 -2 0 -2 -2 0 2 0 0 0 -2 2 2 2 0 2 -2
I've pasted it this way for sake of scrolling. So, two columns x and y with those pairs of coordinates. You can do this in Excel and then just connect to that data in the Query Editor. You should now have a query that displays two columns, x and y and nine rows of data. Simple.
The Code
Alright, now what we need to do in Power Query is to create a new Blank query. Name this query "fnFractal2". My first attempt at this, fnFractal, was a miserable failure. Open up the Query Editor and paste in the following code:
let
???fnFractal2 = (x,y,limit,counter,colors,xorig,yorig) =>??????
???let
???????a2 = x*x,
???????b2 = y*y,
???????c = a2 + b2,
???????limiter = limit*limit,
???????z2r = x*x - y*y,
???????z2i = 2*x*y,
???????xNew = z2r + xorig,
???????yNew = z2i + yorig,
???????Return = if c >= limiter or counter >= colors then counter else fnFractal2(xNew,yNew,limit,counter+1,colors,xorig,yorig)
???in
???????Return
in
???fnFractal2
OK, that's a lot of code, and a lot of parameters! Now, I'm not going to go into exactly how the calculation works. If you are interested, here is a good article that explains how to do it by hand . What I am going to do is point out a few things. First, the "limit" parameter is there to help speed up the calculations and determine when a particular set of coordinates "escape" the Mandelbrot set. I won't go into it, but mathematicians tend to use 2. In other words, once the calculated z value exceeds or is equal to 2, or if you are computing x squared plus y squared, 2 squared, which is 4. Again, if you want the explanation, click the article link above.
So, once we have done all of our calculations, we check if the computed value is equal to or greater than our "limit" or whether we have exceeded what I have called the "color" threshold. That threshold is there to ensure that we don't do calculations to infinity. Remember, some coordinates never escape the Mandelbrot Set. So, if we didn't have that there, we would essentially enter an infinite loop. Not essentially actually, we would, in point of fact, enter an infinite loop. So, if either of those conditions are met, we exit the function. Otherwise, we recursively call our function and compute the next iteration.
Now, this function has a lot of parameters. So let's solve that. Create another Blank query. Name this one fnFractal2Init. Go into Advanced Editor and paste in the following code:
let
???fnFractal2Init = (x,y,limit,colors) =>
???let
领英推荐
???????Return = fnFractal2(x,y,limit,0,colors,x,y)
???in
???????Return
in
???fnFractal2Init
OK, four parameters are better than seven. Now we can finally use our new fractal formulating functions. In your little 9 row table, create a new column called "Color" like so:
=fnFractal2Init([x], [y], 2, 3)
You should get all zeros returned except for coordinate 0,0 which should be 3. Check and make sure that your new "Color" column is set to type of "Whole Number". Save and load the data from the Query Editor and in Power BI create a matrix visualization. Put x in the X axis and turn off summarization, y in the Y axis and turn off summarization and Color in Color saturation. Set your maximum value to that weird default salmon color and your minimum value to that wierd default cyan color and you should get something like this:
OK, I hear what you are saying, not at all impressive what-so-ever. Hey, what do you expect for nine measly data points? A ham sandwich?
The Solution
Fair enough though, what we need is more data. Let's take our tic tac toe grid up a notch. Let's do a grid with 81 cells in it that range from 2 to -2 in both x and y in increments of .5.
x y 0 0 0 0.5 0 1 0 1.5 0 2 0 -0.5 0 -1 0 -1.5 0 -2 0.5 0 0.5 0.5 0.5 1 0.5 1.5 0.5 2 0.5 -0.5 0.5 -1 0.5 -1.5 0.5 -2 1 0 1 0.5 1 1 1 1.5 1 2 1 -0.5 1 -1 1 -1.5 1 -2 1.5 0 1.5 0.5 1.5 1 1.5 1.5 1.5 2 1.5 -0.5 1.5 -1 1.5 -1.5 1.5 -2 2 0 2 0.5 2 1 2 1.5 2 2 2 -0.5 2 -1 2 -1.5 2 -2 -0.5 0 -0.5 0.5 -0.5 1 -0.5 1.5 -0.5 2 -0.5 -0.5 -0.5 -1 -0.5 -1.5 -0.5 -2 -1 0 -1 0.5 -1 1 -1 1.5 -1 2 -1 -0.5 -1 -1 -1 -1.5 -1 -2 -1.5 0 -1.5 0.5 -1.5 1 -1.5 1.5 -1.5 2 -1.5 -0.5 -1.5 -1 -1.5 -1.5 -1.5 -2 -2 0 -2 0.5 -2 1 -2 1.5 -2 2 -2 -0.5 -2 -1 -2 -1.5 -2 -2
Create the same column in this table, close and plot it. This time, set your data colors to divergent with your maximum that weird default salmon color, your minimum that weird default cyan color and your middle value that default bile looking yellow color. You will get something like this graphic if you squeeze it down really small:
OK, better, something is happening here. Now, go back to your Power Query table for this plot and add a second color column, Color2 with this slight variation:
=fnFractal2Init([x], [y], 2, 10)
Plot this one with the same settings and you should end up with something like this:
Disco! This is actually starting to look like the classic Mandelbrot Set, it at least has the same basic rough outline. Hmm, we need more data points. More specifically, we are going to need about 6,560 data points, give or take a few. I don't know about you, but I don't really want to type 6,560 pairs of coordinates into Excel. Luckily, you don't have to. Power Query actually allows us to generate all of these pairs automagically.
Create a Blank Query and paste in this little bad boy:
let
???Source = List.Generate(()=>0, each _ <= 6560, each _ + 1),
???#"Converted to Table" = Table.FromList(Source, Splitter.SplitByNothing(), null, null, ExtraValues.Error),
???#"Renamed Columns" = Table.RenameColumns(#"Converted to Table",{{"Column1", "Index"}}),
???#"Changed Type" = Table.TransformColumnTypes(#"Renamed Columns",{{"Index", Int64.Type}}),
???#"Added Custom" = Table.AddColumn(#"Changed Type", "x", each Number.IntegerDivide([Index],81)*.05 - 2),
???#"Added Custom1" = Table.AddColumn(#"Added Custom", "y", each if Number.Mod([Index],81) = 0 then -1 else -1 + .025 * Number.Mod([Index],81)),
???#"Invoked Custom Function" = Table.AddColumn(#"Added Custom1", "Color", each fnFractal2Init([x], [y], 2,270)),
???#"Filtered Rows" = Table.SelectRows(#"Invoked Custom Function", each true),
???#"Changed Type1" = Table.TransformColumnTypes(#"Filtered Rows",{{"x", type number}, {"y", type number}, {"Color", Int64.Type}}),
???#"Filtered Rows1" = Table.SelectRows(#"Changed Type1", each ([Color] > 10))
in
???#"Filtered Rows1"
This query uses some funky M code to generate a list of Index values, which we can then use along with some tricky math to auto-generate our x and y coordinates. You're welcome. Also, at the end, we filter out a bunch of our minimal values because we aren't going to plot those anyway because they are most definitely outside of the Mandelbrot Set. Add your custom Color column like this:
=fnFractal2Init([x], [y], 2,270)
Plot this one in the same way and you should end up with something like:
Which, dare I say, looks a hell of a lot like the classic Mandelbrot Set image. If you think that this image still looks blocky and coarse and unrefined, hold down the Alt key and then the Tab key on your keyboard and then look. Sweeeeeet.
Conclusion
Power BI can now add "fractal generator" to its feature list. I'm sure the product and marketing team will get right on that. While the result of this exercise is perhaps somewhat trivial in value, I think it demonstrates some advanced M coding techniques like custom functions, recursion and data/coordinate generation in a fun and interesting way. I hope you agree!
Oh, and one last thing. The Benoit B Mandelbrot joke. Images of the Mandelbrot Set exhibit an elaborate and infinitely complicated boundary that reveals progressively ever-finer recursive detail at increasing magnifications. In other words, if you "zoomed in" to an image of the Mandelbrot Set, you would see the same image above repeated over and over again, infinitely. Hence, the B in Benoit B Mandelbrot standing for Benoit B Mandelbrot repeats the same pattern endlessly...
Final Conclusion
Math jokes are lame.