Advanced editor and Power Query: keep your code clean!
Hello!
Hope you are doing well!
In today's article I would like to talk about documenting your work within Power BI and quality of life improvements. From my point of view common situation in IT projects is like:
"Documentation, everyone needs it, nobody likes to write it."
Subjects I'll write about:
Folders for your queries
As we all know, Power BI works better with star or snowflake schema. So we need facts and dimensions. Let's take a look at screen below with a couple of tables loaded in advanced editor:
Luckily for me, there is a prefix saying if it's fact or a dimension table.
What in case we would have 20-30 tables? Plus subtables for append/merge purpose. Then we would have quite a long list of queries!
Answer to this might be creating folders.
Simply, right click on query -> move to group -> new group:
Additionaly, within group itself you can drop a few words of description:
After few clicks we can have such view:
One downsize of this is that "other queries" group is always automatically created and you cannot remove it.
Of course, same as folders on desktop, you might have hierarchy of groups. So it is easier to manage. I can remove "fact"/"dim" prefix in case I don't want to keep it. With folders it's still easy to read and understand.
If you upload changes, within Power BI desktop "data" panel you will see just tables. So the grouping is good for load and transform work in back-end of the report.
Parameters
Parameters are specific objects within Power BI that can help you on Power Query level to automate work, for example:
To create parameter, simply right click on query pane (or the group where you want to store it). Based on info within folders for your queries, I already created separate group for parameters.
Once you click new parameter, then you have a lot of options to choose from:
Let's now look on our queries. Every query is a connection to excel table within the same folder, the same excel. So whole path is common.
As you can see, in excel connector (and many others) path is a text value. So our parameter "p_folder_path" also needs to be a text type. Here's the parameter I'll be using to automate queries:
Type is changed to "text" and current value is the excatly same path I was showing in above screenshot.
For each query, I can swap source text value with the parameter. So in case I would have 20 queries and suddenly source path is changing, I only need to update parameter. You can also change parameter value from Power BI Service API.
Common case is that usually you import data from multiple excel files. So it's just a matter of editing text within parameters and you can have something like:
In few words: you are joining different strings of text into one. Just remember to update "current value" within parameters accordingly. Make sure that parameters combined, create valid path to folder and excel file.
Commenting Power Query steps
I find ability to comment in any programming language very powerful. This is the place where we, simple developers can put business context. Why we did certain things in the code. Or why we used this specific workaround? Maybe there're some specific cases we need to take under consideration?
In M Query, if you put /* and then */ everything between these 2 marks is ommited when code is compiling. Alternatively you can use // for quick comments within code.
Example of query inroduction plus few comments - definitely would help in future developing or troubleshooting:
Another approach I like to take is to split query steps between data load and data transformations. So instead of this within query:
We can have something like that:
I believe it provides cleaner way to look at applied steps.
领英推荐
M Query itself looks like that:
You just need to remember about M Query rules. Usually each step refers to step above. But this is topic for the whole separate article.
Important: If you need query folding, then instead of doing #"===SET SOURCE===" = 1 you could rename the 1st step itself. So we can have clear information in applied steps and query folding at work:
Additional note: when I write M Query manually, I usually put comma at the start of line. This specific code was generated automatically, so comma is always at the end of line. The more M Query you will write on your own, the more differences you will start to notice.
Naming convention for Power Query steps
Last item I would like to share is to keep your applied steps clean. Based on this I cannot really tell what's going on. Unless I check details behind the steps.
For instance "filtered rows"? What is filtered? Where and why?
Instead we could write something like:
This way we have some understanding of what's going on. If we need more input, for instance we want to know excatly how the date column was filtered, we can add extra comment ("i" icon on the right). So with hover over we can check the note:
And this is how it looks in M Query:
As I mentioned, M Query doesn't like spaces. So if we are referring to above step, we need #"step name" syntax. Another solution might be to use underline sign:
This way, you can refer to certain step faster, if you are writing M Query on your own. Usually I also use prefix of "s_" so M Query intellisense gives me only list of steps to choose from, instead of all methods, functions and other fancy items.
Overall, this is how M Query code should look for our example of Sales table. Of course on top we have (lines 1-7) we have overall query documentation, but the whole window didn't fit in screen capture.
That's all for today!
Let me know if you have any questions or comments!
Hope you have a great week ahead!
Programmer / Report Writer / Advancement data professional at Stockton University
1 年Great post. I have done some of the things you talk about in the power query I create. Thanks to you I have a few more good tips. ?? Thanks for sharing.??