Power BI Intermediate Level: 52 - Importing and Appending a Folder of Files

Power BI Intermediate Level: 52 - Importing and Appending a Folder of Files

Table of Contents | Power BI Report File | Sample Input

Short summary: Power BI allows you to import and append a whole folder of files, either stored locally or online with OneDrive, Teams, or SharePoint. You should create a query for a single file, convert that query to a function, and then use the Folder Connector and apply the function to each file in the folder, then append the tables. This works best if the files have the same structure of column names and data types.

Let's shift our focus back to Power Query and learn how to import a folder of files all at once and append the resulting data. This only works if the files have the same structure of column names, column data types, sheet names, etc.. If the structure is just slightly different, you can try to account for the differences with the function which we are going to use. For example, if the table starts at different row numbers depending on the file but is otherwise identical, you can try to dynamically remove the excess top rows, I will cover that in the advanced section of this article series.. If the files have totally different structures, this technique is not applicable and you should instead import them one by one.

For this article, let’s keep it basic and assume that all files have the same structure. The process which we will use goes as follows:

  1. Create a query for one file, which is representative of all files which you want to process
  2. Change the query into a function
  3. Use the File connector for local files or the SharePoint folder connector for files on OneDrive, Teams, or SharePoint
  4. Apply the function from step 2 to each file
  5. Append the processed data from each file

Let's walk through this step-by-step.

1. Create a Query for One file

Our sample data consists of three csv files. There is also a Readme.txt file which we want to ignore. Create a query to one of the csv files, which should be representative of all files which you want to process. In this case, we only do the automatic steps of loading the table, promoting headers, changing the column data types.

This is the query to one of the store files.

2. Change the Query into a Function

We haven't done coding in Power Query before, but we will do a very simple adjustment here. Don't worry, you don't need to understand everything here, I will explain more about M code in the advanced section of this article series.

For now, open the advanced editor.

Open the advanced editor for the query.

The M code looks like this:

The unprocessed M code.

Now we want to turn the query into a function and replace File.Contents("") with a function parameter. Let's make the following changes:

  • Add the following above the "let" statement: (file_contents_parameter) =>This means that our query become a function with an input parameter file_contents_parameter
  • Comment out the Source step using //
  • Copy and paste the Source step and replace File.Contents("") with our parameter

The M code now looks like this:

The M code with our changes.

When closing the advanced editor, you see that the query has turned into a function. We don't see individual steps anymore, it's just the function code.

After our code changes, the query has turned into a function.

So whenever you need to change or debug the function, temporarily change it back into a query by doing the following:

  • Change the function back into a query: Comment out the top row (file_contents_parameter) => and the new Source step. Uncomment the old Source step
  • Perform the changes or do the debugging
  • Change the query into a function again

It's a bit of a hassle but in the long-run it is the cleanest and easiest way.

3. Use the File Connector

Let's now do the import of a folder of files. We will first cover the case of local files. Note that online refresh won't work here unless you set up a Data Gateway on your local computer for Power BI online to access the files.

In the Power BI Home ribbon, go to Get data, and click More. Then select Folder and click Connect.

Under Get Data, use the folder connector.

Select the folder on your Computer and proceed. When you get to the screen below, click Transform Data so we can immediately add more steps.

Click Transform Data to add more steps.

What you see here are the files of the folder and all subfolders represented in a table. We first want to filter out the files which we don’t want to import. For example, we don’t want to import the Readme.txt file. In general, here is my recommended way to make sure that only the desired files get imported:

  • Transform the (file) Extension column to lower letters in case people mix capitalization as in “.csv” and “.CSV”. Go to the Transform ribbon, Format, and select lowercase.
  • Filter the Extension column to “.csv” or other desired filetype
  • Optional: Filter for or filter out specific folder paths if relevant to you
  • Optional: Filter for file Name patterns

4. Apply our Function to each File

That leaves us with a table representation of the files. The column Content contains the file content in binary form, meaning zeros and ones. You could click the double down arrow inside the Content column header for Power BI to attempt to create a function by itself to extract the content. However, in my experience, that usually ends in a big mess and is hard to maintain. Instead we will apply our function from step 3. Go to the Add Column ribbon and select Invoke Custom Function.

Click Invoke Custom Function.

Choose our function and select the column Content as for our parameter. This means that for every row, our function will be applied to the content of the corresponding file.

Set our function and use column

5. Append the Data

Under the Processed file column, you can click inside the whitespace in one of the cells to see a preview of the table data. Everything looks well. Now click the arrows icon to the right of the column header to expand and append the data.

Expand and append the data.

Now we have the data from all three files appended. And this could scale to thousands of files, if required. Also note that we can keep the metadata of the files. In this case, I kept the file name of the source data file as a column. This can be quite useful for knowing where the data came from.

This is the appended result.

Optional: Using the SharePoint Folder Connector in Step 3

Do you want to import a folder from OneDrive/Teams/SharePoint so you can refresh your Power BI report online? We established in an earlier article that both OneDrive and Teams are built on SharePoint. You can set file synching with OneDrive or Teams and your local computer and have Power BI import the files which get stored online. That way online refresh will work just fine.

Important: You can only connect to the top-level of the SharePoint documents, not directly to a specific file.

First, find out the top-level of your SharePoint link. One way to do that is to open one of the CSV files in Excel, going to File, Info, and clicking Copy path (not Copy local path). Copy the part before “/Documents”.

Get the SharePoint URL, for example using Excel's

For example, if Excel’s Copy path gives you: https://XYZ.sharepoint.com/personal/ABC/Documents/Store%202.xlsx?web=1

You should copy everything before /Documents: https://XYZ.sharepoint.com/personal/ABC/

In Power BI, go to Get Data, File, SharePoint folder, insert the URL, and click Connect.

Insert the URL and click

Log in using Organizational account if available, else using Microsoft account. Once connected, click Transform Data.

Then you will see all files on the SharePoint.

The connector will at first list all files on the SharePoint.

Click Transform Data. You will need to filter the table of files to just your files of interest. From there you can proceed as shown earlier in the article. So this was a longer one but it’s really worth it if you need it.

Conclusion

Power BI allows you to import and append data from a folder of files, either locally or using OneDrive, Teams, or SharePoint. This can scale to thousands of files. An important requirement is that the files have the same structures, including column names and data types. In my opinion, the cleanest and easiest way to process the files is to create a query for one representative file, apply your changes, and then change the query into a function. You can then use the function and apply it to each file in the folder. If you want to change or debug your function, you can temporarily change it back into a regular query.

Next up, I will show you a similar technique for importing multiple sheets from an Excel file at once.

Please like, share, and subscribe and feel free to ask questions in the comments below.

Next article: Importing and Appending Multiple Excel Workbook Sheet


要查看或添加评论,请登录

Richard Steinbiss的更多文章

社区洞察

其他会员也浏览了