Outputting Tables via Outlook in Microsoft Fabric Pipelines: A Step-by-Step Guide
INFuse Data Solutions
Data Analytics & Migration Specialists. Trusted data partner for leading UK organisations.
Pipeline runs often need to provide some form of tabular feedback, that could, for example be in an ETL run where error handling requires human input, or where summary data needs to be delivered on to a sales director at the end of every month. As much as I enjoy Microsoft Fabric, I was hoping this was going to be as easy as adding a script activity output to an Outlook activity in Data Factory but after an hour or two of messing around with pipeline expressions (which aren’t my friend) it was clear another solution was required. The good news is that there is a range of solutions so you’re likely to find something that works, and we’ll cover all of those later in the post, but the one that was suitable in my use case and today’s focus is using a notebook to query a table, wrap that in HTML, output that on notebook exit, and pop it in the body of an email.
In this blog post, we'll walk through a script that captures validation errors from a data pipeline and presents them in a neatly formatted HTML table. If you don’t fancy reading through every step today, then the good news is that you can copy and reuse the code here by just changing the SQL query to your own.
Overview
The overall idea is that we will have a pipeline which runs a notebook. When that notebook runs the Lakehouse is queried and a DataFrame made. In the notebook we then use a python library to wrap that table into HTML which is then passed to an Outlook activity on exit. We have all sorts of options for display customisation, so it makes for a flexible approach.
In the notebook we have one cell which does the following:
1. Query the error details from a database using Spark SQL
2. Convert the query results into a Pandas DataFrame
3. Transform the DataFrame into an HTML table with custom styling
4. Output the HTML content using the notebook exit function
Let's dive into each step to understand how it works.
Prerequisites
Before you begin, make sure you have:
- Basic understanding of HTML & Python and SQL.
- Familiar with Fabric Notebooks
- It’s helpful to have an LLM like Copilot on hand to help quickly adjust your html or python script. Honestly, it’s okay to do this rather than spend 8 hours relearning HTML, use the advanced tools available to you.
Okay let’s dig into the actual steps on how to do this:
?Step 1: Import Necessary Libraries
import pandas as pd
We start by importing the Pandas library for data
Step 2: Define the SQL Query
query = """
SELECT
e.RunNumber,
e.FileName,
DATE_TRUNC('second', e.ErrorTimestamp) AS ErrorTimestamp,
e.ErrorType,
e.ErrorMessage
FROM
SilverLakehouse.error_event_detail e
WHERE
e.RunNumber IN (
SELECT RunNumber
FROM SilverLakehouse.load_audit_history
WHERE BatchID = (SELECT MAX(BatchID) FROM SilverLakehouse.load_audit_history)
)
ORDER BY
e.RunNumber DESC
"""
This particular SQL query we’ve mocked up retrieves error details from the SilverLakehouse.error_event_detail table, filtering for errors that occurred in the latest batch run for an ETL pipeline. You can replace this with your own query.
Step 3: Execute the Query and Convert to Pandas DataFrame
df = spark.sql(query).toPandas()
We execute the SQL query using Spark's sql method and convert the resulting Spark DataFrame to a Pandas DataFrame using toPandas(). This conversion allows us to leverage Pandas' data manipulation capabilities.
Step 4: Convert the DataFrame to an HTML Table
html_table = df.to_html(index=False, border=1, classes='dataframe')
We use Pandas' to_html() method to convert the DataFrame into an HTML table. The parameters are:
- index=False: Excludes the DataFrame index from the table.
- border=1: Adds a border around the table cells.
- classes='dataframe': Assigns a CSS class to the table for styling purposes.
Step 5: Clean Up the HTML Table
领英推荐
html_table = html_table.replace("
", "").replace(" ", "")
We remove unnecessary newline characters and extra spaces from the HTML string to ensure the HTML content is clean and compact. I had to play around with this a bit until things displayed correctly.
Step 6: Define the HTML Template with Custom Styling
html_template = f"""
<html>
<head>
<style>
body {{
font-family: Arial, sans-serif;
}}
table {{
border-collapse: collapse;
width: 100%;
font-size: 14px;
}}
th {{
border: 1px solid #dddddd;
background-color: #f2f2f2;
font-weight: bold;
padding: 8px;
text-align: left;
font-size: 16px;
}}
td {{
border: 1px solid #dddddd;
padding: 8px;
text-align: left;
font-size: 13px;
}}
h2 {{
font-size: 18px;
}}
</style>
</head>
<body>
<h2> Errors in Latest Data Load</h2>
{html_table}
</body>
</html>
"""
Here, we create a multi-line string (`html_template`) that includes:
- HTML Structure - Basic HTML tags like <html>, <head>, and <body>.
- CSS Styling - Custom styles to enhance the appearance of the table and text.
- Table Insertion - We embed the html_table string into the template using an f-string. An f-string is a way of inserting a string into something else which might change e.g. My name is X I am X years old.
Custom Styling Explained
Fonts: We set the font family to Arial.
Table: The border-collapse property ensures borders are shared between cells, and width: 100% makes the table span the full width of the container.
Headers (`th`): We apply a background color, padding, and larger font size for headers.
Data Cells (`td`): Similar styling with slightly smaller font size.
Heading (`h2`): Styles the main heading of the report.
Step 7: Clean Up the Final HTML Template
html_template_clean = html_template.replace("
", "").replace("
", "")
We remove any remaining unwanted characters from the HTML template to ensure it's a single continuous string.
Step 8: Output the HTML Using the Notebook Exit Function
from notebookutils import mssparkutils
mssparkutils.notebook.exit(html_template_clean
We import mssparkutils from notebookutils and use the notebook.exit() function to output the HTML content. This function terminates the notebook execution and returns the HTML string.
How It Fits into the Data Pipeline
With the output complete we can then use it in an email just by adding an outlook activity and using the notebook exit content in the body. Give it a few tests to make sure it works well for you but all being well this should be a working solution now!
Alternative Methods to Deliver Tabular Data via Email
If you’re just here for the script, then we’re done and good luck, but if you’re interested in how else this could be achieved then keep on reading.
So, beyond this solution there were quite a few others which could, to a greater or less degree, have achieved the objective of passing data to the users for error handling. Below, we'll explore these alternatives, their advantages, and limitations. We'll also discuss though why an email with a table is often the best solution.
I think it’s important to leave you with a real understanding of what this solution is compared to the alternatives so for each one I’ve detailed a few steps on how they can be implemented.
So why is an Email with a Table the Best Pipeline-Driven Solution?
After looking through the alternatives I’m still happy with how I chose to implement this in my last project, but it’s all down to context. I was largely stuck within Fabric, with no access to Azure, or Power Automate, and with a clear objective of retrieving tabular data via email.
That said I think there are a few reasons why this solution stands out:
Conclusion
I hope this post has been helpful, it’s been a big one to put together but also a lot of fun learning and testing a new solution which I hadn’t seen before. Feel free to take and reuse this code to deliver some data to whoever needs it, or to follow one of the other routes. Whichever way you choose feel free to reach out to us if you have any questions. Even better, if you have any tips to improve this then I would love to hear them!