Historical Data Tracking in PostgreSQL - Part 2: Trigger Functions
Jaime Martínez Verdú
Ingeniero y Analista de Datos | Experto en Transformación Digital y Estrategia de Datos
Introduction
In the first article of this series, we explored the concept of historical data tracking in #PostgreSQL and discussed the importance of maintaining historical records in certain scenarios. To achieve this functionality in PostgreSQL, where native support for #DataHistorification is absent, we rely on #TriggerFunctions. In this article, we will delve into the definition and implementation of trigger functions, which play a crucial role in managing historical data.
Defining Trigger Functions
Trigger functions in PostgreSQL are user-defined functions that are automatically executed in response to specified events, such as INSERT, UPDATE, or DELETE operations on a table. These functions allow us to capture these events and perform custom actions, making them a powerful tool for #DataTracking.
Let's take a look at trigger functions we need in our historification process:
stg.trg_fnc_products_delete()
This function captures DELETE operations on the 'stg.products' table and updates the corresponding historical record in 'stg.products_hist' by setting the 'end_date' to the current timestamp and marking it as 'deleted = true'. Here's the code breakdown:
CREATE OR REPLACE FUNCTION stg.trg_fnc_products_delete(
RETURNS TRIGGER
LANGUAGE plpgsql
AS $function$
BEGIN
UPDATE stg.products_hist SET end_date = NOW(), deleted = TRUE
WHERE id = OLD.id AND end_date IS NULL;
RETURN OLD;
END;
$function$;)
The 'UPDATE' statement within the function sets the 'end_date' column to the current timestamp using the 'NOW()' function, indicating the end of validity for the expired historical record. It also marks the record as deleted by setting 'deleted = true'. The condition 'WHERE id = old.id AND end_date IS NULL' ensures that only the active historical record associated with the deleted record is updated. The 'OLD' keyword refers to the deleted row from the 'stg.products' table.
stg.trg_fnc_products_insert()
This function captures INSERT operations on the 'products' table and copies the newly inserted row into the 'products_hist' table, thereby creating a historical record. The code is as follows:
领英推荐
CREATE OR REPLACE FUNCTION stg.trg_fnc_products_insert(
RETURNS TRIGGER
LANGUAGE plpgsql
AS $function$
BEGIN
INSERT INTO stg.products_hist SELECT NOW(), NULL, FALSE, NEW.*;
RETURN NEW;
END;
$function$;
The 'INSERT INTO' statement inserts a new row into the historical table. The 'SELECT NOW(), NULL, FALSE, NEW.' statement selects the values from the newly inserted row ('NEW.') and adds the current timestamp ('start_date = NOW()'), a NULL value for 'end_date', and FALSE for 'deleted' as additional columns for the insertion. The 'NEW' keyword refers to the inserted row from the 'products' table.
stg.trg_fnc_products_update()
This function handles UPDATE operations on the 'products' table by updating the 'end_date' of the previous version and inserting the updated row as a new version in the 'products_hist' table, ensuring historical tracking of data changes:
CREATE OR REPLACE FUNCTION stg.trg_fnc_products_update(
RETURNS TRIGGER
LANGUAGE plpgsql
AS $function$
BEGIN
UPDATE stg.products_hist SET end_date = NOW()
WHERE id = NEW.id AND end_date IS NULL;
INSERT INTO stg.products_hist SELECT NOW(), NULL, FALSE, NEW.*;
RETURN NEW;
END;
$function$;)
The function consists of two main parts:
General Recommendations for Creating Trigger Functions for Historical Data Management
Trigger functions in PostgreSQL are powerful tools that enable us to create historical records and maintain a trail of data changes. By defining appropriate trigger functions, we can effectively track and manage historical data in the absence of native support.
In the next article, we'll dive into the exciting world of ETL processes and how they seamlessly integrate with historified tables. By the end of the series, you will have a comprehensive understanding of historical data management and be equipped with the knowledge to implement it effectively in your own projects.
Stay curious and stay committed to optimizing your data management practices! We look forward to seeing you in the final chapter of our Historical Data Tracking series.