登录查看更多内容

Historical Data Tracking in PostgreSQL - Part 1: Historical Table and Triggers

Jaime Martínez Verdú

Ingeniero y Analista de Datos | Experto en Transformación Digital y Estrategia de Datos

发布日期: 2023年7月30日

Introduction to Historical Data Tracking in PostgreSQL

Historical data tracking, also known as #historification, plays a crucial role in maintaining a comprehensive record of changes made to data over time. This process allows us to access previous versions of data and analyze its evolution, providing valuable insights for various business scenarios. For instance, consider a scenario where a company needs to monitor how product characteristics, such as quantity, category distribution, average price, and inventory availability, have evolved over time.

While some relational database systems offer built-in support for historification, #PostgreSQL does not natively include such a feature. However, PostgreSQL's versatility and support for stored procedures enable us to implement a custom historical data tracking process using triggers and functions.

There may be other alternatives or approaches to achieving historification in PostgreSQL, such as using extensions or external tools, but these aspects will not be mentioned here.

Defining the Historical Table

In PostgreSQL, we create a historical table to store the historical versions of our original table. The historical table's name is the same as the corresponding table, with the addition of the "_hist" suffix. For instance, if our original table is called 'products' and is included in the 'stg' schema, the historical table will be named 'stg.products_hist'.

The structure of the historical table closely mirrors that of the original table, with one crucial difference: it contains three additional columns to capture historical data for each record:

start_date: This column records the date when a record becomes active.
end_date: This column stores the date when a record ceases to be active.
deleted: This column helps identify whether a record has been deleted.

Pro Tip: To simplify maintenance and avoid modifications in multiple parts of the code when adding new fields, it is advisable to define these three additional columns as the first fields of the historical table.

领英推荐

Upgrading ReportPortal with backup/restore of Postgres…

Gaurav Singh 3 个月前

From Planning to Performance: Your Ultimate Guide to…

CST - Cyber Sapient 1 个月前

High-Performance PostgreSQL: A Dive Into the Internals

Tamer Khraisha (Ph.D) 2 年前

Defining Triggers

#Triggers in PostgreSQL allow us to define automatic actions in response to specific events occurring on a table. They are powerful tools for enforcing business rules, maintaining data integrity, or triggering additional tasks whenever certain operations are performed on the associated table.

In our context, triggers are employed to automatically capture and record changes made to data in the table, facilitating the creation of version history. We have defined three triggers associated with the 'products' table in the 'stg' schema:

CREATE TRIGGER trg_products_insert
AFTER INSERT ON stg.products?
FOR EACH ROW?
EXECUTE FUNCTION dwh.trg_fnc_products_insert();

CREATE TRIGGER trg_products_update?
AFTER UPDATE ON stg.products?
FOR EACH ROW?
EXECUTE FUNCTION dwh.trg_fnc_products_update();

CREATE TRIGGER trg_products_delete?
BEFORE DELETE ON stg.products?
FOR EACH ROW?
EXECUTE FUNCTION dwh.trg_fnc_products_delete();

Each set of triggers follows a similar structure, explained below:

Each trigger is associated with a table, in this case, 'stg.products'.
The trigger is activated after an INSERT, UPDATE, or DELETE operation on the table, depending on its purpose.
For each inserted, updated, or deleted row, the corresponding trigger function is executed: 'stg.trg_fnc_products_insert()', 'stg.trg_fnc_products_update()', and 'stg.trg_fnc_products_delete()'.
The trigger's purpose is to perform certain actions or validations either before or after a new row is inserted, updated, or deleted in the table.

The triggers defined are responsible for maintaining the historical record using the mentioned trigger functions whose operation is summarized below:

When an insertion occurs, the corresponding trigger captures the new data and inserts it into the historical table. It sets the 'start_date = now()' for the record (indicating when it became active) and sets the 'end_date' to NULL as the record is still active.
Similarly, when an update is made to any record, the corresponding trigger marks the previous version as completed in the historical table by assigning 'end_date = now()' which indicates that the old record has expired. The trigger then inserts a new entry with the updated data, following the same steps as explained above.
In the case of a deletion in the table, the associated trigger logs this action in the historical table by setting the 'deleted' value to true and assigning an 'end_date = now()' to the corresponding record. This indicates that the record has expired.

Stay tuned for the next article where we will delve deeper into the trigger functions 'stg.trg_fnc_products_insert()', 'stg.trg_fnc_products_update()', and 'stg.trg_fnc_products_delete()', providing an in-depth explanation of each.

Keep learning and exploring the fascinating world of historical data tracking in PostgreSQL! #DataTracking #Historification #PostgreSQLTriggers #PostgreSQL #HistoricalData #DataManagement #Triggers

要查看或添加评论，请登录

Jaime Martínez Verdú的更多文章

Ensuring Data Integrity: Query Modification and Verification Strategies

2023年12月16日

Ensuring Data Integrity: Query Modification and Verification Strategies

As data analysts and engineers working extensively with #PostgreSQL, the process of modifying existing queries within a…
Dynamic Code vs. Hardcoding SQL

2023年10月8日

Dynamic Code vs. Hardcoding SQL

In the world of database management, #PostgreSQL stands out as a powerful open-source relational database system…
Historical Data Tracking in PostgreSQL - Part 3: ETL Process Adjustment

2023年9月24日

Historical Data Tracking in PostgreSQL - Part 3: ETL Process Adjustment

Introduction In the first two parts of this series, we explored how to set up historical data tracking in #PostgreSQL…

1 条评论
Historical Data Tracking in PostgreSQL - Part 2: Trigger Functions

2023年8月4日

Historical Data Tracking in PostgreSQL - Part 2: Trigger Functions

Introduction In the first article of this series, we explored the concept of historical data tracking in #PostgreSQL…
Efficient Querying of Historical Data in a Database

2023年7月15日

Efficient Querying of Historical Data in a Database

Introduction In this article, we will explore the process of querying historical data stored in a database table in…

1 条评论
Leveraging LAG() for Accurate Click-Through Rate Analysis

2023年6月12日

Leveraging LAG() for Accurate Click-Through Rate Analysis

In today's data-driven world, organizations are increasingly focusing on leveraging data analysis to drive positive…
The story of how I optimized a SQL code to improve our data analysis in my company

2023年5月14日

The story of how I optimized a SQL code to improve our data analysis in my company

As a member of the data analytics team at ClimateTrade?, I recently encountered a challenge: a view that was being used…

3 条评论
TRABAJO DE AUDITORíA A MECANIZADOS DEL VINALOPó S.L. Y FORMULACIóN DE PROPUESTAS DE MEJORA

2016年5月7日

TRABAJO DE AUDITORíA A MECANIZADOS DEL VINALOPó S.L. Y FORMULACIóN DE PROPUESTAS DE MEJORA

Os presento el estudio de consultoría / auditoría empresarial que realicé a mi empresa hace un par de a?os como Trabajo…

3 条评论
Toda historia tiene su final, en este caso, feliz. Pero antes,...

2014年9月19日

Toda historia tiene su final, en este caso, feliz. Pero antes,...

Me gustaría agradecer la dedicación de todos y cada uno de los profesores del ?#?MBAUA? pues me han ense?ado una forma…

1 条评论
KeraWear

2014年6月17日

KeraWear

Hoy me he llegado un gran sorpresa cuando me he enterado de que KeraWear ha sido el vídeo finalista del concurso…

See all articles

Historical Data Tracking in PostgreSQL - Part 1: Historical Table and Triggers

Jaime Martínez Verdú

Ingeniero y Analista de Datos | Experto en Transformación Digital y Estrategia de Datos

Introduction to Historical Data Tracking in PostgreSQL

Defining the Historical Table

领英推荐

Defining Triggers

Jaime Martínez Verdú的更多文章

社区洞察

其他会员也浏览了

A Step-by-Step Guide to Installing Trino for Data Migration

Step-by-Step Guide to Setting Up a PostgreSQL Database and User for Application Development

Postgres for Everything

Over-indexing

Working with JSON data in PostgreSQL

Mastering PostgreSQL Configuration Settings: A Comprehensive Guide

Zerodha uses PostgreSQL

BYTEA vs Large Objects (LOBs) in PostgreSQL

Can eBPF Provide Real-Time PostgreSQL Insights Without Degrading Performance?

Mastering Row Locks in PostgreSQL: Ensuring Data Integrity and Performance

Introduction to Historical Data Tracking in PostgreSQL

Defining the Historical Table

领英推荐

Defining Triggers

Jaime Martínez Verdú的更多文章

Ensuring Data Integrity: Query Modification and Verification Strategies

Dynamic Code vs. Hardcoding SQL

Historical Data Tracking in PostgreSQL - Part 3: ETL Process Adjustment

Historical Data Tracking in PostgreSQL - Part 2: Trigger Functions

Efficient Querying of Historical Data in a Database

Leveraging LAG() for Accurate Click-Through Rate Analysis

The story of how I optimized a SQL code to improve our data analysis in my company

TRABAJO DE AUDITORíA A MECANIZADOS DEL VINALOPó S.L. Y FORMULACIóN DE PROPUESTAS DE MEJORA

Toda historia tiene su final, en este caso, feliz. Pero antes,...

KeraWear

社区洞察

其他会员也浏览了

A Step-by-Step Guide to Installing Trino for Data Migration

Step-by-Step Guide to Setting Up a PostgreSQL Database and User for Application Development

Postgres for Everything

Over-indexing

Working with JSON data in PostgreSQL

Mastering PostgreSQL Configuration Settings: A Comprehensive Guide

Zerodha uses PostgreSQL

BYTEA vs Large Objects (LOBs) in PostgreSQL

Can eBPF Provide Real-Time PostgreSQL Insights Without Degrading Performance?

Mastering Row Locks in PostgreSQL: Ensuring Data Integrity and Performance