We Were Happy and Didn't Know It

Many years ago, we had a large central system and several satellite systems that faithfully sent their data via extracted files to our Data Warehouses (DWH) at night or several times during the day. This data was processed and organized in complex three-layer Data Warehouses, culminating in an analytical layer where our Business Intelligence (BI) systems did their work.

We had a Mainframe system or an ERP that, through ETL tools like DataStage or Informatica, sent data to databases such as DB2, Teradata, or Netezza. This data was rationalized and unified by subject area, independent of the source system, using Stored Procedures or the ETL tools themselves. That led us to star or snowflake schemas, where Business Objects, MicroStrategy, or Cognos gave us dynamic reports, fixed paginated reports, and interactive dashboards.

Although these tools were costly, they were integrated: execution control, access control, workflow, row-level security, a catalog built into the data engine, data quality (DQ) mechanisms, and reconciliation, among others. We had everything in just a few systems.

Then Big Data came along, promising cheap storage with acceptable I/O times and data redundancy that guaranteed at least three nines of availability. In practice, this meant racks of hundreds of small servers acting as nodes, providing the redundancy needed to handle gigabyte-scale files.

Someone had the idea to emulate a database by turning files into columnar, heavily indexed formats like Parquet and emulating schemas with open-source systems like Hadoop or Impala. Being open source, there were no licensing costs, just the hardware and the labor.
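To make the idea concrete, here is a minimal sketch of what that emulation looks like: an extract file lands on the cluster, gets rewritten as Parquet, and is then registered so it can be queried with SQL. This is PySpark, and the paths, table, and column names are illustrative assumptions, not our actual setup.

```python
from pyspark.sql import SparkSession

# Hypothetical sketch: turn a nightly extract file into a Parquet-backed "table".
spark = SparkSession.builder.appName("file-to-table").getOrCreate()

# Read the raw extract (the path and the schema inference are assumptions for illustration).
raw = spark.read.csv("hdfs:///landing/sales_extract.csv", header=True, inferSchema=True)

# Rewrite it as partitioned Parquet, the columnar format standing in for database storage.
raw.write.mode("overwrite").partitionBy("load_date").parquet("hdfs:///warehouse/sales")

# Register the files as a view so a SQL engine can treat them as a schema and query them.
spark.read.parquet("hdfs:///warehouse/sales").createOrReplaceTempView("sales")
spark.sql("SELECT load_date, SUM(amount) AS total FROM sales GROUP BY load_date").show()
```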

We succumbed to the temptation and abandoned our loyal providers to go all in on Big Data.

Soon, we realized that maintaining all that hardware was madness, and we were offered a journey to the clouds. In the cloud, we no longer had to worry about so much hardware, and we also had additional tools to manage our data in S3 buckets.

However, we still needed something to play the role of a database or DWH, so we opted for solutions like Snowflake, Cloudera, or Databricks to emulate our DWH. But we also needed an access manager and a Data Catalog, and we had to replace our ETL with Spark SQL, rewrite our code, and find something to orchestrate our processes.
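For instance, a transformation that used to live in a Stored Procedure or a DataStage/Informatica job ends up expressed as Spark SQL over files in object storage. The sketch below assumes hypothetical bucket, table, and column names; it only illustrates the shape of the rewrite, not our actual pipelines.

```python
from pyspark.sql import SparkSession

# Hypothetical sketch of one ETL step rewritten as Spark SQL over S3-backed Parquet data.
spark = SparkSession.builder.appName("etl-rewrite").getOrCreate()

# Expose staging and dimension data as views (bucket and table names are assumptions).
spark.read.parquet("s3a://dwh/staging/orders").createOrReplaceTempView("stg_orders")
spark.read.parquet("s3a://dwh/dim/customers").createOrReplaceTempView("dim_customers")

# What used to be a Stored Procedure becomes a SQL statement executed by Spark.
fact_orders = spark.sql("""
    SELECT c.customer_key,
           o.order_date,
           SUM(o.amount) AS total_amount
    FROM stg_orders o
    JOIN dim_customers c ON o.customer_id = c.customer_id
    GROUP BY c.customer_key, o.order_date
""")

# Persist the result to the analytical layer; orchestration, catalog, and access control
# still have to come from separate tools.
fact_orders.write.mode("overwrite").parquet("s3a://dwh/fact/orders")
```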

In short, everything had to be acquired separately. Our old BI systems also had to be upgraded or replaced with "cloud-friendly" versions.

Ten years and millions of dollars later, we almost reached the same level of cutting-edge technology we had before.
