How ABN AMRO decommissioned Teradata
ABN AMRO continues to modernize its data and IT landscape. After four years of hard work, ABN AMRO has finally decommissioned Teradata, having used it for more than 20 years to host group-wide data warehouses and to exchange and analyze (enterprise) data. Why did we spend so much time decommissioning this platform? And what checks can we recommend if you're exploring the same move? Read along and find out whether you are set up for success.
Background of our move away from Teradata
If I had to summarize the main reasons why we wanted to move away from Teradata, they would be a) ABN AMRO's platform and cloud strategy, b) getting in control of our data, and c) the increasing usage of (big) data.
Around 2019, ABN AMRO made some bold choices to simplify its IT platform and move to the cloud. While simplifying and rationalizing an IT landscape can already be a strategic choice in itself, combining this with the adoption of cloud technology makes it even more challenging. Everyone knows the benefits of cloud, but actually reaping them in real life is harder than one might imagine and requires strong engineering capabilities across the board. And in order to execute on strategy, you have to make choices. Triggered by the platform and cloud strategy (a), and fuelled by the need to get in control of our data (b) and by seeing the usage of data explode (c), we marked Teradata as non-target and started an ambitious decommissioning program. You will read more on (b) and (c) below.
How did we do it?
We started our journey with the ambitious plan not to migrate our Teradata workloads to the cloud as-is, but to reengineer them from scratch on Azure while adopting our data governance and latest data distribution framework along the way. The background of this choice is fully linked to (b): getting in control of our data. Many enterprises I spoke to indicated that after (multiple) decades of working with centralized data warehouses, the true understanding of which data is used for what, the (end-to-end) lineage, and best-practice data governance principles, such as clear data ownership and proper data-sharing agreements, had been lost. This is also understandable, by the way, as the entire paradigm of how to work with data has changed massively over the last five years. This is what we wanted to fix: moving away from Teradata by reengineering existing workloads from scratch on the target technology would not only help us simplify our IT landscape, it would also instantly restore control over our data and put proper data governance in place.
But honestly speaking, we underestimated this.
After 20 years of usage, we found that the business logic, who the data providers and consumers are, and what is and isn't IT-managed was unclear and poorly documented. This is obviously one of the reasons we wanted this move, but it turned out to be quite messy. Given our ambition to reengineer from scratch, manually assessing all the queries and business logic that had been built up over the last decades became a very time-consuming effort.
The way we dealt with this first requires some definition. Along the way we tuned our full-reengineering strategy, resulting in an adjusted approach. I'll explain the three different strategies we followed:
This adjusted strategy ultimately caused a delay in our planning, which is obviously unfortunate, but let's also appreciate that after 20 years of Teradata usage, a lot had been built on top of this platform, and the business also needed to continue as is. That is why I am still proud of the mix we achieved as described above. We have now moved our data distribution platform from an enterprise data warehouse setup into a properly governed and scalable setup, future-proof for the ever-growing load of (big) data. With that, we also addressed our third trigger (c).
What lessons have we learned, and what checks would we recommend to you?
As you have read, now that we are done we have some experiences that we are happy to share as a simple checklist you can run yourself. Have a look at whether you can tick all the boxes. If not, consider giving that aspect some extra attention:
Is the program prioritized on enterprise level?
Since in most cases you're touching group-wide data warehouses, trying to decommission them will impact the entire group. Ensure that you have buy-in from the top, also at business-case level. The case cannot be purely financially driven; it also rests on the belief that data is the fuel for the future and requires a modern, scalable, and governed landscape to run on.
Invest in telemetry
To understand the complexity of your landscape, plan properly, identify risks, and understand your stakeholders, automated telemetry, such as logs or lineage tooling, ensures you have the right data points to plan and act on. If you have to gather these manually, you will certainly be surprised along the way.
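As an illustration of the kind of telemetry that helps, here is a minimal sketch (not our actual tooling; the log format, object names, and user names are made up for the example) that aggregates an exported query log to show which objects are actively queried, by whom, and which ones are stale candidates for decommissioning:

```python
import csv
import io
from collections import Counter

# Hypothetical extract of query-log records: user, object accessed, query count.
SAMPLE_LOG = """user,object,queries
finance_etl,DWH.TRANSACTIONS,1200
risk_team,DWH.TRANSACTIONS,40
risk_team,DWH.EXPOSURES,300
legacy_job,DWH.OLD_MART,0
"""

def summarize_usage(log_text):
    """Aggregate query counts per object and collect its consumers,
    so stale objects (zero queries) and key stakeholders stand out."""
    usage = Counter()
    consumers = {}
    for row in csv.DictReader(io.StringIO(log_text)):
        usage[row["object"]] += int(row["queries"])
        consumers.setdefault(row["object"], set()).add(row["user"])
    stale = sorted(obj for obj, count in usage.items() if count == 0)
    return usage, consumers, stale

usage, consumers, stale = summarize_usage(SAMPLE_LOG)
print(stale)                                  # objects nobody queried
print(sorted(consumers["DWH.TRANSACTIONS"]))  # who depends on this table
```

Even a toy aggregation like this answers the planning questions that matter: what can be dropped, and who must be involved before anything is moved.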
What is your target?
Make sure that you involve the users of Teradata at an early stage. This helps you understand their requirements and assess them against the target technology. Especially end-user-maintained data marts contain exotic features for which you need an alternative (and performant!) technology.
What is your test strategy?
Especially when you choose not to reengineer but to transpile or migrate: create a proper test strategy jointly with the business. In our case this taught us that certain queries had to show exact matches up to six decimal places. Just to give you a flavour of how important test requirements are and the impact they can have on the work you have to do.
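To illustrate how strict such a requirement is, here is a small, hypothetical comparison function (a sketch, not our actual test harness) that accepts two query results only when they agree after rounding to six decimal places:

```python
from decimal import Decimal, ROUND_HALF_UP

def matches_to_six_decimals(old_value, new_value):
    """Return True when two results agree after rounding both to
    six decimal places, mimicking a strict acceptance criterion."""
    quantum = Decimal("0.000001")
    a = Decimal(str(old_value)).quantize(quantum, rounding=ROUND_HALF_UP)
    b = Decimal(str(new_value)).quantize(quantum, rounding=ROUND_HALF_UP)
    return a == b

# A value recomputed on new infrastructure may drift in the 7th
# decimal and still pass, but any drift in the 6th fails:
print(matches_to_six_decimals(1.2345678, 1.2345679))  # True
print(matches_to_six_decimals(1.234567, 1.234568))    # False
```

The point of the sketch: with a criterion this tight, harmless floating-point drift between the old and the new engine shows up as a test failure, so the tolerance has to be agreed with the business before migration work starts, not after.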
Partner with experts
As you can see in the picture: Teradata is a physical box, tuned for optimal query performance. It took us quite some engineering effort to achieve similar results on Azure. Ensure that you have a solid partnership with software or infrastructure vendors (e.g. Microsoft, Databricks, or, in our case, Datometry) when you embark on this journey. I feel it is fair to say that together with these vendors we achieved some firsts in the industry.
I hope this helps and gives you food for thought. At ABN AMRO we are always willing to share our experiences.
The next update will be on Cloudera's Hadoop platform. Stay tuned ;)