How ABN AMRO decommissioned Teradata
ABN AMRO continues to modernize its data and IT landscape. After four years of hard work, ABN AMRO has finally decommissioned Teradata, having used it for more than 20 years to host group-wide data warehouses and to exchange and analyze (enterprise) data. Why did we spend so much time decommissioning this platform? And what checks can we recommend if you're exploring the same move? Read along and find out whether you are set up for success.
Background of our move away from Teradata
If I had to summarize the main reasons why we wanted to move away from Teradata, they would be a) ABN AMRO's platform and cloud strategy, b) getting in control of our data, and c) the increasing usage of (big) data.
Around 2019, ABN AMRO made some bold choices to simplify its IT platform and move to the cloud. While simplifying and rationalizing an IT landscape can already be a strategic choice in itself, combining this with the adoption of cloud technology makes it even more challenging. Everyone knows the benefits of cloud, but actually reaping them in real life is harder than one might imagine and requires strong engineering capabilities across the board. And in order to execute on strategy, you have to make choices. Triggered by the platform and cloud strategy (a), and fuelled by the need to get in control of our data (b) and by seeing the usage of data explode (c), we marked Teradata as non-target and started an ambitious decommissioning program. You will read more on (b) and (c) below.
How did we do it?
We started our journey with the ambitious plan not to migrate our Teradata workloads to the cloud as-is, but to reengineer them from scratch on Azure while adopting our data governance and latest data distribution framework along the way. The background of this choice is fully linked to (b): getting in control of our data. Many enterprises I spoke to indicated that after (multiple) decades of working with centralized data warehouses, the true understanding of which data is used for what, the (end-to-end) lineage, and best-practice data governance principles, such as clear data ownership and proper data-sharing agreements, had been lost. This is also understandable, by the way, as the entire paradigm of how to work with data has changed massively over the last five years. This is what we wanted to fix: moving away from Teradata by reengineering existing workloads from scratch on the target technology would not only help us simplify our IT landscape, it would also instantly restore control over our data and put proper data governance in place.
But honestly speaking, we underestimated this.
After 20 years of usage, we found that the business logic, who the data providers and consumers are, and what is and isn't IT-managed was unclear and poorly documented. This is obviously one of the reasons we wanted this move, but it turned out to be quite messy. Given our ambition to reengineer from scratch, manually assessing all the queries and business logic that had been built up over the last decades became a very time-consuming effort.
The way we dealt with this first requires some definition. Along the way we tuned our full-reengineering strategy, resulting in an adjusted approach. I'll explain the three different strategies we followed:
This adjusted strategy ultimately caused a delay in our planning, which is obviously unfortunate, but let's also appreciate that after 20 years of Teradata usage, a lot had been built on top of this platform, and the business also needed to continue as is. That is why I am still proud of the mix we achieved as described above. We have now moved our data distribution platform from an enterprise data warehouse setup into a properly governed and scalable setup, future-proof for the ever-growing load of (big) data. With that, we also addressed our third trigger (c).
What lessons have we learned, and what checks would we recommend to you?
As you have read, now that we are done we have some experiences that we are happy to share as a simple checklist you can run yourself. Have a look at whether you can tick all the boxes. If not, consider giving that aspect some extra attention:
Is the program prioritized on enterprise level?
Since in most cases you're touching group-wide data warehouses, trying to decommission them will impact the entire group. Ensure that you have buy-in from the top, also at business-case level. The case cannot be purely financially driven; it also rests on the belief that data is the fuel for the future and requires a modern, scalable, and governed landscape to run on.
Invest in telemetry
To understand the complexity of your landscape, plan properly, identify risks, and understand your stakeholders, automated telemetry, such as logs or lineage tooling, ensures you have the right data points to plan and act on. If you have to gather these manually, you will certainly be surprised along the way.
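As an illustration of the kind of telemetry that helps, here is a minimal sketch (not our actual tooling; the log format, object names, and user names are made up for the example) that aggregates an exported query log to show which objects are actively queried, by whom, and which ones are stale candidates for decommissioning:

```python
import csv
import io
from collections import Counter

# Hypothetical extract of query-log records: user, object accessed, query count.
SAMPLE_LOG = """user,object,queries
finance_etl,DWH.TRANSACTIONS,1200
risk_team,DWH.TRANSACTIONS,40
risk_team,DWH.EXPOSURES,300
legacy_job,DWH.OLD_MART,0
"""

def summarize_usage(log_text):
    """Aggregate query counts per object and collect its consumers,
    so stale objects (zero queries) and key stakeholders stand out."""
    usage = Counter()
    consumers = {}
    for row in csv.DictReader(io.StringIO(log_text)):
        usage[row["object"]] += int(row["queries"])
        consumers.setdefault(row["object"], set()).add(row["user"])
    stale = sorted(obj for obj, count in usage.items() if count == 0)
    return usage, consumers, stale

usage, consumers, stale = summarize_usage(SAMPLE_LOG)
print(stale)                                  # objects nobody queried
print(sorted(consumers["DWH.TRANSACTIONS"]))  # who depends on this table
```

Even a toy aggregation like this answers the planning questions that matter: what can be dropped, and who must be involved before anything is moved.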
What is your target?
Make sure that you involve the users of Teradata at an early stage. This helps you understand their requirements and assess them against the target technology. Especially end-user-maintained data marts contain exotic features for which you need an alternative (and performant!) technology.
What is your test strategy?
Especially when you choose not to reengineer but to transpile or migrate: create a proper test strategy jointly with the business. In our case this taught us that certain queries had to show exact matches up to six decimal places. Just to give you a flavour of how important test requirements are and the impact they can have on the work you have to do.
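To illustrate how strict such a requirement is, here is a small, hypothetical comparison function (a sketch, not our actual test harness) that accepts two query results only when they agree after rounding to six decimal places:

```python
from decimal import Decimal, ROUND_HALF_UP

def matches_to_six_decimals(old_value, new_value):
    """Return True when two results agree after rounding both to
    six decimal places, mimicking a strict acceptance criterion."""
    quantum = Decimal("0.000001")
    a = Decimal(str(old_value)).quantize(quantum, rounding=ROUND_HALF_UP)
    b = Decimal(str(new_value)).quantize(quantum, rounding=ROUND_HALF_UP)
    return a == b

# A value recomputed on new infrastructure may drift in the 7th
# decimal and still pass, but any drift in the 6th fails:
print(matches_to_six_decimals(1.2345678, 1.2345679))  # True
print(matches_to_six_decimals(1.234567, 1.234568))    # False
```

The point of the sketch: with a criterion this tight, harmless floating-point drift between the old and the new engine shows up as a test failure, so the tolerance has to be agreed with the business before migration work starts, not after.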
Partner with experts
As you can see in the picture: Teradata is a physical box, tuned for optimal query performance. It took us quite some engineering effort to achieve similar results on Azure. Ensure that you have a solid partnership with software or infrastructure vendors (e.g. Microsoft, Databricks, or, in our case, Datometry) when you embark on this journey. I feel it is fair to say that together with these vendors we achieved some firsts in the industry.
I hope this helps and gives you food for thought. At ABN AMRO we are always willing to share our experiences.
The next update will be on Cloudera's Hadoop platform. Stay tuned ;)