Migrating Exadata workloads to AWS, part one
The objective of this article is to share my personal experience with migrating Exadata workloads. I have been working with Oracle databases for two decades, and for the last 10 years I have been leading Database Solutions teams at Paysafe, helping the company grow in a secure and frictionless way. In 2019 the company took the strategic decision to migrate critical workloads to the cloud in order to benefit from its resilience, elasticity and ability to deliver business value at a better pace.
While the application stack fits very well into AWS, Paysafe’s big, monolithic databases are definitely a challenge. Actually - lots of challenges. Rather than providing a step-by-step guide for the migration, I am sharing my learnings from this project, which I wish I could have read in an article two years ago, before we started the migration journey.
Spoiler: if you are only interested in the technology piece, you can skip this article and start directly with part two.
Cloud is NOT just somebody else's computer
Here in the Paysafe Database Solutions team, we are very good at what we do. We have more than 200 years of combined experience and knowledge running Oracle and MS SQL databases on-prem. We eat Exadata for breakfast and can do Active Data Guard switchovers in our sleep. We have detailed, documented procedures for everything - from deployment to monitoring, from patching to performance troubleshooting. A lot of those procedures need massive rework with the migration to AWS. However, the biggest rework of all is the mindset.
The cloud is actually a set of procedures and processes, approaches and solutions to problems, that are either impossible or very hard to achieve on-prem. Officially, there are six ways to tackle cloud migrations - the six R's: Re-host, a.k.a. lift-and-shift; Re-platform, while keeping the overall architecture; Re-architect, using cloud-native approaches; Re-purchase, which is basically moving to SaaS; Retire unneeded components; or Retain, i.e. stay on-prem.
I can, however, propose a different view. The migration to the cloud can be done in three ways: the good, the bad and the ugly. I will describe those in reverse order.
"The ugly" way to go in the cloud
There are people who believe that the cloud is just someone else's computer. At the end of the day, any cloud vendor offers basic infrastructure components, like "a server" and "some storage". What else do you need? Let's provision a pile of EC2 instances, install whatever we have on-prem and declare the migration successful.
There are two problems with this approach. First, you are not really in the cloud - you are on someone else's computer. You do not really use the elasticity and resilience, the automated monitoring and security. Your IT staff will not benefit from such a move, nor will your apps or your customers. Even worse, this will become a very inefficient, and therefore expensive, endeavour. It's like replacing your fleet of trucks with a fleet of cool Ferraris and still delivering the same boxes. Not only is it more expensive, but it’s actually slower (a Ferrari boot is really small).
"The bad" way to go in the cloud
Every big enterprise has a wide variety of workloads. Some are small and nimble, others are monolithic and full of technical debt. While the former are a good fit for the cloud, we cannot leave the latter behind.
For these cases we have the re-platform approach. This allows us to keep our good old Oracle database in RDS, while offloading the boring maintenance tasks like patching, backup and DR to AWS. Not a perfect solution, but it still has the potential to free up some time for your DBAs to do business-specific optimizations instead of changing tapes in a tape library.
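To make the "offloading" part concrete, here is a minimal sketch of what such a re-platform move can look like with boto3 (the AWS SDK for Python). The instance name, sizing, license model and password are hypothetical placeholders, not recommendations; networking, security groups and secrets handling would come from your own tooling. Note how backups, the DR standby and minor patching are all reduced to parameters:

```python
# Illustrative sketch only - identifiers, sizing and the password are placeholders.
import boto3

rds = boto3.client("rds", region_name="eu-west-1")

rds.create_db_instance(
    DBInstanceIdentifier="payments-ora-01",        # hypothetical instance name
    Engine="oracle-ee",
    LicenseModel="bring-your-own-license",         # common for existing Oracle estates
    DBInstanceClass="db.m5.2xlarge",               # hypothetical sizing
    AllocatedStorage=500,                          # GiB
    MasterUsername="admin",
    MasterUserPassword="use-secrets-manager",      # placeholder; never hard-code
    MultiAZ=True,                    # AWS maintains a standby and handles failover (the DR piece)
    BackupRetentionPeriod=14,        # automated backups with point-in-time recovery
    AutoMinorVersionUpgrade=True,    # minor patching applied in the maintenance window
    PreferredMaintenanceWindow="sun:02:00-sun:04:00",
)
```

The tasks our on-prem procedures covered in pages of runbooks - standby builds, backup schedules, patch windows - become a handful of arguments to one API call.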
There are a lot of hidden issues and limitations with this approach. And to be honest, it can also be quite expensive in the long run. More on that in the next article.
"The good" way to go in the cloud
The best you can do with your workloads is re-architect them using cloud-native services. Go serverless, use cloud-native components! Be super-elastic, use only what you need, when you need it. Utilize all the resilience and elasticity a cloud can offer. CI/CD, infrastructure-as-code and so on. For databases, consider things like DynamoDB and Aurora Serverless. This will result in great savings and will future-proof your product, right?
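To make "use only what you need, when you need it" concrete, here is a minimal sketch of the serverless mindset, again with boto3; the table name and key are hypothetical. Instead of sizing a database server for peak load, you declare a table and pay per request:

```python
# Illustrative sketch only - the table name and key schema are placeholders.
import boto3

dynamodb = boto3.client("dynamodb", region_name="eu-west-1")

dynamodb.create_table(
    TableName="transactions",                                    # hypothetical table
    AttributeDefinitions=[{"AttributeName": "txn_id", "AttributeType": "S"}],
    KeySchema=[{"AttributeName": "txn_id", "KeyType": "HASH"}],
    BillingMode="PAY_PER_REQUEST",   # no capacity planning: costs follow actual traffic
)
```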
While this is a great approach, it is hardly applicable to legacy systems with tons of code. I cannot imagine any business saying "OK, let's stop delivering features for our customers and rewrite everything from scratch, because the cloud is so cool".
"The right" way to go to the cloud
I am afraid there is no magic bullet. You need to put your workloads in different buckets. Try to use "the good" approach as much as possible - identify workloads that are almost ready, or small enough to re-architect, and harvest the low-hanging fruit. If the work becomes too much for some specific package, or you do not have the time or budget right now, try using "the bad" way as a first step to AWS. Combining enough "good" with some "bad" should give you the right mix to make the cloud move attractive for the business. Once you finish the initial migration, you should start re-architecting and optimizing. Important: this means that during your migration, the business has to pay for both the on-prem estate (still not shut down) and the inefficient use of the cloud for all "bad" components. That's why a cloud migration should be approached as a way to gain stability, elasticity and future savings. It will be expensive in the beginning - the real savings will come gradually, as the implementation matures.
And then there are some workloads that simply need to be done "the ugly" way. Keep this strictly as a last resort, because people will be tempted to choose this approach.
The key to a successful migration is the people
The most important part of any successful migration is the people doing it. You need people who have both the skill and the will to do it. I am not sure which one is more important.
The skill
Lacking the skill may look like an obvious trap, but there is more to it than it seems. Here I will note down some interesting, real-world challenges.
Training is a great start, but it is not a substitute for years of experience. There are lots of training resources; some are good, others - not that good. I have suffered through training courses that didn’t add any value (to me - it may be different for you); I have also enjoyed great courses that gave me a lot of structured understanding. What worked best for me is:
- A week of "get your hands dirty" training in our office, personalized for our team and delivered by Nicholas Walter from OpsCompas. Nick really knows how to explain the basics and helped us to hit the ground running. Plus, all the follow-up support he provided.
- The "Architecting on AWS" 3-days classroom training delivered by Alan McGinlay from AWS, packing tons of knowledge around wide area of services in an easily understandable package.
- "Ultimate AWS Certified Solutions Architect Associate" by Stephane Maarek in Udemy. In fact, all his courses I purchased are great (I also completed the "Architect Professional" and the "DB specialty" course). But the Architect Associate course is even better because of the hands-on exercises throughout the course.
Following the training, however, comes the next challenge: over-engineering. Inspired by all the newly acquired knowledge and ideas, some people may start generating architecture diagrams full of different serverless components ("the good" way) that have nothing to do with the current application. This can greatly jeopardize the project timelines and budget.
Another trap is to come up with all those cool graphics without considering that this cool pile of services has to be operated. The question here is "Who owns this thing?". People drawing pretty pictures are usually not the ones on the 24x7 support schedule, nor the ones explaining the backup strategy in the next SOX audit.
The will
Another important part that can easily be overlooked is the buy-in from the teams. Going to the cloud is a major change. Most people do not feel comfortable with change and are naturally inclined to push back. There are many reasons for this. I am not going to dive deeply into the areas of motivation and fear of change, but I will mention some notable challenges related to cloud migration.
Firstly, our current support procedures, built through years of blood and tears, are no longer valid. The result can range from discomfort to real stress. This is one of the main reasons for people selecting "the ugly" way to migrate: it allows them to stick to known processes instead of embracing big changes. Getting the buy-in from the team should go through the realization that yes, there will be challenges initially, but we will eventually get to a better place. We have the opportunity to shape the future procedures, but we have to forget the old way of doing things.
The second trap is management bringing in some highly paid experts to deliver "the thing" while the Ops teams keep doing the mundane tasks. And we know who is going to support this after the consultants take their paycheck, right? The right approach is to engage the company staff to contribute early on and build this new shiny thing, so that they enjoy owning it once it goes live. The external consultants should not build the cloud deployment; they should "build" the company engineers, who will build the cloud deployment.
The third trap is to base the cloud migration business case simply on potential staff savings. "Once we deliver it, we can run it with XX% fewer engineers." Such words can motivate the CFO, but not the engineers building it, who may find themselves made redundant after the solution is deployed. This way of thinking is also wrong, especially in fast-paced sectors like financial services. The better motivator is how many new services we can deliver, and at what pace, to achieve a competitive edge; this should work even for the CFO, though it sounds less tangible.
The next article in this series is about the most common choice for migrating Oracle workloads to AWS: the RDS service.