Let's get into Software Engineering at CRUK: Part 4 - Building In-house Online Payment Solutions (PWS)
Cancer Research UK (CRUK) is the world's leading independent cancer charity dedicated to saving lives through research, influence and information. Our vision is for a better future, to bring about a world where everybody can lead longer, better lives, free from the fear of cancer.
The part we play in this is as the Online Payments product team. We’re responsible for the payments platform that takes all donations to CRUK ‘in-house’ - think donation forms for a one-off or regular donation, paying to take part in an event, buying from our shop or donating to a friend's fundraising page.
Excitingly, we’ve just completed an overhaul of our online payments product and we want to share a bit about that journey. This work has been a ground-breaking technical approach for CRUK and has fundamentally changed the way that we and other teams work and communicate cross-org.
For context, our legacy payment platform was built by a third-party agency in 2014 to accommodate payments, single donations, and regular donations through direct debits. Although this served us for a while, we ran into problems as income through the platform grew and online payments became an ever-evolving part of everyday life.
Understanding the problem
The team spent time unpicking the problems with our legacy solution by speaking to our internal and external users and supporters, carrying out user journey mapping, reviewing data flows and piecing together all the feedback we had available to build a picture of how it was performing.
This discovery work helped us to identify some of the biggest problems we were facing with our legacy product. To start, we didn’t own the core capability in-house, meaning we couldn’t directly improve the user experience for our supporters and internal users (other product teams). It had become a one-size-fits-all approach that was hard to scale and had a complicated release process. Every time we did a release, we had to bring all payments offline for the duration. The architecture, which was originally fit for purpose, became increasingly complex after years of adding new features to meet the evolving needs of the organisation. As a result, it couldn’t withstand very high volumes of traffic and things became very slow. The product and its architecture became very expensive to run and this wasn’t going to work for us in the long term.
Using the insights gathered, we built a case to highlight the pain points of the existing application and the potential benefits we could see by making some changes. We proposed bringing the product in-house and creating a new payments product that would better serve internal and external users. We felt that this would ultimately help us deliver the most value to our supporters. This led us to create Payment Web Services (PWS) as our new in-house online payment product.
Payment Web Services (PWS)
Our mission: To allow anyone at any time to easily give money to CRUK online, so together we will beat cancer.
Payment Web Services (PWS) is now the online payment solution at CRUK and we have decommissioned our legacy product. PWS allows us to offer payments via a donation form for one-off donations, fundraising donations, and regular donations. We also support donations and payments for other products, e.g. payments for our e-commerce sites, signing up for an event, or donating to a fundraising page.
This is a cloud-native solution built on AWS using TypeScript and Next.js. Our PWS application architecture has the following key features:
The Payments product consists of two distinct parts, which are developed and deployed independently:
Building infrastructure as code
We build our infrastructure using the AWS Cloud Development Kit (AWS CDK). This allows us to expressively define infrastructure using a programming language rather than a data-serialization language. This has many benefits, such as modularization, testability, maintainability, and safe commenting.
Technologies
Using TypeScript
TypeScript’s versatility, scope, and popularity meant that we could use the mature tooling from the ecosystem to move quickly and confidently, and a common language also meant that our engineers could work across the stack. Out of the box, it gave us build-time type-safety in our Node.js microservices, libraries, and our Next.js application. Type-checking our code at build time removed a whole category of errors and gave us a lot of confidence when making changes to our applications and when growing our team. We have also started introducing schema validation at more of our application boundaries to get runtime type-safety, so we can ensure the applications and services behave in predictable ways.
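To illustrate what runtime type-safety at an application boundary can look like, here is a minimal sketch in plain TypeScript. It is not the PWS implementation: the DonationRequest shape, its field names, and the parseDonationRequest helper are all hypothetical, and a real service would more likely use a schema validation library than hand-written checks.

```typescript
// Hypothetical shape of a one-off donation request (illustrative field names only).
interface DonationRequest {
  amountPence: number;
  currency: "GBP";
  giftAid: boolean;
}

// Either a validated value or a list of human-readable validation errors.
type ParseResult<T> =
  | { ok: true; value: T }
  | { ok: false; errors: string[] };

function isRecord(value: unknown): value is Record<string, unknown> {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

// Validates untrusted input (e.g. a parsed JSON body) before it enters typed code.
function parseDonationRequest(input: unknown): ParseResult<DonationRequest> {
  if (!isRecord(input)) {
    return { ok: false, errors: ["body must be a JSON object"] };
  }

  const errors: string[] = [];
  const { amountPence, currency, giftAid } = input;

  if (
    typeof amountPence !== "number" ||
    !Number.isInteger(amountPence) ||
    amountPence <= 0
  ) {
    errors.push("amountPence must be a positive integer");
  }
  if (currency !== "GBP") {
    errors.push('currency must be "GBP"');
  }
  if (typeof giftAid !== "boolean") {
    errors.push("giftAid must be a boolean");
  }

  if (errors.length > 0) {
    return { ok: false, errors };
  }
  // The checks above have established the runtime types, so the casts are safe.
  return {
    ok: true,
    value: {
      amountPence: amountPence as number,
      currency: "GBP",
      giftAid: giftAid as boolean,
    },
  };
}
```

Build-time types only describe what the code expects; a check like this is what confirms that data arriving from outside actually matches. Returning a result object rather than throwing lets the caller decide how to report every validation error at once.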
PWS Deployment Process:
Continuous Delivery
Continuous delivery is the culmination of continuous integration (CI) and continuous deployment (CD).
We practice continuous delivery by making sure our software is always in a releasable state and by prioritising the maintenance of our software over adding new features.
Every new feature must pass our CI process before it can be merged to our trunk - CI includes a build, deployment, and an automated test phase. The test phase utilises Cypress, Jest, and GitHub Actions to help us identify and resolve issues in CI more quickly, so that we can ship and release more features with greater confidence. Production deployments are automatically triggered by merges to the trunk.
Monitoring
To improve reliability and stability we have monitoring tools in place for our frontend application and backend services.
For our user-facing application, we need to be alerted to increases in script errors or degradations in performance so that we can provide the best user experience for our supporters. For this, we use CloudWatch alarms and a third-party service to trigger alerts when certain CloudWatch metrics breach defined thresholds. We can also set up custom errors for specific failure points that we want to monitor more closely. We use AWS Chatbot to route alerts to Slack.
For monitoring application uptime, we use Amazon CloudWatch Synthetics canaries – these are scripts that run on a schedule to imitate a basic user journey through our application, monitoring the availability of our front end and endpoints.
CloudWatch Dashboards are used to graph standard and custom CloudWatch metrics emitted by the services we use, such as API Gateway, Lambda, and SQS. CloudWatch Logs and Insights can be used to diagnose the root cause of a triggered alarm.
We use AWS X-Ray for tracing API Gateway requests and Lambda function invocations.
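The idea behind a canary script can be sketched in a few lines of plain TypeScript. This is only an illustration of the technique, not our canary code and not the CloudWatch Synthetics runtime API: the step names, the runJourney and isHealthy helpers, and the Fetcher type are all assumptions made for the example.

```typescript
// A fetch-like function, injected so the journey can run against a stub or a real HTTP client.
type Fetcher = (url: string) => Promise<{ status: number }>;

// One step of a hypothetical user journey.
interface JourneyStep {
  name: string;
  url: string;
  expectedStatus: number;
}

interface StepResult {
  name: string;
  passed: boolean;
  detail: string;
}

// Runs the steps in order and stops at the first failure,
// because later steps in a journey depend on earlier ones.
async function runJourney(
  steps: JourneyStep[],
  fetcher: Fetcher,
): Promise<StepResult[]> {
  const results: StepResult[] = [];
  for (const step of steps) {
    try {
      const response = await fetcher(step.url);
      const passed = response.status === step.expectedStatus;
      results.push({
        name: step.name,
        passed,
        detail: `status ${response.status}`,
      });
      if (!passed) break;
    } catch (error) {
      // Network errors count as a failed step rather than crashing the run.
      const detail = error instanceof Error ? error.message : String(error);
      results.push({ name: step.name, passed: false, detail });
      break;
    }
  }
  return results;
}

// The journey is healthy only if every step ran and every step passed.
function isHealthy(steps: JourneyStep[], results: StepResult[]): boolean {
  return results.length === steps.length && results.every((r) => r.passed);
}
```

Run on a schedule, a check like this turns "is the donation journey working?" into a pass/fail signal that an alarm can watch. Injecting the Fetcher keeps the journey logic testable without touching the network.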
Where are we now?
Now that we’ve completed the migration to PWS, we can talk a bit more about what we’ve achieved.
We’ve met the outcomes we aimed to achieve with a cheaper, more modern, and more flexible online payments product that fits into our existing technology estate. Throughout the migration, we’ve made optimisations and improvements - not only to the platform itself, but also to our ways of working within the Technology team at CRUK.
Some benefits we see from this migration to PWS:
What’s next?
Now that we’ve completed the task of migrating and decommissioning a legacy product, we’re entering an exciting phase where we can keep looking forward. We’ll continue to make the product better for our supporters, looking for ways to reduce costs and finding areas of opportunity within the payments space.
If you fancy checking out the product (and maybe even making a very appreciated donation!) you can do so here: https://donate.cancerresearchuk.org/support-us/your-donation. Any feedback is welcome!