Unleash your technologists!
In a previous article, I talked about how we need to come together as a community and Make Banking Technology Great Again. However, that was all at a very high level, so now I want to give some specific recommendations and suggest some books to read. The world of technology moves too quickly, and there's just too much to know - so this is a summary for a great developer who has been away from the state of the art for some time. I'm sure I'm missing lots of great advice, so please let me know in the comments!
The agile mind
Most organisations are trying to move to an ‘Agile’ model, but I feel some of the original motivations that gave birth to this movement have been forgotten! Everyone involved in the Agile transformation (including business stakeholders!) should read and remember the Agile Manifesto, especially the Principles. Your company should adopt whatever model works for you, but never ignore the inspiration of the founding text.
The most important lesson in Agile is that it is essential to tighten up the feedback loop with your customer (internal or external) so that technologists are working on the right things. While I have known this to be true on an intellectual level since I started working, it is only recently with experience that I have truly begun to understand this. Don’t fall into the trap of reciting the mantra without actually living it.
Make this loop as tight as possible, and make sure the business is involved from day one
As technology managers, we should be relentlessly identifying and eliminating constraints in the delivery process. A book that discusses this in a lively and fun format is The Phoenix Project - a DevOps book masquerading as a novel about the project from hell. A constraint to particularly focus on is manual intervention - not only is it a waste of time, it introduces a potential source of error. This includes rubber-stamp approvals - if a manager has to click a button for all changes, irrespective of risk, then something is wrong with the process! This should of course be balanced against automating tasks which are hardly ever run, as per the excellent XKCD comics here and here.
I often hear that getting business sponsorship and funding for cleaning up technical debt and improving automation can be challenging, as it is difficult to quantify the business benefit in dollar terms. I’m sure there is truth in this, but I also believe that most sponsors can be convinced by a well-articulated argument that explains why technical debt is ultimately slowing down technology’s ability to deliver.
Why do I trust thee? Through the fiery crucible of testing!
One of the most memorable points about testing I’ve ever read came from Working Effectively With Legacy Code. Michael Feathers asserts that legacy code is code without tests, not ‘old’ code - which is how most people think about legacy. Good unit tests are essentially living, breathing documentation.
A question I love to ask in interviews is "If for some reason you had to choose between keeping your application code and your unit tests, which would you choose?". To me, the answer is obviously the unit tests - you're essentially being given 'the answer book', and you know you have recreated the application once all the tests pass. Regaining that confidence by rewriting all the tests instead would mean going through every line of application code and reverse engineering it - a huge effort.
Save the tests!
Unit tests are the source of code confidence, but how do we know our unit tests are good? A decent proxy for this is ‘code coverage’. Code coverage tools automatically determine how much code is being executed by your various tests. There is no point in having great tests that only cover a specific route through the code!
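To make that concrete, here is the sort of test that doubles as living documentation - a minimal sketch, with a hypothetical `calculate_wire_fee` function standing in for real business logic:

```python
# A minimal sketch: unit tests that double as living documentation.
# `fees.calculate_wire_fee` is a hypothetical function used for illustration.
import pytest
from fees import calculate_wire_fee

def test_domestic_wires_have_a_flat_fee():
    # Business rule: domestic wires cost a flat 25.00
    assert calculate_wire_fee(amount=10_000, domestic=True) == 25.00

def test_international_wires_charge_ten_basis_points():
    # Business rule: international wires cost 0.1% of the amount
    assert calculate_wire_fee(amount=10_000, domestic=False) == 10.00

def test_negative_amounts_are_rejected():
    with pytest.raises(ValueError):
        calculate_wire_fee(amount=-1, domestic=True)
```

Run under a coverage tool (for example `pytest --cov=fees --cov-report=term-missing`, assuming the pytest-cov plugin), the report shows exactly which lines and branches of the module the tests never touched.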
I'm not a robot - User Testing
While we build confidence through automated testing, it is inevitable that humans need to test new applications as well. I believe that dedicated internal testing teams are unfortunately ineffective - whatever benefit comes from centralising testing expertise is lost because that team lacks real context on the business needs. Instead, testing should be made an integral part of the delivery team, and all of its findings converted into automated tests for the future.
Business users should be involved as early as possible in all aspects of delivery, including testing. However, this does not mean a lengthy ‘UAT’ phase in the project plan; securing key business resources for a long time is difficult, and commonly it’s too late by then anyway. Instead, business users should be continuously involved throughout.
When features are requested, structure your ticketing system (for example Jira) so that one ticket equals one UAT sign-off. While there may be sub-tickets below it for internal purposes, that ticket is the one the business should understand. It should use business language and demonstrate a clear business outcome - even if it is a purely technical objective, a business user should be able to understand it.
The ticketing reference should also be used to drive other processes and be integrated as tightly as possible. For example, since check-ins can be tied to a ticket, which can then be tied to a sign-off, the entire code and delivery audit process can be automated. Access to these processes, systems and reports should be as transparent as possible - everyone in the company should know where to go to see what has recently changed.
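As a sketch of what that integration can look like, here is a small script that groups recent commits by the ticket they reference; the Jira-style `PAY-123` prefix and the report format are just placeholders, but the point is that the audit trail falls out of data you already have:

```python
# A rough sketch: report which commits touched which tickets.
# Assumes commit messages reference a Jira-style key such as "PAY-123: ...".
import re
import subprocess
from collections import defaultdict

TICKET_PATTERN = re.compile(r"\b([A-Z]+-\d+)\b")

def commits_by_ticket(since="1 week ago"):
    log = subprocess.run(
        ["git", "log", f"--since={since}", "--pretty=format:%h %s"],
        capture_output=True, text=True, check=True,
    ).stdout
    report = defaultdict(list)
    for line in log.splitlines():
        sha, _, message = line.partition(" ")
        for ticket in TICKET_PATTERN.findall(message):
            report[ticket].append(sha)
    return report

if __name__ == "__main__":
    for ticket, shas in sorted(commits_by_ticket().items()):
        print(f"{ticket}: {', '.join(shas)}")
```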
YAAAAASSS!!! What an awesome ticketing system!
Pull requests should be merged to lower environments, then merged up, to eliminate regression errors where code is released which accidentally removes previously released functionality or fixes. Occasionally people think it’s a good idea to release directly into a higher environment such as UAT, to fix a bug quickly. This then relies on the team remembering to properly merge to lower environments. You only need to forget once or twice - where previously fixed bugs get reintroduced - to break the trust of your UAT testers.
Over time, through better processes and better testing, you and your business will get more comfortable with your releases. At this point, begin to break the deployment up into smaller and smaller chunks which can be quickly and independently released. Ideally, a single app could be deployed into SIT/UAT (and even production!) multiple times a day if needed. For many enterprises, even monthly releases are challenging - but it doesn’t have to be that way, as shown by Amazon and Facebook. For more on this, read Continuous Delivery.
My final point on user testing is about UX. The first time I watched some investment banking software I had written being properly used, I was amazed. The users were using it totally differently from how I thought they would, and had also written an amazing (and terrifying) VBA macro that logged into my web application and automated a lot of their work. If they had told me what they were doing, I could have baked that into the application… but the fault lay more with me: I should have worked more closely with them, and watched how they were using the software earlier. Many of these themes - and many others - are picked up in Don't Make Me Think. All developers should be forced to watch in silence as new users try to navigate their way around an application.
A process, a process, my kingdom for a process
I will now lay out the foundation for what I think is a great development process. There are many different ways to build a successful process - if you have something that is (truly) working, carry on using it! Bear in mind that good developers will build good code, and mediocre developers will build crappy code, whatever the process. Hiring good people and empowering them is better than any process.
I’m not going to extol the virtues of source control, as we live in the 21st century and we’re not savages. That said: no, a backed-up shared drive is not source control - I’m always surprised that this one still comes up from time to time. So let’s assume we’re all using a modern source control system like git, mercurial or subversion.
How does code get “allowed” into the shared development branch? This should always be via a code review or pull request. No exceptions. The code reviewer should be enforcing the following standards:
- 100% unit tests pass (or an extremely convincing explanation why not).
- 80% (at least!) code coverage for unit tests. If adding this process to a legacy code base, 80% is impractical - in which case at least 80% of new code must be covered (one way to enforce the bar is sketched after this list).
- Inline documentation comments so that auto-generation tools can build documentation (e.g. docstrings/Sphinx in Python, Javadoc in Java, etc).
- Functional documentation in your Wiki, or however you store your documentation.
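To make the coverage bar something the pipeline enforces rather than a gentleman's agreement, the check itself can be automated. A minimal sketch using coverage.py's API - the 80% threshold is just the figure from the list above:

```python
# A minimal sketch of a CI gate: fail the build if coverage drops below 80%.
# Assumes the tests have already run under coverage.py, producing a .coverage file.
import sys
import coverage

THRESHOLD = 80.0  # the bar from the checklist above - tune to your codebase

cov = coverage.Coverage()
cov.load()                               # read the .coverage data file
total = cov.report(show_missing=False)   # prints the report, returns total %

if total < THRESHOLD:
    print(f"Coverage {total:.1f}% is below the {THRESHOLD:.0f}% bar - rejecting.")
    sys.exit(1)
```

If you use pytest with the pytest-cov plugin, the same gate is a single flag: `pytest --cov=yourpackage --cov-fail-under=80`.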
It is important to build a positive culture around the reviews - there is no ‘shame’ in having code rejected, it’s just a necessary part of good code hygiene and a learning opportunity. In the past I’ve seen cases where people always accepted code - then gave the feedback offline. While this saves a minor amount of face for the reviewee, it means there is less-than-ideal code sitting in the repository, which doesn’t help anyone.
As an incentive for the more experienced developers to perform reviews, publish statistics around who is doing code reviews and try to gamify the process a little. A leaderboard appeals to the competitive spirit in all of us, and can also highlight the senior developers who are not collaborating.
A helping hand
The semi-recent focus on DevOps has produced a plethora of tools and services that make life easier for enterprises trying to modernise their approach to technology and increase developer productivity. Many of the services I introduce below are from AWS, as this is the provider I know best, but most of them have alternatives.
Having a good handle on how your infrastructure is constructed is essential, and yet too often we allow it to grow organically. Creating a production-like environment for testing becomes a huge overhead, and some organisations have teams devoted to just this task. In 2018 this is unacceptable, and all teams should be using CloudFormation (or equivalents such as Terraform) for defining infrastructure as code.
Once the infrastructure is described as code, it is easy to set up and tear down environments. When coupled with cloud computing, there should be no reason not to give application teams as many environments as they need. The description files should be source-controlled, which also gives a history of who changed what and when, and an easy method for diffing the environment between points in time.
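To make that concrete, here is a minimal sketch using the AWS CDK's Python bindings, which synthesise ordinary CloudFormation - the stack and bucket are hypothetical, and a hand-written CloudFormation or Terraform template expresses the same idea:

```python
# A minimal infrastructure-as-code sketch using the AWS CDK v2 Python bindings.
# The stack and bucket are hypothetical; `cdk synth` turns this into CloudFormation.
import aws_cdk as cdk
from aws_cdk import aws_s3 as s3

class ReportingStack(cdk.Stack):
    def __init__(self, scope, construct_id, **kwargs):
        super().__init__(scope, construct_id, **kwargs)
        # Encryption and versioning are part of the definition, not a manual step
        s3.Bucket(
            self,
            "ReportBucket",
            encryption=s3.BucketEncryption.S3_MANAGED,
            versioned=True,
        )

app = cdk.App()
# The same definition stamps out as many environments as you need
ReportingStack(app, "reporting-dev")
ReportingStack(app, "reporting-uat")
app.synth()
```

Because the definition lives in source control, `cdk diff` (or a plain template diff) shows exactly how what is deployed differs from what is defined in code.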
Code should never be manually deployed, and the automated testing mentioned earlier should be initiated whenever code is being checked in or moved between source control branches. To coordinate your workflow, consider AWS CodePipeline (or equivalents such as Jenkins). The actual deployments themselves can be handled by AWS CodeDeploy (or equivalents such as Ansible) - these tools give you additional insight into the status of your deployments, and allow you to easily do things like rolling deployments.
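These tools are also scriptable, which is what makes the rest of the automation in this article possible. A small sketch using boto3 - the pipeline name is a placeholder:

```python
# A small sketch using boto3: trigger a pipeline run and report stage status.
# "trade-booking-pipeline" is a hypothetical pipeline name.
import boto3

codepipeline = boto3.client("codepipeline")

# Kick off a release (normally the source-control webhook does this for you)
execution = codepipeline.start_pipeline_execution(name="trade-booking-pipeline")
print("Started execution:", execution["pipelineExecutionId"])

# Report where every stage currently stands
state = codepipeline.get_pipeline_state(name="trade-booking-pipeline")
for stage in state["stageStates"]:
    status = stage.get("latestExecution", {}).get("status", "NOT_RUN")
    print(f"{stage['stageName']}: {status}")
```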
Wherever possible, it is best to use so-called 'serverless' approaches to deployment. I will not go into more detail here as it is well documented elsewhere, other than to say that the less infrastructure that needs to be managed, the better. It is hard to estimate the cost of humans managing commodity infrastructure in the enterprise, and estimates almost always ignore the cost of hiring and training good staff, never mind the salaries of the HR, finance and management divisions that watch over them.
If serverless is not an option for some reason, use containerization wherever sensible. This is also a good step in the journey towards microservices. Containers allow you to package applications up into an easily deployable unit, which gives you confidence that the deployment will work anywhere.
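As a toy illustration of that portability, here is a sketch using the Docker SDK for Python (`pip install docker`), assuming a local Docker daemon is running:

```python
# A toy sketch using the Docker SDK for Python.
# Assumes a local Docker daemon and the "docker" package installed.
import docker

client = docker.from_env()

# Run a throwaway container: same image, same behaviour, on any host
output = client.containers.run(
    "python:3.12-slim",
    ["python", "-c", "print('hello from an isolated, reproducible environment')"],
    remove=True,
)
print(output.decode().strip())
```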
Initially, companies ran their containers with just Docker; however, as deployments got more and more complex, there was a need for a higher-level orchestration service. Recently, there has been a trend towards using Kubernetes for this purpose, or hosted versions of Kubernetes such as AWS EKS. Note that these typically run standard Docker containers underneath, so a great starting point is getting comfortable with Docker.
Wonderfully standardised containers
Don’t call us, we’ll call you
Any programmer will tell you that it is more efficient to have events pushed to a consumer rather than having it constantly poll, looking for something interesting. That same principle is just as true for the infrastructure and application environment - time spent looking for configuration issues and compliance breaches is time that is better spent elsewhere. Instead, the architecture should inform us when something is amiss.
Two cloud services which help keep everything running smoothly and in compliance are AWS Config and GuardDuty - these are two of my favourite services for the banking domain.
With AWS Config, you take all the rules you have built up around compliance and internal controls, and turn them into automated checks which run continuously against your infrastructure. For example, if someone spins up a server which doesn't have an encrypted disk, AWS Config flags it as non-compliant, alerts your ops team, and can even trigger an automated remediation. Two sets of sample rules are available here and here.
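Custom rules are just Lambda functions. Here is a rough sketch of one, in Python, that marks unencrypted EBS volumes as non-compliant - the event handling is simplified, and the NON_COMPLIANT result is what then drives your alerting or remediation:

```python
# A rough sketch of a custom AWS Config rule implemented as a Lambda function.
# It marks EBS volumes that are not encrypted as NON_COMPLIANT.
import json
import boto3

config = boto3.client("config")

def handler(event, context):
    invoking_event = json.loads(event["invokingEvent"])
    item = invoking_event["configurationItem"]

    if item["resourceType"] != "AWS::EC2::Volume":
        compliance = "NOT_APPLICABLE"
    elif item["configuration"].get("encrypted"):
        compliance = "COMPLIANT"
    else:
        compliance = "NON_COMPLIANT"

    # Report the verdict back to AWS Config
    config.put_evaluations(
        Evaluations=[{
            "ComplianceResourceType": item["resourceType"],
            "ComplianceResourceId": item["resourceId"],
            "ComplianceType": compliance,
            "OrderingTimestamp": item["configurationItemCaptureTime"],
        }],
        ResultToken=event["resultToken"],
    )
```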
GuardDuty is an automated threat detection service. It builds a baseline of what 'normal' activity looks like, then any time there is an aberration it raises a finding, which you can alert on or use to trigger an automated response. Some findings are relevant to everyone - for example, traffic to known bad IPs - while others may be unique to your particular account - for example, an instance type being launched that you have never used before.
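Findings can be routed through a CloudWatch Events / EventBridge rule to a small Lambda function, which is the 'push, don't poll' principle from above in action. A rough sketch - the SNS topic is a placeholder for whatever alerting you already have:

```python
# A rough sketch of an automated response to a GuardDuty finding,
# triggered via an EventBridge rule. The SNS topic ARN is a placeholder.
import boto3

sns = boto3.client("sns")
ALERT_TOPIC = "arn:aws:sns:eu-west-1:123456789012:security-alerts"  # placeholder

def handler(event, context):
    finding = event["detail"]
    severity = finding.get("severity", 0)

    # Page a human for anything serious; low-severity findings just get logged
    if severity >= 7:
        sns.publish(
            TopicArn=ALERT_TOPIC,
            Subject=f"GuardDuty: {finding.get('type', 'unknown finding')}"[:100],
            Message=finding.get("description", "No description provided"),
        )
    else:
        print("Low-severity finding:", finding.get("type"))
```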
By utilising approaches like those mentioned above, technical staff can be freed up to concentrate on business deliverables.
Conclusion
The main business benefits and outcomes of what I describe above are flexibility, faster time to market, and a tighter delivery cycle. I am a strong believer that empowered developers, agile processes, a strong DevOps focus, and quick-to-deploy cloud services are the best way to build that tight loop.
Good luck improving the way you build things, and please do give me additional suggestions in the comments below. We’re all constantly learning!