The GDS - What I talk about, when I talk about Digital - Part 2
Introduction
This is the second in my series on Digital Transformation, and probably the longest part as it takes a detailed look at The Government Digital Service (henceforth known as The GDS) and its Service Manual.
Disclaimer: As usual, this is my own opinion piece and is in no way linked to my employer. This episode of my Digital Transformation ramblings is probably going to be the most critical; if you’re averse to that, maybe read something else instead, like Enid Blyton's "Five go to a Node.JS expo".
Before we start, if you’ve never seen any of the work of The GDS, you can see the home page of the Service Manual here [https://www.gov.uk/service-manual]. It's surprisingly deep, with a lot of links off to other topics; it's clear the government have expended a lot of effort on it.
What is "The GDS"?
The GDS is a unit of Her Majesty's Government's Cabinet Office charged with “leading the digital transformation of government”. It came into being during a particularly fraught period in which the government (and especially the NHS) were recovering from the bloody nose that was the £11bn (that is not a mistake, that does say 11 billion pounds) spend on Connecting for Health, the most expensive IT programme in the history of the world.
As a reaction to this, a new strategy was proposed in a report called 'Directgov 2010 and beyond: revolution not evolution', prepared by Martha Lane Fox, the co-founder of lastminute.com. She assembled a team of helpers, including Tom Loosemore (who ended up becoming a right-hand man of Mike Bracken, and later left with him to join the Co-op). [https://tom.loosemore.com/]
They had some initial success, which gave them impetus:
“We run GOV.UK, the best place to find government services and information. It began as an alpha, built in just 10 weeks, and has grown to become part of our national digital infrastructure. It's always being improved in response to user research and user feedback.”
NB: Despite the “success story” above, the path to implementing GOV.UK was difficult for those involved with it (knowing some of the people on the inside, I can say it was considered to be “high noon”).
Subsequent to this, The GDS as we now know it was formed in April 2011 to implement the 'Digital by Default' strategy [https://webarchive.nationalarchives.gov.uk/20160609173223/https://www.gov.uk/service-manual/digital-by-default-26-points], but only started in earnest when, in May of the same year, Mike Bracken was formally announced as the new Executive Director for Digital in the Cabinet Office in the following statement:
“The role will combine the work of the Chief Executive of Directgov, the lead of cross-Government digital reform work and part of the work of the Director for Digital Engagement and Transparency. He will report to Ian Watmore, the Government’s Chief Operating Officer, be based in the Cabinet Office and will be responsible for over 100 staff in the Government Digital Service, of which Digital Engagement is part.”
In 2012, the GDS Design Principles were published, gaining a lot of attention for being seemingly very progressive [https://www.gov.uk/design-principles], and the 'Digital by Default' site became the 'Service Standard' [https://www.gov.uk/service-manual/service-standard].
“The Digital Service Standard provides the principles of building a good digital service. This manual explains what teams can do to build great digital services that will meet the standard.”
The upshot of the above is that ALL new government software projects must follow the principles, practices and standards laid out in the GDS Service Manual.
Examples of the Digital Standards
In order to understand what is expected of a delivery under the GDS Service Manual, I've included some examples, along with how they may be misused or blindly followed to produce a negative outcome. Don't get me wrong, they're all good principles and standards, but they often lack the backing caveats needed to keep people "honest" about how they deliver software.
“ Start with user needs
Service design starts with identifying user needs. If you don’t know what the user needs are, you won’t build the right thing. Do research, analyse data, talk to users. Don’t make assumptions. Have empathy for users, and remember that what they ask for isn’t always what they need”
The above is critical to Digital, but it often comes at the expense of an overall project/programme vision of the end goal. If you don’t know what your end state (or at least a mature product) looks like, or when you’ll get there, you risk a meandering and expensive endeavour, or worse, just building a white elephant that nobody wants to use.
“ Iterate, then iterate again
The best way to build good services is to start small and iterate wildly. Release Minimum Viable Products early, test them with actual users, move from Alpha to Beta to Live adding features, deleting things that don’t work and making refinements based on feedback. Iteration reduces risk. It makes big failures unlikely and turns small failures into lessons. If a prototype isn’t working, don’t be afraid to scrap it and start again"
This is a modern way to deliver, and most agile organisations would advocate the delivery of an MVP. But this way of delivering doesn’t explicitly consider the “risk-first” approach, i.e. do the difficult architectural bits first. Releasing prototypes into alpha, beta and then into live, with the promise of later iteration, is risky business, mainly because of business pressure to keep delivering functionality. The “alpha” you developed to throw away will often become a support problem later on and won’t easily get redeveloped, as the next thing will be on the horizon. The kind of prototyping encouraged should definitely not be considered an alpha. Arguably, the process is missing a stage, because the words alpha and beta have specific meanings in software delivery.
“ Be consistent, not uniform
We should use the same language and the same design patterns wherever possible. This helps people get familiar with our services, but when this isn’t possible we should make sure our approach is consistent.
This isn’t a straitjacket or a rule book. Every circumstance is different. When we find patterns that work we should share them, and talk about why we use them. But that shouldn’t stop us from improving or changing them in the future when we find better ways of doing things or the needs of users change.”
“ Make things open: it makes things better
We should share what we’re doing whenever we can. With colleagues, with users, with the world. Share code, share designs, share ideas, share intentions, share failures. The more eyes there are on a service the better it gets — howlers are spotted, better alternatives are pointed out, the bar is raised.
Much of what we’re doing is only possible because of open source code and the generosity of the web design community. We should pay that back.”
The above all makes some sense and the aspirations of Digital sound sensible. However, looking at some of the GDS principles, it’s easy to see how the banner of “Digital” gets misinterpreted and/or misused.
"What to consider when choosing technology
When choosing technology, the most important thing is to make choices that allow you to:
* change your mind at a later stage
* adapt your technology as your understanding of how to meet user needs changes"
How often does a technology choice genuinely allow for a change of mind at a later stage, and if it does, how much YAGNI (You Ain't Gonna Need It!) effort went into making that possible? Why not just do a bit more thinking (and technology diligence) up-front?
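To make that trade-off concrete, here's a minimal sketch (in Python, with hypothetical names, not taken from any GDS guidance) of what "being able to change your mind later" usually costs in practice: a storage interface that the rest of the service codes against, plus one adapter per technology you might want to swap in.

```python
from abc import ABC, abstractmethod
from typing import Optional


class CaseStore(ABC):
    """Port: the only storage interface the rest of the service is allowed to see."""

    @abstractmethod
    def save(self, case_id: str, payload: dict) -> None: ...

    @abstractmethod
    def get(self, case_id: str) -> Optional[dict]: ...


class InMemoryCaseStore(CaseStore):
    """Adapter used for prototypes and tests; swappable without touching callers."""

    def __init__(self) -> None:
        self._rows: dict[str, dict] = {}

    def save(self, case_id: str, payload: dict) -> None:
        self._rows[case_id] = payload

    def get(self, case_id: str) -> Optional[dict]:
        return self._rows.get(case_id)


def register_case(store: CaseStore, case_id: str, payload: dict) -> None:
    # Service logic depends only on the port, so a later move to Postgres,
    # MongoDB or anything else means writing one new adapter, not a rewrite.
    store.save(case_id, payload)
```

Every one of those seams is extra code to write, test and maintain before anyone knows whether the flexibility will ever be used, which is exactly the YAGNI cost the guidance glosses over.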
“use standard government technology components that are common across all services - read the Government as a Platform guide to learn more about this”
This directive seems to have been ignored entirely: the variety of technologies employed within government runs the gamut from .NET, Java, Python and Ruby to Oracle, MySQL, SQL Server, PostgreSQL, MongoDB, Redis and Riak, to name just a handful. Please note, I'm not criticising any individual person, department or initiative, but there is too much technology diversity going on, and that is expensive.
The GDS Assessment
It's inevitable that, at some point, your service will be assessed and this will be performed by one of the GDS assessors. I've seen several of these in motion and it's not always pretty. I've included a few of the criteria here to give you a flavour, but left a lot out for brevity.
Iterate and improve frequently – how you’ll be assessed:
- how you’re practising zero downtime deployments in a way that doesn’t stop users using the service
- how you plan to have enough staff to keep improving the service
The challenge with this one is: how many services actually need zero downtime, given the costs involved in implementing it?
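To give a sense of what "zero downtime" actually demands of a team, here's a rough sketch (hypothetical, Python standard library only) of the minimum an individual instance needs during a rolling deploy: a readiness endpoint the load balancer can poll, plus a graceful drain on shutdown. And that's before you've dealt with the load balancer itself, backwards-compatible database migrations or session handling.

```python
import signal
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

ready = threading.Event()
ready.set()  # instance starts up willing to take traffic


class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path == "/readyz":
            # During a rolling deploy the load balancer polls this; a 503
            # tells it to stop routing new requests to this instance.
            self.send_response(200 if ready.is_set() else 503)
            self.end_headers()
        else:
            self.send_response(200)
            self.end_headers()
            self.wfile.write(b"hello\n")


def drain(signum, frame):
    # On SIGTERM (sent by the orchestrator when a new version rolls out),
    # stop advertising readiness, let in-flight requests finish, then exit.
    ready.clear()
    threading.Timer(10.0, server.shutdown).start()


server = ThreadingHTTPServer(("0.0.0.0", 8080), Handler)
signal.signal(signal.SIGTERM, drain)
server.serve_forever()
```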
Evaluate tools and systems – how you’ll be assessed:
To pass point 6 in the alpha assessment, you usually need to describe:
- the languages, frameworks and other technical choices you’ve made in alpha, and how this will affect the decisions you make in beta
- how you’ll monitor the status of your service
The rush to alpha, and the fact that the monitoring requirements aren't well brought out in the Service Manual, results in services that have little or no logging or monitoring built in.
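Baking minimal observability in at alpha is cheap compared with retrofitting it at beta. As an illustration, here's a sketch (hypothetical service name, Python standard library only) of structured, machine-readable logging that any log aggregator could index from day one:

```python
import json
import logging
import time


class JsonFormatter(logging.Formatter):
    """Emit one JSON object per log line so a log aggregator can index the fields."""

    def format(self, record: logging.LogRecord) -> str:
        return json.dumps({
            "ts": self.formatTime(record),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
            # carry request/correlation details if the caller supplied them
            **getattr(record, "context", {}),
        })


handler = logging.StreamHandler()
handler.setFormatter(JsonFormatter())
logging.basicConfig(level=logging.INFO, handlers=[handler])
log = logging.getLogger("referral-service")  # hypothetical service name

started = time.monotonic()
# ... handle a request ...
log.info("request handled", extra={"context": {
    "path": "/apply",
    "status": 200,
    "duration_ms": round((time.monotonic() - started) * 1000),
}})
```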
To pass point 6 in the beta assessment you usually need to explain:
- how you’re managing the limits placed on your service by the technology stack and development toolchain you’ve chosen
The lack of up-front NFRs (non-functional requirements) means that "limits" may not be well understood, because there's nothing to compare against.
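One way to mitigate this is to write the key NFRs down as executable checks as soon as they're agreed, so there is always something concrete to compare against. A rough sketch (hypothetical budget and placeholder endpoint, Python standard library only):

```python
import statistics
import time
import urllib.request

# Hypothetical NFR agreed up front: 95th-percentile response time for the
# journey's start page must stay under 500 ms.
P95_BUDGET_MS = 500
SAMPLES = 50
URL = "http://localhost:8080/start"  # placeholder; point at the real service

timings_ms = []
for _ in range(SAMPLES):
    started = time.monotonic()
    urllib.request.urlopen(URL, timeout=5).read()
    timings_ms.append((time.monotonic() - started) * 1000)

p95 = statistics.quantiles(timings_ms, n=20)[18]  # 19 cut points; index 18 ~ 95th percentile
print(f"p95 = {p95:.0f} ms (budget {P95_BUDGET_MS} ms)")
assert p95 <= P95_BUDGET_MS, "NFR breached: service is slower than the agreed limit"
```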
To pass point 6 in the live assessment, you usually need to:
- describe the tech stack changes you made during beta and why
- describe the development toolchain changes you’ve made during beta and why
- explain how you’re continuing to get value for money from the systems you chose and bought at beta
- explain or demonstrate how you’ll check if the service is healthy
- explain the support arrangements that you’ve set up for live
That's a lot to put on a small beta team trying to move their service into live. It often stops the project in its tracks.
Make all new source code open
To pass, you usually need to:
- explain how you’re making new source code open and reusable
- show your code in an open internet source code repository
- describe how you accept contributions and comments on the code
- explain how you’re handling updates and bug fixes to the code
- explain the code you’ve not made open and why
The problem with the above is the phrase “new source code” – new features in an existing system are considered to be “new source code”, so existing systems are being forced down this route.
- explain the licences you’re using to release code
- confirm that you own the intellectual property
This is an extra overhead that delivery teams traditionally would not be charged with.
- explain how a team in another department can reuse your code
Not all code is destined to be reusable, so why force reusability as a quality?
How does it work in practice?
One of the telling quotes from Mike Bracken sums up the key challenge with the current GDS way of thinking:
“The people who can ‘find the quick do’ as one of my business cards says, would much rather actually deliver than try and influence policy makers. While many digital issues require clear policies, many more do not. What they require is very quick delivery of a working version of the product.”
This approach is fine for non-complex “website” deliveries, but applying it to something complex and enterprise-scale (such as the Choose and Book replacement, the e-Referral Service) leads to a lot of “regret work” and possibly an entirely throwaway set of deliverables. Many users would take "right" over "delivered quickly"; this is what we were told during the development of e-Referrals.
Some of the key challenges observed with the GDS “way”:
- It’s “assessed” and you pass or fail that assessment. This experience really depends on your assessor
- This happens after the fact, which is always more expensive to fix
- It presumes that “one size fits all”, including internal projects that are there to solve a business problem rather than provide a modern user experience
- It encourages departments to release a lot of “beta” software with no clear path to integration into a live service
- Support and maintenance are barely mentioned in the Service Manual, resulting in a product that can only be supported by the delivery team, which is expensive post-implementation
- Asking existing projects to "open-source" when a) there's no real reusability and b) their code wasn't designed to be open-sourced just adds cost and pressure to the team
- It encourages the disintermediation of any kind of Architectural oversight
- It encourages fragmented services with no real cohesion, which looks unprofessional and is a poor user experience
But the real problem is (and this is fundamental) that some parts of the government won't and can't transform internally, which is a prerequisite for Digital Transformation, because they are so huge and befuddling. Some departments have made it work (and we've seen the fruits of that; it's beautiful when it goes well), but I fear that the painful failures may well greatly outnumber the success stories. If the history of software delivery has taught us nothing else, it's that it only takes a couple of high-profile failures for people to get the fear and throw the baby out with the bath-water.
A Message of Hope - How might it change for the better?
Luckily, the GDS is a living organisation that is making changes all the time, and I'm confident that they, as industry thought leaders, will be able to turn this around into something truly great. But how might they turn the GDS Service Manual into something more “helpful” for those implementing the guidelines? In my humble opinion, the following would be a good start:
- Strengthen the message on quality, all the way through the Service Manual
- Work on the cohesion of services: take a step back, have a strategy for a service that ties it all together, and ensure the team are aware of it!
- Get Architecture, Testing and Support involved early to ensure that the technology elements of the service make coherent sense
- Define NFRs early and ensure the tech-stack can meet them
- Consolidate technologies, reducing support and infrastructure costs
- Make the “assessment” more collaborative rather than punitive
- GDS could seed teams with delivery experts to ensure the right things are being done from the start
- Recognise that not ALL projects have to be delivered “user first” or indeed "Digital"
- Even those that do, should have a concrete business vision
- Allow for technology diversity, but ensure that there’s architectural oversight
- Engage Service/Support from the beginning of the project. Is this a service that can be supported by the existing team? What kind of upskilling is required?
End Transmission - Until Part 3
In the aftermath of people moving around between organisations as The GDS changed shape, The GDS Service Manual was replicated in part by the Co-op [https://coop-design-manual.herokuapp.com/], which, given Mike Bracken's involvement, should come as no surprise.
What does the future hold for the Service Manual? I hope it survives, as much of it is excellent... But it needs to evolve and improve to meet the needs of real delivery, and it needs some accompanying education, as people clearly do not understand its depth and subtlety.