DevOps - Business Transformation
Abhi Chaturvedi
Program | Senior Project Manager, Agile Practitioner |Transformation Management
1. Narrative
My journey to the cloud at scalenow.com.au began in early 2005. Orders for garments, sports gear, and medical devices were delayed by corruption of the database, resulting in a loss of revenue for the second quarter. This is when I realised I had to transform my company. I had to move away from vertically scaled single points of failure, like relational databases in the data center, towards highly reliable, horizontally scalable, distributed systems in the cloud. My organisation was good at keeping the system running and maintaining service level agreements, but lacked innovation and vision, very similar to the figure on the left. The staff were well equipped to get the job done under any circumstances, through thick and thin.
Jessica, one of my associates, felt the managers were not engaged. "Hey Abhi, I personally feel Steve does not get the concept of innovation. We are living in an era of resilience. I am struggling to understand our vision and goals and the ways of achieving them. I hear a lot of 'ummms and mumms' and lots of rhetorical statements," gasped Jessica.
"It's not about the problem, Jessica! Steve gets it, but he does not want to get involved in change and resolution. His expertise is keeping the system operational with band-aid solutions. You need to demonstrate the constraints and show him the challenges in order to receive any support for enhancements from him," puffed Abhi.
2. Establishment
I thought of sharing a high-level understanding of a collaborative culture that maximises the flow of work from the business to the customer by creating fast and consistent feedback loops.
The concept would aid in maintaining trust, collaboration, and learning amongst the team to promote the stability, reliability and upgradability of applications. The intent was to align development and operations environments with greater goals of business serviceability.
I went across to the kitchen wall and started to scribble my vision and the associated factors. I could sense a mix of dismay, perturbation, signs of dyslexia, Friedman-Goodman syndrome, and curiosity among the coffee goers as to why I was scribbling on the wall.
Vision statement: "Make commerce better for everyone, so businesses can focus on what they do best: building and selling their products."
I was joined by Jessica and her team members. We started to plot the end-to-end process, making it visible to all stakeholders to drive central prioritization of work.
The intent was to perform a gap analysis between the "as-is" and target states, optimising the value stream to maximise flow with a focus on both quality and speed. A shared board would give operations and development visibility of the flow of work into production and support a robust, fast flow of value.
3. Create a dedicated transformation team
We established shared goals on quality, availability, and security, and ensured that responsibility lies with everyone in the development process. We then assigned dedicated resources to the value stream. The aim was to create cross-functional, self-managed, autonomous, empowered, long-lived teams that focus on achieving organisational and customer outcomes such as revenue, value and serviceability.
Our remit was to avoid splitting teams by function or by architectural layer – instead, structure teams around independent flow of value to the customer.
The team was responsible for testing and for ensuring the smooth running of operations and security services.
Work-in-progress (WIP) limits were defined at each stage of the process to avoid multitasking and to reduce batch sizes by limiting the amount of in-flight work.
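As a rough illustration of how a work-in-progress limit behaves on a shared board, here is a minimal Python sketch. The stage names, limits, and work item names are hypothetical and not taken from the team's actual board; the point is only that a stage refuses new work once it is full.

```python
from dataclasses import dataclass, field


class WipLimitExceeded(Exception):
    """Raised when a stage already holds as many items as its limit allows."""


@dataclass
class Stage:
    name: str
    wip_limit: int
    items: list = field(default_factory=list)


class Board:
    def __init__(self, stages):
        # Index the stages by name so work items can be pulled between them.
        self.stages = {stage.name: stage for stage in stages}

    def pull(self, item, into, from_stage=None):
        """Move an item into a stage, refusing if that stage is at its limit."""
        target = self.stages[into]
        if len(target.items) >= target.wip_limit:
            raise WipLimitExceeded(f"{into} is at its limit of {target.wip_limit}")
        if from_stage is not None:
            self.stages[from_stage].items.remove(item)
        target.items.append(item)


# Hypothetical board: at most two items in development, one in test.
board = Board([Stage("dev", 2), Stage("test", 1)])
board.pull("order-export", into="dev")
board.pull("invoice-fix", into="dev")
try:
    board.pull("new-report", into="dev")  # a third item breaches the dev limit
    blocked = False
except WipLimitExceeded:
    blocked = True
```

Because the third item is refused, the only way to start new work is to finish something already in flight, which is what keeps batch sizes small.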
The long-term intent was to reduce the number of handoffs by automating as much of the development process as possible, reorganizing development teams to have all the capabilities required to develop, test, release, and maintain their code in production.
The challenge was to continually identify and remove the most significant bottleneck impacting speed of delivery, creating change-tolerant architectures and automation throughout development and release.
4. Architecture
The next activity was to architect for low-risk releases and to enable productivity, testability, and safety by establishing a loosely coupled architecture with well-defined interfaces. These interfaces would enforce how services connect with one another. Decoupled services can be independently maintained and deployed, with no shared data structures and with clearly defined boundaries.
Monolithic architectures are fine for early-life companies but may not scale, hence the team decided to establish a loosely coupled architecture with an adaptable design, using the strangler pattern.
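To make the strangler pattern more concrete, the following Python sketch shows one common way to picture it: a facade sits in front of the monolith and routes each request either to a newly extracted service or back to the legacy code. The handler names and URL prefixes are invented for illustration and say nothing about how the team's actual services were split.

```python
from typing import Callable, Dict

Handler = Callable[[str], str]


def legacy_monolith(path: str) -> str:
    # Stand-in for the existing monolith, which initially handles everything.
    return f"monolith handled {path}"


def orders_service(path: str) -> str:
    # Stand-in for a newly extracted, independently deployable service.
    return f"orders service handled {path}"


class StranglerFacade:
    """Routes migrated path prefixes to new services, everything else to legacy."""

    def __init__(self, fallback: Handler):
        self._fallback = fallback
        self._routes: Dict[str, Handler] = {}

    def migrate(self, prefix: str, handler: Handler) -> None:
        # Each call moves one slice of functionality away from the monolith.
        self._routes[prefix] = handler

    def handle(self, path: str) -> str:
        # The longest matching prefix wins; unmatched paths fall back to legacy.
        for prefix in sorted(self._routes, key=len, reverse=True):
            if path.startswith(prefix):
                return self._routes[prefix](path)
        return self._fallback(path)


facade = StranglerFacade(legacy_monolith)
facade.migrate("/orders", orders_service)
print(facade.handle("/orders/42"))
print(facade.handle("/invoices/7"))
```

Callers only ever talk to the facade, so functionality can be moved one prefix at a time until nothing is left for the monolith to serve.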
The intent was to design software with architecture, performance, stability, testability, configurability, and security prioritized into the work by injecting resilience patterns into daily work. We practised relentless experimentation: testing the capacity and resilience of code by trying to break it, and using the learnings to create antifragile systems.
5. Modus Operandi
Our "Modus Operandi": start with the most sympathetic and innovative people who already believe in DevOps, and focus on creating early successes with them to build a coalition for change.
The most successful introductions of transformation usually start small. Each success creates another group of passionate evangelists who can hardly wait to tell others in the organization about this new "way of working".
Our intention was to fix problems as they occur, and to build a psychologically safe environment in which people raise concerns in real time, enabling organizational learning and a safety culture.
We promoted and adopted a generative culture where failure leads to inquiry, and information, including risks, is freely shared. We stayed adaptive in planning improvements, worked in short iterations of change, measured outcomes, and incorporated past learnings in new initiatives.
We established a set of principles to follow.
6. DevOps Environment Setup
We created shared services to increase developer productivity: a set of centralized platforms and tooling providing automated environments, testing, and common version control.
The creation of shared services laid the foundation of a single repository of truth for the entire system: all application code, scripts, schemas, environment creation tools, containers, tests, and other technical artefacts in a common source control location. (As per the figure, Development and Operations are to be friends rather than opponents.)
The goal was to make infrastructure easier to rebuild than repair by establishing immutable infrastructure where manual changes to production are restricted. We created a single, central shared source code repository that stores all tools, libraries, infrastructure, config, and source for deploying all environments.
The team ensured the security of the application by including static and dynamic analysis, dependency scanning, and code integrity checks in testing. (As per the figure, Security Calling, Trin!! Trin!! Trin!!)
The developers secured the software supply chain by ensuring all packages and dependencies used were up to date and met the same security tests required of the platform as a whole.
The team confirmed the security of the environment by establishing known good states of environments and automating the monitoring of all production instances against those good states.
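One simple way to picture monitoring against a known good state is to fingerprint the approved configuration and compare every instance against it. The Python sketch below assumes each instance's configuration can be read as a plain dictionary; the instance names and settings are hypothetical.

```python
import hashlib
import json


def fingerprint(config: dict) -> str:
    """Return a stable hash of a configuration, independent of key order."""
    canonical = json.dumps(config, sort_keys=True)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def find_drift(known_good: dict, instances: dict) -> list:
    """Return the names of instances whose config differs from the known good state."""
    expected = fingerprint(known_good)
    return sorted(
        name for name, config in instances.items() if fingerprint(config) != expected
    )


# Hypothetical known good state and two production instances.
known_good = {"os_patch_level": "2024-05", "ssh_root_login": False}
instances = {
    "web-1": {"ssh_root_login": False, "os_patch_level": "2024-05"},
    "web-2": {"ssh_root_login": True, "os_patch_level": "2024-05"},  # manual change
}
drifted = find_drift(known_good, instances)
```

With immutable infrastructure, a drifted instance would typically be rebuilt from the repository rather than repaired by hand.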
7. Continuous Development
Continuous delivery is a software engineering approach in which teams produce software in short cycles, ensuring that the software can be reliably released at any time.
We then established non-functional requirements as a baseline for new services to achieve operational objectives. We started to continuously build, test, and integrate our code by automatically building and testing in a production-like environment.
Further, we instructed the team to adopt trunk-based development practices, i.e. developers check their code in to trunk at least once per day to limit the batch size of changes.
The target was to build a culture of cohesiveness, and we even asked developers to directly observe real users working with their software and to understand the challenges users face.
The aim was to promote a learning culture that embraces failure as a trigger for inquiry and learning, not for scapegoating and blame. We started to promote pair programming to improve changes and spread knowledge, developing in small testable batches and adopting practices like Test Driven Development / Behavior Driven Development.
This led to the relentless cutting of bureaucratic processes, reducing the effort required for engineers to perform work and deliver it to the customer, with light controls and high automation. (As per the figure, Keep on Developing.)
The team started to catch errors in automated testing, detecting issues as early and as fast as possible (e.g. with unit tests) by building a fast and reliable automated validation test suite and automating all layers of testing.
The team ensured tests ran quickly (in parallel, if necessary) and automated the triggering and running of tests on source check-in, rather than waiting for manual approval or a trigger from developers. The aim was to automate as many manual tests as possible: the team started with a small set of fully reliable automated tests and added to it iteratively, automating all steps across the deployment process to enable low-risk releases.
The developers created a code promotion process that Dev or Ops could run without manual intervention to build, test, and deploy the software.
8. Continuous Integration
Continuous Integration (CI) is a development practice where developers integrate code into a shared repository frequently, preferably several times a day. Each integration can then be verified by an automated build and automated tests.
The goal of Continuous Integration is to provide rapid feedback so that a defect can be remediated and corrected as soon as possible. Continuous Integration software tools can be used to automate the testing and build a document trail.
Living up to the notion of Continuous Integration, the team integrated performance testing into the test suite by writing automated performance tests that validate the entire application stack as part of the deployment pipeline.
Tests included validation of system attributes (integration of non-functional requirements): supported applications, compilers, OS, and any other dependencies.
One of the developers suggested integrating A/B testing into our daily work, using the feature hypothesis: "We believe (action) will result in (result); we will have confidence to proceed when we see (measure)." We release two versions of a feature or product, diverting a number of users to the control ("A") and the rest to the variant ("B"), and then apply statistical analysis to the results.
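As a hedged sketch of the A/B testing idea above, the Python below shows two pieces: a deterministic way to divert users to "A" or "B", and a two-proportion z-test as one possible form of the statistical analysis. The article does not say which test the team used, so the z-test, the function names, and the sample numbers are all assumptions made for illustration.

```python
import hashlib
import math


def assign_variant(user_id: str, experiment: str, share_in_b: float = 0.5) -> str:
    """Deterministically place a user in the control ("A") or the variant ("B")."""
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode("utf-8")).hexdigest()
    bucket = int(digest[:8], 16) / 0x100000000  # evenly spread over [0, 1)
    return "B" if bucket < share_in_b else "A"


def two_proportion_z_test(conv_a: int, n_a: int, conv_b: int, n_b: int):
    """Compare two conversion rates; returns (z, two-sided p-value)."""
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / std_err
    # Two-sided p-value under the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical result: 100 of 1000 users convert on A, 150 of 1000 on B.
z, p_value = two_proportion_z_test(100, 1000, 150, 1000)
proceed = z > 0 and p_value < 0.05
```

Hashing the user id keeps each user in the same group across visits, and the p-value gives the "measure" in the feature hypothesis a concrete threshold.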
The team ensured all code was reviewed prior to release, keeping changes small to streamline review and release practices, and automated and integrated testing into daily work, ensuring a flow of changes into production with high release frequency.
Security is often overlooked, neglected, and handled separately. The team integrated security into each development iteration, and into the acceptance criteria and Definition of Done for user stories.
The developers integrated security controls into source code and services, and centralised a set of pre-validated, security-blessed libraries that are maintained and pulled in real time during the CI/CD pipeline.
The team further integrated security into the deployment pipeline and created security tests that run as part of the pipeline for every committed change.
9. Continuous Deployment
Continuous deployment is a strategy for software releases wherein any code commit that passes the automated testing phase is automatically released into the production environment, making changes that are visible to the software's users.
The architecture was based on decoupled services, which can be independently maintained and deployed. Deployments were decoupled from releases: the team adopted environment-based or application-based release patterns to separate deployment from customer release.
The team leveraged patterns to improve speed and ease of deployment by implementing feature toggles or dark launches to control visibility of changes.
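A feature toggle can be as small as a lookup that decides which code path a user sees. The Python sketch below is a minimal, in-memory illustration; the flag name and the checkout example are hypothetical, and a real system would typically load flags from configuration rather than set them in code.

```python
class FeatureToggles:
    """Minimal in-memory feature toggle store."""

    def __init__(self, flags=None):
        self._flags = dict(flags or {})

    def set(self, name: str, enabled: bool) -> None:
        self._flags[name] = enabled

    def is_enabled(self, name: str) -> bool:
        # Unknown flags default to off, so newly deployed code stays dark.
        return self._flags.get(name, False)


def checkout_page(toggles: FeatureToggles) -> str:
    # Both code paths are deployed; the toggle controls which one users see.
    if toggles.is_enabled("new_checkout"):
        return "new checkout flow"
    return "current checkout flow"


toggles = FeatureToggles()
before = checkout_page(toggles)   # code is deployed but not yet released
toggles.set("new_checkout", True)
after = checkout_page(toggles)    # released without another deployment
```

This is what separates deployment from release: the new code ships dark, and turning it on or off is a configuration change rather than a new deployment.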
To minimise the risk of deployment failure, the team used telemetry to make deployments safer by actively monitoring the metrics associated with a feature during deployment.
The developers created security telemetry in applications to identify insecure practices or behaviours in system operations and to flag appropriate alert levels. They also established telemetry in environments to monitor changes to OS, security, config, and infrastructure, as well as server errors.
The team further protected the deployment pipeline by hardening the Continuous Integration and Deployment processes: reviewing all changes in version control, detecting suspicious API calls, and isolating dead Continuous Integration processes.
To protect the deployment pipeline, the team also integrated security and compliance into change approval, leveraged ITIL's standard/normal/urgent change classifications, and incorporated security assessment into those to meet compliance needs.
10. Monitoring
The monitoring team created a centralized telemetry infrastructure by centralizing logging, transforming the logs into valuable metrics, and then applying statistical analysis to identify patterns that trigger actionable events. (As per the figure, Monitoring.)
They further introduced application logging telemetry and ensured every feature provided telemetry, and created logging hierarchies for both non-functional and feature attributes.
Telemetry was used to guide fact-based problem solving, using the scientific method to create and test hypotheses and learn from the results.
Telemetry is the collection of measurements or other data at remote points and their automatic transmission to receiving equipment for monitoring.
The team made the creation of production metrics part of daily work, and provided central, easy-to-use infrastructure and libraries so that development and operations could create telemetry for all new functionality.
The technocrats created telemetry at all levels of the application stack, for all environments, and throughout the entire deployment pipeline. The team started to analyse telemetry to anticipate problems, using means and standard deviations to detect them.
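Using the mean and standard deviation to detect problems can be sketched in a few lines of Python. The example assumes a baseline of recent, healthy measurements and flags a new value that falls more than a chosen number of standard deviations from the baseline mean. The threshold of three and the response-time figures are illustrative assumptions, and the approach works best when the metric is roughly normally distributed.

```python
import statistics


def is_anomalous(baseline, value, threshold=3.0):
    """Flag a value more than `threshold` standard deviations from the baseline mean."""
    mean = statistics.mean(baseline)
    stdev = statistics.stdev(baseline)  # sample standard deviation
    if stdev == 0:
        # A perfectly flat baseline: any different value is a deviation.
        return value != mean
    return abs(value - mean) > threshold * stdev


# Hypothetical response times in milliseconds from a healthy period.
baseline_ms = [100, 102, 98, 101, 99, 100, 103, 97]

normal_reading = is_anomalous(baseline_ms, 104)
slow_reading = is_anomalous(baseline_ms, 110)
```

For this baseline the mean is 100 ms and the sample standard deviation is 2 ms, so 104 ms stays inside the three-sigma band while 110 ms falls outside it and would raise an alert.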
The team members agreed to schedule blameless postmortem meetings after incidents, bringing all stakeholders together to understand the timeline of events, identify root causes, and capture blameless learnings. All team members were given self-service access to production telemetry and information radiators.
The team published post-mortems as widely as possible to make the findings and actions of post-mortems transparent all the way through to the customer, if possible. The goal was to spread the knowledge, so others can learn from it.
The bottom line was to redefine failure and encourage calculated risk-taking by failing faster and more often, identifying it as a learning opportunity and applying the necessary correction to prevent recurrence.
11. Convert Local Discoveries into Global Improvements
Most successful introductions of transformation usually start small. New "ways of working" spread to other functions, with the original practitioners acting as coaches. Each success creates another group of passionate evangelists who can hardly wait to tell others in the organization about this new way of working.
The team spearheaded the concept by spreading knowledge through documentation and Communities of Practice, and by developing tests that document the code, showing engineers working examples of how to use the system.
Members started to share experiences from conferences, applying and experimenting with what they had learned and fostering the relationships built there for continuous learning from peers. They dedicated regular time to learning and teaching, and committed to preventing it from being deprioritized in favour of other operational work.
12. Conclusions
Transformation is about improving performance, not just cutting costs. Companies boost the odds of achieving breakthrough results when they simultaneously improve their operating discipline and make portfolio moves that collectively redefine their business.
Digital transformation worked for the organization because the leaders went back to the fundamentals: they focused on changing the mindset by being collaborative, and improved organizational culture and processes before they decided which digital tools to use and how to use them. What the members envisioned as the future of the organization drove the technology, not the other way around.
Disclaimer: the opinions and perspectives shared in this article are those of the author, with references mentioned below. Abhi, Jessica, and Steve are fictional characters and bear no resemblance to anybody living or dead.
The two-day, interactive course helps people across technical, non-technical, and leadership roles work together to optimize their value stream from end to end. Attendees will learn what DevOps is, why it is important to every role, and design a continuous delivery pipeline that is tailored to their business. Attendees work in cross-functional teams to map their current state value stream from concept to cash, identify major bottlenecks to flow, and build an actionable implementation plan that will accelerate the benefits of DevOps in their organization.
For more information or to register for an in-house workshop, contact Abhi Chaturvedi on 0422 149 614, [email protected] or visit www.scalenow.com.au
13. References and Acknowledgement
- Tribute to "The DevOps Handbook" by Kim, G., Humble, J., Debois, P., and Willis, J. (2016), IT Revolution Press
- The DevOps Handbook: How to Create World-Class Agility, Reliability, and Security in Technology Organizations by Gene Kim (Author), Patrick Debois (Author), Professor John Willis (Author), Jez Humble (Author), John Allspaw (Foreword)
- Transformation Insights: The CFO's role in helping companies navigate the coronavirus crisis. Article by McKinsey & Company
- Digital Transformation Is Not About Technology, Harvard Business Review
- Picture credits: Gwotyng's Italian Monastary (https://steamcommunity.com), cartoon stock, cartoon collections, agility health radar, me.me, Tom and Jerry (amazon), smartbear, alamy, DevOps radar