The Importance of Commitment

I have been a part of many attempts at The Big Rewrite over the course of my career. Some were emergencies, reactions to a change in circumstances that rendered the old system unusable. Some came from recognizing that an emergency was foreseeable and therefore, presumably, avoidable. One was even an attempt to restart a failed Big Rewrite project, in the hope that bringing in a contract company would be the secret sauce needed to complete the project. It wasn’t. In fact, with one exception, every one of these projects was a complete failure. The results of those failures ranged from a severe scaling back of the enterprise to outright failure of the company. I’d often wondered about the failure to execute when so much was at stake, but it wasn’t until the one time I saw success in The Big Rewrite that I recognized the x-factor, the secret sauce that one enterprise had been looking for, in a story a friend of mine once told me.

This friend, Tom, was a business consultant at the time, specializing in closing very large deals and “doubling your double”, that is, “doubling the rate at which you double your revenue”. He knew and socialized with a lot of big names in the business community, and in the course of a conversation (or rather a series of mini-conversations, as Tom was never able to talk about only one thing at a time) he told me an interesting little tidbit about a business owner named Red Adair. Red Adair was an oil well firefighter and ran a company responsible for capping some of the biggest and most dangerous oil well blowouts in the world. If you owned an oil field, Red Adair was on your speed dial.

According to Tom, however, Adair had a very direct and blunt way of dealing with people. If a customer attempted to contract Adair’s company and argued at all with Adair’s methods or pricing, he would tell them something that I didn’t understand until I started seeing Big Rewrites fail.

“You don’t understand the scope of your problem and you aren’t committed to solving it. Call me back when you’re ready to get serious.”

And then he’d hang up. Conversation over. Red’s work was extraordinarily difficult and even more dangerous. He was one of the few people in the world who had a process for properly minimizing the risk and charged enough to make it worthwhile to his employees. Or, barring that, his employees’ survivors. Mess with the process and it gets too dangerous. Mess with the money and he can’t properly pay his firejumpers. But more importantly, the desire to bargain and control the process showed that the customer didn’t really understand the scope of their problem or the difficulty of the solution. An oil well blowout, that is when an oil well is uncontrollably pumping burning oil out of the ground, will burn until the well is dry. That can take years. In other words, the problem is not going away on its own. And the consequences of ignoring the problem, or even delaying the solution, are disastrous for the environment and your company. You will get fined. You will damage the environment. You will eventually get shut down. And even after you’re shut down, you can still rack up penalties. To sum it up, when you have a burning oil well you have two outcomes. Do whatever it takes to solve the problem quickly or lose everything. Customers who thought there was a third option inevitably failed.

In software development, our problems are rarely so dire. However, our problems often fit that mold and so we can learn from Red Adair’s insistence on understanding the scope of the problem and fully committing to the solution.

Understanding the Scope of the Problem

Regardless of the final architecture, e.g., microservices, async messaging, or simply tackling a COTS upgrade made difficult by custom modifications, it is always a bad idea to take on The Big Rewrite lightly. Both the risk and the cost of failure are entirely too high for that. And “Because we want modern development practices” is not a sufficient motivation, despite the number of times I’ve heard precisely this reason given for completely changing a software stack and development practices. So let me tell you right now, being modern is overrated. So are “best practices”. Blindly adopting modern development practices is nothing more than chasing trends, and best practices are just solutions to problems you don’t have. Neither should be a reason for making small, insignificant changes, much less expansive and risky ones.

However, when (not “if”, but “when”) you hear these tropes, dare I say cliches, given as reasons for The Big Rewrite, it’s important to understand that this is probably not the ultimate goal of the customer. And while it’s one thing to end a conversation with a customer who doesn’t fully understand the level of problem presented by an oil well fire, it is up to us as consultants both to understand what the customer is really telling us and to help the customer better understand what they’re trying to communicate. When a customer tells you that they want to “use best practices” or “adopt modern development practices”, they’re attempting to communicate a problem that they may not fully understand and may not have the vocabulary to articulate. It’s our job to dig deeper, and more importantly, it’s our job to help the customer dig deeper, to fully understand the scope of the problem.

So, what are customers really trying to tell us? Oftentimes, what they’re really telling you is “Our processes are unmanageable, we’re not able to execute because everything takes too long to do, our software is slow and has too many flaws we can’t fix, and this is threatening our ability to continue supporting our business.” This is a hard thing to admit, especially when you’re talking about something you’ve built. The customer thought they were doing everything right, and yet so much has gone wrong that they’ve called in a specialist to help. Human nature, if nothing else, makes that difficult to say out loud. So customers, who are likely pretty good at software development themselves, propose a solution. Solutions are easy to talk about, especially if you don’t see that the mess you’re in exists because you created it. And yet, these quagmires don’t just appear. They aren’t naturally forming, and software practices aren’t something that just naturally degrades over time. These quagmires exist because people build them, one handful of mud at a time, thinking that they’re doing the right thing. And since doing the right thing shouldn’t lead to these kinds of problems (that’s the way the world works, right?), it’s hard to connect the problems of the large enterprise development shop to the way they develop software.

I called out large enterprises not only because these are the places where these problems are most pronounced, but also because they are where it’s easiest to draw the lines between practices and undesirable results. Large enterprises have two major problems when it comes to building solid, usable development practices. The first is a problem with perspective. Large insurance companies see themselves as insurance companies. Large banks see themselves as financial institutions. Large retail companies see themselves as retail companies. However, the hard fact of the matter is that every company that builds its own platform is a software company and needs to see itself as such. The distinction is subtle but important. “We sell insurance and write software to support that” is a different focus than “We create and support an insurance sales platform and use it to sell insurance”. Inverting the area you see as your main competency puts the focus on building a solid, usable platform. The effect of this focus can be best seen by comparing a typical large enterprise to a typical small startup. A startup will ask itself, and decide for itself, how to approach problems such as architecture, code standards, and devops practices, and will create the solutions that work best for it. Startups are willing to adjust on the fly and they are willing to be wrong. Large enterprises, on the other hand, implement practices that other large enterprises implement. “Best practices”, again nothing more than solutions to problems they don’t have, decide how development practices get built. Standardization is a greater focus than deliverability, and teams are expected to do what everyone else does. In other words, startups are willing to accept risk and enterprises are not. The problem is that since large enterprises tend to adopt the practices of other large enterprises, they tend to end up with the same problems as other large enterprises. And this is the second problem that large enterprises face. They do the same things everyone else does without realizing that they will end up with the same problems everyone else has.

This is what customers mean when they say, “we want to adopt modern practices”. They have problems but likely don’t understand the cause of those problems, or how to fix those problems, or even that their “Best Practice First” approach is the problem that needs fixing. But a customer who does not understand this does not fully understand the scope of their problem and will likely fail in The Big Rewrite. Which means that, as consultants, it’s our job to help customers understand. And to do that, it’s important to remember that you can’t get your customer to understand the problem by explaining it to them. You need to show them. Show them how fast a simple GitHub deployment can be, and when they point out that it’s easy to do a safe deployment on a simple application, agree with them. Agree that when it comes to deployments simpler equals safer, and then go on to tell them that the best way of handling the risk of deploying new features isn’t to mitigate the risks but rather to simplify the deployment so that there is less risk to mitigate. Show them how you can deploy changes to a microservice without risking downtime to consuming clients, and then show them how quickly and easily you can roll back that change through deployment slots.
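To make the deployment slot idea concrete, here is a minimal sketch of why rollback is cheap when new code only ever lands in a slot that serves no traffic. This is a toy model and an assumption on my part, not any cloud provider's API; the `SlotPair` class and the version labels are hypothetical names for illustration.

```python
from dataclasses import dataclass


@dataclass
class SlotPair:
    """Toy model of two deployment slots (hypothetical, not a real cloud API)."""
    production: str  # version currently serving traffic
    staging: str     # version sitting in the slot that serves no traffic

    def deploy_to_staging(self, version: str) -> None:
        # New code is only ever written to the slot that serves no traffic.
        self.staging = version

    def swap(self) -> None:
        # Promotion and rollback are the same operation: exchange the slots.
        self.production, self.staging = self.staging, self.production


slots = SlotPair(production="v1", staging="v1")
slots.deploy_to_staging("v2")

slots.swap()                    # promote v2 to production
promoted = slots.production

slots.swap()                    # something looks wrong: swap back
rolled_back = slots.production
```

The point of the sketch is that the old version is never overwritten during promotion, so rolling back is just a second swap rather than a redeployment.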

A burning oil well has the advantage of being an obvious deviation from the norm. Oil wells aren’t supposed to be on fire, it’s pretty obvious when one is burning, and understanding the problem is not difficult. However, most software development ecosystems, that is a combination of practices and the code base that comes from those practices, are already a burning oil well. We just don’t see it because the burning oil well is the only environment we’ve known. Don’t talk about the problem until you’ve shown them what life is like when the fire is out.

Committing to the Solution

Understanding the problem is only part of the reason why most Big Rewrites fail. Once you understand the scope of your problem, you still have to do the work to fix it. That’s no small task, assuming that the organization isn’t willing to suspend business for a couple of years while you sort out their software systems. Tearing down a monolith and re-implementing it as a service-oriented architecture is an iterative process. You identify a set of candidates for initial refactoring, preferably something easy and non-essential so you have a wide margin of error. You identify use cases. You break down the domain. You identify your service boundaries, create your aggregates, deploy your new database(s), deploy the new service, verify that it works correctly, migrate your data, and finally you have a microservice. And now the hard part begins.

Microservices are pretty worthless if they aren’t used, but they’re only slightly more useful when viewed as part of an application. They aren’t. They are a cross-cutting system that can, and should, be made available to multiple applications. You don’t have one Email service per application, you have one for the enterprise. You don’t have one Tax Calculation service per application, you have one for the enterprise. Which means that when you deploy a new microservice, everyone else who performs the function that the new microservice handles needs to refactor and use the new service. Otherwise, you don’t get the benefit of the service. Unfortunately, this is one of those “Simple But Difficult” types of problems. Committing to a constant Test-Refactor-Test cycle that will take years to complete requires an insanely high level of organizational discipline. Certainly a level of discipline that you will not find in a company that doesn’t understand the scope of their problem, which is why that is so critical. Creating an enterprise service layer will eventually require work from functional areas other than the team developing the services. And since these other teams will be busy with their own work, delivering their own commitments on their own schedule, getting the necessary work of integrating the new microservice(s) into their application or system requires a certain amount of deftness, negotiation, and preferably a mandate and solid backing from IT leadership. Not just in the beginning when microservices are a fresh idea and the whole effort still has that New Project excitement going for it, but also at year two when you’re coming back to functional areas for the fourth or fifth time asking for a refactor and test cycle. Or when the CEO, who probably doesn’t know anything at all about software development since the company doesn’t see itself as a software development company, asks why this project is in year three and shouldn’t it be done by now.

To do this, you must set expectations ahead of time. Be honest and don’t sugarcoat it. The temptation is to minimize how much effort it sounds like this will take, but don’t give in to that temptation. Customers who understand the scope of their problem tend to appreciate being told honestly how much they’re going to need to commit to the solution. If my relationship with the customer seems to allow for it, I’ll even make a little joke of it. “Just so you don’t think I’m selling you a bottle of Snake Oil that will cure all your ills overnight, here’s what this is going to take.” Or, again depending on the relationship with the customer, I just lay it on the line for them. “This will be long. This will be hard. This will absolutely be worth it.”

That said, commitment will wane, and even die out, without results. When I talk with customers about where to start their microservices journey, I tell them to start with something “Easy but Meaningful”. Start with something that’s well understood and lends itself well to an asynchronous “fire and forget” process, but make sure it’s a process that means something to the customer. If you’re lucky, you can find a pain point, a slow-running process that can easily be carved from the whole. Work with them through breaking out that one process into its own service and let them see the benefit. Then do it again. Answer questions but reinforce their learning process. Then slowly begin answering questions with “Well, what do you think we should do?” and turn their question into a discussion among peers. Commitment will build as the customer begins to see themselves as experts and as they begin to see how much easier their lives get.
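For readers who haven't worked with a “fire and forget” process, here is a minimal sketch of the shape I mean, using an in-process queue and a worker thread as a stand-in for a real message broker and a separately deployed service. Everything named here (`place_order`, `email_worker`, the message fields, the example address) is a hypothetical illustration, not part of any customer system.

```python
import queue
import threading

# Stand-in for a message broker; in practice this would be a durable queue.
outbox = queue.Queue()
sent = []  # records what the worker "sent", so the effect is observable


def email_worker():
    """Stand-in for the carved-out service: drains messages independently."""
    while True:
        message = outbox.get()
        if message is None:  # sentinel value tells the worker to stop
            break
        sent.append(f"to={message['to']} subject={message['subject']}")


def place_order(customer_email):
    """The caller enqueues the request and returns without waiting on it."""
    outbox.put({"to": customer_email, "subject": "Order received"})
    return "order accepted"


worker = threading.Thread(target=email_worker)
worker.start()

result = place_order("customer@example.com")

outbox.put(None)  # shut the worker down so the example terminates
worker.join()
```

The caller never waits on the slow step and never sees its result, which is what makes this kind of process comparatively easy to carve out of a monolith: the boundary is already a one-way message.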

Summary: The “Red Adair” Factor in Four Steps

Boiling all this down, then, we get the following four components that must be in place to succeed in The Big Rewrite.

A Critical Problem

A mandate from the CTO is not enough of a spur to succeed in this kind of undertaking, nor is the desire to have modern development practices. You need a problem that is roughly equal in scope to the difficulty and complexity of the solution. Anything less is a bad sign.

Customer Understanding the Scope of the Problem

It’s not enough to have a problem. The customer must understand that their big ball of mud is a sufficient problem to end their ability to support business processes. It’s not enough to recognize a problem. The customer needs to recognize the severity of the problem.

Committing to the Solution

Claiming process or procedural reasons why necessary rewrite activities can’t be performed, e.g., having to run everything through a legal department or IT policies that cannot be updated to reflect changing needs, is generally a sign of not committing to fixing a problem. So is the ubiquitous claim of not being able to rewrite systems because of the amount of effort put towards supporting the old system. Committing to the solution means doing the work because you know that no matter how hard the work gets, not doing the work will be worse.

Customer Education

It’s not enough to provide expertise. Customer commitment will strengthen and reinforce itself as the customer becomes more skilled in service-oriented architecture, development, and devops. It’s as important to show the customer how to build microservices as it is to build them. The old saw about teaching a person how to fish needs to be your North Star here.
