Platform Engineering Landmines - Part 1

(I’d like to thank Yev Spektor, Brock Reiman and Claudio Masolo for contributing to the article discussion)

As engineering leaders contemplate internal developer platforms (IDPs), many are unaware of the organizational and cultural "fine print" that comes with them.

After 15 years building platform teams and interviewing countless peers about developer platforms, I decided to share a few stories about the patterns and themes behind the unforeseen costs of IDPs. Centralizing platform functions brings about a new level of organizational dynamics that most orgs aren’t equipped to tackle from the get-go. The stories I’ll share are based on interviews with leaders who faced those challenges in real time and were willing to share them with the broader community.

Few industry leaders seem to talk about these realities. The IDP hype suggests it's a silver bullet - an inevitable evolution for infrastructure teams. But in my experience, the transition is far more complex.

My hope is that sharing these stories will help you prepare tech orgs for the cultural and organizational costs so you can pave a smoother path to internal platform success.

“I was hired to lead”

In one of the interviews, a CTO from a leading mid-sized tech company with over 500 engineers shared the story that colored their platform engineering journey the most. In preparation for scale, they prioritized recruiting technical visionaries with proven track records from past companies. However, in their eagerness to scale, they overlooked a critical factor: the need for empathy in technical design. Whether the platform team accounted for and accepted the constraints and realities of the application developers proved pivotal to the platform's success and failure.

In one interview with Alex, an engineering manager in the infrastructure-platform group at the time, he described the buzz in the platform team after they hired a visionary tech lead who had come from a hot container startup. The new lead's big ideas and confidence about the future of cloud computing got the platform engineers excited, and Alex admitted sharing in their enthusiasm at first.

Over the next several weeks, Alex had multiple conversations with the new tech lead, especially about the tech lead’s view that the product teams should fully commit to adopting the platform team's tools and approaches. 'It'd be better if they fully committed to using them,' he would say, and he focused a lot on the politics and on how the product teams were stuck in their old ways of doing things.

Over the following months, tensions built up between the tech lead's stance and the product teams pushing back. The product teams highlighted both technical and cultural constraints that didn't align well with the tech lead's vision. The developers just didn’t work that way, and said they didn’t have the mental capacity to both build their product quickly and reshape how they did development. But the tech lead insisted, convinced that their resistance was merely an unwillingness to do the work and learn something new. Eventually the CTO caved and mandated adoption, forcing the product engineering groups to use the new platform for all new services.

After one too many fiery meetings, the product org leadership got involved and made their stance clear: if the tech lead did not budge, the product groups would not adopt the platform and would start looking at self-funding their own platform efforts. At this stage the platform tech lead was let go; his refusal to adjust his worldview and listen to the product teams was the final straw. At that point, Alex wasn't sure if the platform org would survive.

In a last-ditch effort, Alex and the platform team formed a 20-person strike team that embedded directly with one of the product groups. The product groups’ condition was that the platform team would join their daily rituals, co-design tools fitting their workflows, and align as mutually invested partners. But the decision came with a price.

The platform team's morale suffered, as those working on the existing platform felt left out from new design decisions. And those focusing on the new platform felt the rest of their teammates were not aware of or embracing the future needs of the company.

As Alex reflects now, bridging disparate worlds is messy and difficult, but the hard-won empathy and insight gained made all the difference. Trust was reestablished between the teams, setting an example within the company of how the platform team cares and empathizes with its users, going the extra mile. However, it also set a dangerous precedent. Leadership had to manage expectations and make clear to all teams that a similar embed would not be possible at that scale again.

It’s easy for platform teams to lose sight of the problems app developers face day-to-day. By immersing themselves in the product team's reality, Alex believes they not only built tools that empowered developers but also learned to lead with compassion. The lessons were painful but clear - vision must align with reality, and empathy must guide innovation.

You may own it, but they dictate it

One head of platform shared the following story, which highlights the gap between perceived and real ownership of a platform SDK, and how even well-intentioned customization can bring unintended fragmentation and complexity across technical platforms and organizations.

Company Profile: 450 backend engineers out of an 800 person engineering org, with close to 100 infrastructure engineers

Their platform team had created Cement, a Go SDK that wraps platform capabilities like service discovery, dynamic configuration and secret management for application developers. One app team that was using Cement, called Project Orion, diverged from it by forking Cement into a tailored version called Oregano to close gaps for their specific workflows and service development approach. Since the Project Orion team was small, they couldn't afford to re-integrate their changes into Cement, so they continued building on the fork and diverging from the original.

Over time, the divergence grew, both as time passed and as the Project Orion team itself expanded. This meant whenever the Platform team wanted to make updates to their tech or SDK, they had to push the Project Orion team to duplicate those changes in their Oregano fork as well, otherwise their SDK would stop working.

Other product teams recognized that Project Orion was actively investing in Oregano with a developer lens that made sense to them, leading some to start using Oregano instead of Cement. This highlighted that investing in the original wouldn't provide immediate value to Project Orion or other Oregano teams unless they returned to using it. So in addition to the technical challenges of supporting the features those teams needed, there was now the organizational pressure of convincing them to return to Cement. That pressure couldn’t just be resolved with wrist slapping, because the teams were actively investing in their SDK fork, and for them to re-align around Cement, they would need the platform team to own those changes and prioritize investment in the mainline SDK. That meant either roadmap changes that would leave someone else unhappy, or renegotiating headcount, which would be a net increase for the platform org.

This experience demonstrated the complexity that wrapping and abstracting company-specific tools and ways of working can lead to, both technically and organizationally.

Use Boring Tech, For Everyone and Everything Else

The Head of Infrastructure of another company with a similar profile recalled an insightful architecture leadership meeting that revealed typical challenges in aligning technology decisions.

The setting was a conference room, with the chief architects from each product group gathered together, along with the CTO and Head of Infrastructure. As part of the regular cadence of cross-company alignment, this week's focus was on adopting 'boring tech' standards, influenced by what at the time was a recent blog post.

At first, it seemed there was agreement on standardizing tried-and-true technologies versus chasing cutting-edge yet unproven ones. However, as the conversation progressed, subtle signs of misalignment emerged.

The catalyst came when one group proposed investigating Istio to potentially replace the existing service frameworks. Sensing that there was more at hand, the Head of Infrastructure began asking each architect about the tech choices they were facing in their orgs.

For example, one group was considering Kotlin instead of Go for server-side development, while another was exploring Riak, a technology whose founding company later went bankrupt. Through this line of questioning, it became apparent that each organization really wanted the other orgs to adopt the boring tech that it was already using or didn’t have to use, while getting an exception for the one use case that it believed required the cutting-edge wonder tech. Boring tech was great when you weren’t the one using it.

This seemingly uncontroversial stance turned into heated debates and prioritization questions that took a while to resolve, even though everyone had the best intentions.

Conclusion

Centralizing platforms brings unintended costs as product orgs are brought together under a new paradigm. To pave a smoother path, leaders must get buy-in across the organization from the outset to align current and future users of the platform, especially from the most dominant product group in the company. Preventing needless customization and fragmentation is also key to easing the transition, in both re-learning and re-implementation costs.

Anticipating these organizational and cultural challenges upfront allows leaders to staff teams to absorb both the increase in engineering work and the need for continuous cross-org prioritization and alignment. Choosing to adopt the platform engineering model after having focused on the infrastructure-org model will require the org to become a product org in itself, with a new mindset and a new team composition.

Making a proactive effort and ensuring a clear vision that aligns with reality helps platform initiatives succeed amidst the dynamics that will inevitably challenge them.

Stay tuned for part 2 of this series!

Follow us to get new articles sent to your inbox. We’ll unpack more real-world perspectives to help you lead platform initiatives at your company with insight and care.
