The Hubla CTO Diaries #3 - The one about the war room

David Reis

Enabling better decisions (credit, anti-fraud, etc.)

Published: March 7, 2022

A lot has happened in the two weeks since my last report. First came Carnaval, when all of Brazil stops for five days to party or relax. The old joke goes that Brazilians only start working after Carnaval, so I guess we have no excuse any longer :-)

Second, I had the chance to put a war room together.

War Room from the movie Dr Strangelove. A B&W picture of a war room with a large round table with many generals around it.

The quintessential war room, from the classic nuclear end-of-the-world comedy Dr. Strangelove. Unfortunately, the movie is more relevant now than at any time in the last few decades.

What is a war room anyway?

A war room is a tool we use to deal with a serious situation, one that the company's normal processes are not enough to handle. Another name for the same thing is code yellow. When a war room is created, a small team is mobilized to work full time on the burning issue until the situation is under control.

Hubla's Reliability War Room

We created this war room to deal with Hubla's reliability problem. Today we release code daily, but our end-to-end test coverage is not enough to keep serious bugs out of production. When a bug like that gets through, our users can have a really hard time until we fix it.

The war room's strategy is to push code to production weekly rather than daily, and only after it has been thoroughly tested by hand. This goes against engineering best practice, which is to release often and automatically, but we are doing it deliberately. By paying a large price in cycle time, we get a large and immediate gain in production reliability.

While the war room will have a few people working on it (writing test scripts, automating them into end-to-end tests, configuring releases, etc.), we appointed a single person to be responsible for the health of releases pushed to production. That person can then feel empowered to be the guardian of reliability. It's always important to have cool names for these sorts of roles, so we went with release sheriff.

The sheriff will manually validate the weekly release and push it to production. They also have to handle requests from people who want to create ad-hoc releases to fix a bug or to get a time-sensitive feature out sooner. We picked a Hubla long-timer who knows a lot about our systems and knows everyone in the company. That way they will be able to judge which fixes are important enough to justify an ad-hoc release, and be comfortable saying no to the rest.
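
To make the gate concrete, here is a minimal sketch of the rule described above, not the tooling we actually use. The names `ReleaseKind`, `ReleaseRequest`, and `can_ship` are hypothetical, made up for illustration: a weekly release ships once it has been manually validated, and an ad-hoc release additionally needs the sheriff's explicit approval.

```python
from dataclasses import dataclass
from enum import Enum


class ReleaseKind(Enum):
    WEEKLY = "weekly"
    AD_HOC = "ad_hoc"


@dataclass
class ReleaseRequest:
    """Hypothetical record of a release waiting at the gate."""
    kind: ReleaseKind
    manually_validated: bool        # the manual test pass was completed
    sheriff_approved: bool = False  # only consulted for ad-hoc requests


def can_ship(request: ReleaseRequest) -> bool:
    """Return True if the request passes the war room's release gate."""
    # Nothing reaches production without manual validation.
    if not request.manually_validated:
        return False
    # The scheduled weekly release needs no further sign-off.
    if request.kind is ReleaseKind.WEEKLY:
        return True
    # Ad-hoc releases also need the sheriff to say yes.
    return request.sheriff_approved
```

For example, `can_ship(ReleaseRequest(ReleaseKind.AD_HOC, manually_validated=True))` is `False` until the sheriff approves the request.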

Winding Down the War Room

Another important aspect of war rooms is that they need to end. We could end ours by declaring that we will forever do weekly manually gated releases, but that's simply not good enough. Healthy software orgs need to release fast and safely.

The way I see it, the war room is creating tech debt by changing the release strategy, so for the sake of our long-term health it also needs to pay that debt back. We made sure that would happen by clearly specifying the war room's exit criteria at the outset: it will end once we have end-to-end tests for all of Hubla's core flows and we are able to safely release daily or even hourly. That was made clear to the folks mobilized to work on the war room, as well as to their managers.
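
The exit criteria can be written down as a simple check. This is a sketch under assumptions: the flow names are placeholders (the actual list of core flows is not given here), and `daily_release_is_safe` stands in for a judgment call that a team would have to define for itself.

```python
from typing import Iterable, List


def missing_e2e_coverage(core_flows: Iterable[str],
                         flows_with_e2e_tests: Iterable[str]) -> List[str]:
    """Return the core flows that still have no end-to-end test, sorted by name."""
    return sorted(set(core_flows) - set(flows_with_e2e_tests))


def war_room_can_end(core_flows: Iterable[str],
                     flows_with_e2e_tests: Iterable[str],
                     daily_release_is_safe: bool) -> bool:
    """Both exit criteria must hold: full core-flow coverage and safe frequent releases."""
    fully_covered = not missing_e2e_coverage(core_flows, flows_with_e2e_tests)
    return fully_covered and daily_release_is_safe


# Placeholder flow names, purely illustrative.
core = ["core_flow_1", "core_flow_2", "core_flow_3"]
covered = ["core_flow_1", "core_flow_3"]
print(missing_e2e_coverage(core, covered))     # flows still lacking tests
print(war_room_can_end(core, covered, False))  # war room stays open
```

Keeping the list of uncovered flows visible gives everyone in the war room the same picture of how much work remains.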

In conclusion

It's not fun to declare a war room. I'd much prefer the boring routine of project after project. But startups move fast, and sometimes you build yourself into a situation that requires targeted action to get out of. If you notice that early enough and can put together a quick response, you can unwind a bad situation and get back to the well-lit path.

Bruno Kim Medeiros Cesar

Knowledge Interface Manager | Staff Data Engineer | I put an emoji in my name to screen bots | MSc Comp Sci | Ex-Google, ex-Amazon

3 years ago

Great write-up! I liked the exit criteria with a clear, achievable goal, and I'm wondering: was there any request for a time limit? Or, at least, an estimate on how long the war room would last? I believe that even with a goal in mind, other people may get nervous not knowing if it will last 1 week or 1 month
