Congratulations! You Blew Up Production: 3 Steps to Fail Like a Pro

A few weeks ago I had to cancel my weekly meeting with the Not Another Course community because of a critical incident at my job. One that I caused.

I made a massive update to one of our internal libraries which caused some critical forms not to submit, preventing users from adding items to their cart on our e-commerce site. Oops.

This wasn't my first big blow up.

It won't be my last.

The bug got fixed fairly quickly but I was still embarrassed. I documented the action items to prevent it in the future and will have a meeting to discuss it with my team. I'm also the manager on this team. Awkward.

I know that many of you reading this are either working as developers or searching for your first role. This is going to happen to you too. In fact, it should happen. If you're doing anything remotely interesting or moving quickly, you're likely to break stuff.

The difference between a junior and a senior dev is NOT that the senior never causes critical incidents. It's how they handle them.

Here are 3 ways you can tackle your next blow up like a damn pro.

#1 Revert that shit with Git

Too often, developers choose to fix a critical bug without proper investigation.

They just want a quick solution, only to end up creating more bugs on top of the one they introduced.

Instead, developers should focus on minimizing risk.

Reverting a change that caused a massive defect is the simplest solution. Luckily, this can be trivial with Git.

Git's revert command is a handy tool to quickly roll back harmful changes. It creates a new commit that undoes the changes made in a previous commit. Here's how to use it:

  1. Identify the harmful commit: Use git log to find the commit hash of the problematic change.
  2. Revert the commit: Use git revert <commit-hash> to undo the changes made in that commit. Your history stays intact because the undo is recorded as a new commit.
  3. Push the changes: After confirming that everything is working as expected, push the new commit to the remote repository and deploy the changes to production.
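
The three steps above can be sketched end to end in a throwaway repository. This is a minimal sketch, not my team's actual setup: the file name, commit messages and user details are made up for the demo, and the push in step 3 is left as a comment because there is no remote here.

```shell
#!/bin/sh
set -eu

# Throwaway repository so nothing real is touched (hypothetical demo setup).
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "dev@example.com"
git config user.name "Demo Dev"

# A known-good commit, followed by the change that breaks things.
echo "submit form" > cart.txt
git add cart.txt
git commit -q -m "Working cart form"

echo "broken form" > cart.txt
git commit -q -am "Massive library update"

# Step 1: find the hash of the problematic commit.
git log --oneline
bad_commit=$(git rev-parse HEAD)

# Step 2: create a new commit that undoes it; the bad commit stays in history.
git revert --no-edit "$bad_commit"

# Step 3 (not run here): push the revert commit and deploy,
# for example `git push origin main` if your branch is called main.
cat cart.txt
```

The file is back to its working contents, and git log now shows three commits: the good one, the bad one and the revert.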

If your team uses a deployment pipeline that contains versions of previous deployments, rolling back becomes even easier. Understanding your team's deployment process will save you a lot of confusion when you're under pressure.

#2 Debug

Good debugging skills can save the day.

Bad debugging skills can be the difference between a 1-hour fix and spending days on a problem.

But a legendary debugger can solve problems they aren't familiar with at all.

Sometimes a rollback is not on the table and you must come up with a solution. This requires a solid debugging process:

  1. Replicate the bug and document the exact steps to reproduce it.
  2. Identify the file(s) that are potential suspects.
  3. Check observability tools and error logs (look up New Relic and Datadog if you're not familiar with that term).
  4. If possible, use a debugger instead of littering the code with console logs.
  5. Verify the fix and have another dev review it.
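
For step 2, one way to narrow down the suspects is to ask Git which files changed since the last release you know was healthy. Treat this as a rough sketch under assumptions: the tag name last-good and the file names are hypothetical, and your team may mark releases differently.

```shell
#!/bin/sh
set -eu

# Throwaway repository for the demo (hypothetical files and commits).
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "dev@example.com"
git config user.name "Demo Dev"

echo "v1" > form.js
echo "v1" > cart.js
echo "v1" > footer.js
git add .
git commit -q -m "Last known-good release"
git tag last-good   # hypothetical tag marking the last healthy deploy

echo "v2" > form.js
git commit -q -am "Update form library"
echo "v2" > cart.js
git commit -q -am "Refactor cart"

# Files touched since the last good release: the first suspects to inspect.
suspects=$(git diff --name-only last-good HEAD)
echo "$suspects"
```

Here footer.js never shows up in the list, so you can ignore it and spend your time on the files that actually changed.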

Here's a popular article I wrote on debugging using the REST method: https://www.yourcodecoach.com/blog/debugging-javascript-code-using-the-rest-method

#3 Just don't write buggy code in the first place

Prevention is better than the cure. Conducting thorough code reviews is a good way to prevent bugs from reaching the production server in the first place. Here's how to make your code reviews more effective:

  1. Define a clear process: Ensure everyone knows what to look for in a review. This process may include checking for adherence to coding standards, proper error handling, and running the code locally to ensure it works as expected.
  2. Use code quality tools: Linters, unit tests and e2e tests can be automatically triggered on a commit. These can be complicated to set up but yield a lot of benefit and confidence when you ship code.
  3. Encourage open communication: Code reviews should be a time for learning and improvement. When they become a chore or just a formality, they lose their power. A junior should be encouraged to ask questions of the senior and vice versa. No one is above feedback.
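
To make point 2 concrete, here is a minimal sketch of a check that runs automatically on every commit, using a Git pre-commit hook. The grep for leftover debugger statements is only a stand-in I picked for the demo; in a real project the hook would call whatever linter and test commands your team already uses.

```shell
#!/bin/sh
set -eu

# Throwaway repository for the demo (hypothetical files and commits).
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "dev@example.com"
git config user.name "Demo Dev"

# Install a pre-commit hook. Git runs it before each commit and
# aborts the commit if the hook exits with a non-zero status.
mkdir -p .git/hooks
cat > .git/hooks/pre-commit <<'EOF'
#!/bin/sh
# Stand-in check: block staged changes that add a `debugger` statement.
if git diff --cached | grep -q '^+.*debugger'; then
  echo "pre-commit: remove debugger statements before committing" >&2
  exit 1
fi
EOF
chmod +x .git/hooks/pre-commit

# First attempt: the staged file still contains a debugger statement.
echo "debugger; submitForm();" > form.js
git add form.js
if git commit -q -m "Add form"; then
  result="committed"
else
  result="blocked"
fi
echo "first attempt: $result"

# Second attempt: the statement is gone, so the hook lets the commit through.
echo "submitForm();" > form.js
git add form.js
git commit -q -m "Add form"
echo "second attempt: committed"
```

Hooks in .git/hooks are local to your machine, so teams usually also run the same checks in their CI pipeline.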

You're Going to Make Mistakes

Want to just get by as a developer? Not concerned with promotions or accelerating your knowledge?

All good.

Keep taking on assignments you know you can finish that pose little risk. You probably won't need any of the advice I've shared here.

Big risk = big reward.

Introducing a new process or feature, or making large improvements, comes with the chance that you break stuff. It's a matter of when, not if.

Having the right skills and tools at hand means that you can limit the impact of these blow ups.

Want to join me and over 100 other devs tackling the stuff that you WILL encounter on the job?

Check out the Not Another Course community.

Helcio André

Software Engineer | JavaScript, Typescript, Rails, React, Redux, Angular, Tailwindcss, Figma. Learning and applying cool stuff on the web.

1 year ago

This one is to save and come reading again and again, thank you this makes me feel like going out there without the fear of breaking stuff in order to get better at whatever I may put my mind into mastering or become really good.
