Subscription Engineer #15: Monitoring is a priority


One of my favorite TV shows is House, MD, because I like the way the doctors track down an illness through all kinds of tests and analytical work. My work in digital is sometimes similar, because the same symptoms can point to different illnesses. That is why it's very important to have the right tools in place. In my experience, that means analytical systems working closely with technical monitoring. Let's review an example.

One day you look at your analytics and notice that you are getting fewer subscribers. You ask your dev team to check whether the website works correctly, but they don't see anything suspicious right away, so you have to start digging.

The first thing to do is look at your conversion funnel; if you use Mixpanel, that is easy. The funnel shows that something is going wrong on the payment page, because fewer people reach the thank you page.

You still need more information, so you add events to each meaningful step of the checkout process - filling in the form, address autocomplete, tax calculation, discount calculation, address verification and transaction completion. After a day of analysis you notice a big drop after address verification, so you ask your team to check what's wrong.
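To make the "find the big drop" step concrete, here is a minimal sketch in Python. It assumes you can export a count of users per checkout event from your analytics tool; the event names and the numbers are made up for illustration and are not Mixpanel output.

```python
from typing import Dict, List, Tuple

# Hypothetical event names, one per checkout step listed above.
CHECKOUT_STEPS: List[str] = [
    "form_filled",
    "address_autocomplete",
    "tax_calculation",
    "discount_calculation",
    "address_verification",
    "transaction_completed",
]


def step_conversion(counts: Dict[str, int], steps: List[str]) -> List[Tuple[str, str, float]]:
    """Return (from_step, to_step, conversion_rate) for each adjacent pair of steps."""
    rates = []
    for prev, curr in zip(steps, steps[1:]):
        prev_count = counts.get(prev, 0)
        # Avoid division by zero when nobody reached the previous step.
        rate = counts.get(curr, 0) / prev_count if prev_count else 0.0
        rates.append((prev, curr, rate))
    return rates


def biggest_drop(counts: Dict[str, int], steps: List[str]) -> Tuple[str, str, float]:
    """Return the adjacent pair of steps with the lowest conversion rate."""
    return min(step_conversion(counts, steps), key=lambda r: r[2])


# Made-up numbers for illustration only.
example_counts = {
    "form_filled": 1000,
    "address_autocomplete": 950,
    "tax_calculation": 930,
    "discount_calculation": 920,
    "address_verification": 900,
    "transaction_completed": 540,
}

worst = biggest_drop(example_counts, CHECKOUT_STEPS)
print(f"Biggest drop: {worst[0]} -> {worst[1]} ({worst[2]:.0%} continue)")
```

Comparing adjacent steps, rather than each step against the top of the funnel, points at the exact step where users get lost.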

One of your backend engineers goes into Kibana, or Grafana with Loki, checks the API requests to your address verification partner and notices that in some cases the vendor responds with nonsense instead of the expected response. Further investigation shows that it only happens when the state is California. Since California is your number one state by sales, your sales are obviously down. Your team creates a support ticket, it's resolved in a few days and you are all good.
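The same kind of slicing can be done on exported logs. The sketch below assumes each log record for a vendor call has already been parsed into a dictionary with a `state` field and a `valid_response` flag; both field names and the sample records are hypothetical, not a Kibana or Loki format.

```python
from collections import defaultdict
from typing import Dict, Iterable


def failure_rate_by_state(records: Iterable[dict]) -> Dict[str, float]:
    """Share of vendor calls per state that did not return a valid response."""
    totals: Dict[str, int] = defaultdict(int)
    failures: Dict[str, int] = defaultdict(int)
    for rec in records:
        state = rec["state"]
        totals[state] += 1
        if not rec["valid_response"]:
            failures[state] += 1
    return {state: failures[state] / totals[state] for state in totals}


# Made-up log records for illustration only.
sample_logs = [
    {"state": "CA", "valid_response": False},
    {"state": "CA", "valid_response": False},
    {"state": "CA", "valid_response": True},
    {"state": "NY", "valid_response": True},
    {"state": "TX", "valid_response": True},
    {"state": "TX", "valid_response": True},
]

rates = failure_rate_by_state(sample_logs)
# Print the states with the highest failure rate first.
for state, rate in sorted(rates.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{state}: {rate:.0%} failed")
```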

Let's imagine another scenario. You don't see any issues in your BI tools, but one of your team members sends you a video showing that something is not working. You ask him to clear his cookies and everything works fine, but you still have a feeling that something is not right. So you need a way to check your production environment regularly and catch this intermittent error.

In our case we use New Relic Synthetics. Every 15 minutes, from 3 different locations in the United States, it checks that the website is up, that customers can complete the checkout process and that all of the marketing scripts are triggered correctly on the thank you page.
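To show the idea behind the last check, here is a rough sketch in Python. It is not New Relic Synthetics code: it only verifies that the expected script URLs appear in the thank you page HTML, while a real browser-based monitor can also confirm that the scripts actually fire. The marker URLs, the page URL and the `fake_fetch` helper are assumptions made for the example.

```python
from typing import Callable, Dict, List

# Hypothetical script markers; replace with the snippets your vendors really load.
REQUIRED_MARKERS: List[str] = [
    "analytics-vendor.example/tag.js",
    "ads-vendor.example/pixel.js",
]


def check_thank_you_page(fetch: Callable[[str], str], url: str, markers: List[str]) -> Dict[str, bool]:
    """Fetch the page and report, per marker, whether it appears in the HTML."""
    html = fetch(url)
    return {marker: marker in html for marker in markers}


def missing_markers(result: Dict[str, bool]) -> List[str]:
    """List the markers that were not found, sorted for stable output."""
    return sorted(marker for marker, found in result.items() if not found)


def fake_fetch(url: str) -> str:
    """Stand-in for a real HTTP request, so the sketch runs without a network."""
    return '<html><script src="https://analytics-vendor.example/tag.js"></script></html>'


result = check_thank_you_page(fake_fetch, "https://shop.example/thank-you", REQUIRED_MARKERS)
print("Missing scripts:", missing_markers(result))
```

Passing `fetch` in as a parameter keeps the check itself independent of how the page is loaded, so the same logic can run against a real HTTP client or a headless browser.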

Once this script is up and running, at some point it fails, but now you have all kinds of technical logs and you notice some shady behavior from a script provided by a marketing vendor. You pass this to the frontend engineers and they discover that the partner recently updated their libraries, which are now incompatible with another script on your website. You disable that script, create a ticket and soon enough everything works correctly.

Over the years we have had various partners, production issues and similar stories, so whatever tools you use, they should cover the following:

  • Track individual user activity on your website, including specific clicks, form completion and navigation
  • Connect activity on the website with the communication between microservices and databases in your technical monitoring stack
  • Stay compliant with all of the privacy regulations
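One common way to connect website activity with backend communication is a shared correlation ID that travels with both the analytics event and the service logs. The sketch below assumes that approach; the field names, the salt and the sample records are hypothetical. It also hashes the user identifier before it is stored, which is only pseudonymization and does not by itself make you compliant with any regulation.

```python
import hashlib
import uuid
from typing import Dict, List


def pseudonymize(user_id: str, salt: str) -> str:
    """Replace a raw identifier with a salted SHA-256 digest (truncated for readability)."""
    return hashlib.sha256((salt + user_id).encode("utf-8")).hexdigest()[:16]


def new_correlation_id() -> str:
    """Generate an ID to attach to the frontend event and to every backend call it causes."""
    return uuid.uuid4().hex


def join_by_correlation(events: List[dict], logs: List[dict]) -> Dict[str, dict]:
    """Group frontend events and backend log lines that share a correlation ID."""
    joined: Dict[str, dict] = {}
    for event in events:
        joined.setdefault(event["correlation_id"], {"events": [], "logs": []})["events"].append(event)
    for log in logs:
        joined.setdefault(log["correlation_id"], {"events": [], "logs": []})["logs"].append(log)
    return joined


# Made-up records for illustration only.
cid = new_correlation_id()
user = pseudonymize("customer@example.com", salt="demo-salt")
events = [{"correlation_id": cid, "user": user, "name": "address_verification_started"}]
logs = [{"correlation_id": cid, "service": "address-service", "status": 502}]

trace = join_by_correlation(events, logs)[cid]
print(trace["events"][0]["name"], "->", trace["logs"][0]["service"], trace["logs"][0]["status"])
```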

And last but not least - never assume that a 3rd party vendor has a perfect logging system, so that you can just send them a customer email and they will solve the problem. In some rare cases that's true, but we have had two vendors with whom we spent many days proving that the problem was on their side. And time is always money.

Happy debugging!

Very insightful.


Andrei Rebrov we are currently bingeing House! If you can’t find a pill popping narcissistic genius, what tools do you and your team prefer to use? I’m obviously partial to mine, but love to hear about what your go to is.

Brady Morgan

Account Executive @ Accushield

2 years ago

reporting is paramount to long term business health - is your team making the switch to GA4 anytime soon? Was messing around with the backend today and it is fascinating

