Keeping Support in Work, Part 1

I’ve been working in various aspects of Software Support for over 20 years now (21 years on April 18th 2016, if you’re counting, which is a scary thought!) and over that time I’ve seen lots of repeated software errors – things which come around regularly and that are guaranteed to keep the Support folks busy, but which could easily be avoided in development. 

With a bit more thought and care, these types of errors could be avoided, reducing outages and increasing customer satisfaction, as well as reducing the cost of Support and letting development teams focus on new products and features instead of burning man-days on sustaining engineering.

Below are some of my 'favourites' - simple errors that can take ages to identify and fix...

Using INT data types for a Key

Using a signed 32-bit INT as a Key field looks quite sensible initially… After all it will allow roughly 2 billion IDs.

Unfortunately you’re probably going to design a key scheme that increments by 2 or even 4 each time, so that 2 billion could easily be only 500 million or so. How long do you think it will take an enterprise-class data system to create that number of records? Remember that these systems will last for years and you can’t re-use the key *ever*….
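To put rough numbers on that, here is a minimal sketch of the arithmetic. It assumes a signed 32-bit key and a sequence that starts at the step value; the function names and the rate of one million new records a day are purely illustrative assumptions, not figures from any real system.

```python
# Back-of-envelope estimate of how long a signed 32-bit key lasts.
# All names and the creation rate are hypothetical, for illustration only.

INT32_MAX = 2**31 - 1  # 2,147,483,647: largest value of a signed 32-bit INT


def usable_ids(max_value: int, step: int) -> int:
    """Count the keys available when the sequence runs step, 2*step, 3*step, ..."""
    return max_value // step


def years_until_exhausted(max_value: int, step: int, records_per_day: int) -> float:
    """Years before the key space runs out at a constant creation rate."""
    return usable_ids(max_value, step) / records_per_day / 365


if __name__ == "__main__":
    print(usable_ids(INT32_MAX, 1))  # 2147483647
    print(usable_ids(INT32_MAX, 4))  # 536870911
    # Assumed rate: one million new records per day.
    print(round(years_until_exhausted(INT32_MAX, 4, 1_000_000), 2))
```

At that assumed rate, a key that increments by 4 is exhausted in roughly a year and a half, which is well inside the lifetime of most enterprise systems.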

Combination unique keys

You need a unique identifier for a record that is composed of two parts. Fortunately each part already has a unique ID, so all you need to do is concatenate those to get the unique ID for the combined record…right?

Nope! Using this approach the records with IDs 217 and 65 generate a key of 21765. Unfortunately that is ALSO the key for the combination of 21 and 765. I once spent 6 months explaining and demonstrating this to a supplier…
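The collision is easy to reproduce. The sketch below uses hypothetical helper names and shows the naive concatenation next to two common alternatives: a delimiter, which is only safe when the parts can never contain the delimiter themselves, and keeping the two parts as a pair, which is what a composite key in a database does.

```python
# Demonstrates why concatenating two IDs does not give a unique key.
# Helper names are illustrative, not taken from any real system.


def concat_key(part_a: int, part_b: int) -> str:
    """Naive approach: glue the two IDs together."""
    return f"{part_a}{part_b}"


def delimited_key(part_a: int, part_b: int, sep: str = ":") -> str:
    """Safer: a separator that cannot appear inside an integer ID."""
    return f"{part_a}{sep}{part_b}"


def composite_key(part_a: int, part_b: int) -> tuple[int, int]:
    """Safest: keep both parts and treat the pair as the key."""
    return (part_a, part_b)


if __name__ == "__main__":
    print(concat_key(217, 65), concat_key(21, 765))        # 21765 21765
    print(delimited_key(217, 65), delimited_key(21, 765))  # 217:65 21:765
    print(composite_key(217, 65) == composite_key(21, 765))  # False
```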

Obfuscating Error Messages

We don’t want scary stack-traces appearing in logs where the customer might see them; let’s just catch the exception and substitute with a generic error…

I worked for years with a middleware system that reported any unexpected error as ‘Status 8’, which translated as ‘the developers expected 7 different failure modes and dumped anything else to a catch-all’. In Support we used to say Status 8 meant ‘Something went wrong, but we’re not telling you what!’. Debugging these messages took hours of investigation – usually just to get to the real error, after which we could fix it in a few minutes!
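One way to avoid a 'Status 8' is to log the original exception in full and chain it to whatever is raised for the caller, so the customer-facing message can stay tidy while Support still gets the real cause. The sketch below is only an illustration: `MiddlewareError` and `parse_quantity` are made-up names, not part of the system described above.

```python
# Wrap an error without hiding it: log the full traceback and keep the
# original exception attached. All names here are hypothetical.
import logging

logger = logging.getLogger("middleware")


class MiddlewareError(Exception):
    """Application-level error that still carries the underlying cause."""


def parse_quantity(raw: str) -> int:
    try:
        return int(raw)
    except ValueError as exc:
        # logger.exception records the message plus the full traceback.
        logger.exception("Could not parse quantity %r", raw)
        # 'from exc' keeps the original error available as __cause__.
        raise MiddlewareError(f"Invalid quantity {raw!r}: {exc}") from exc


if __name__ == "__main__":
    try:
        parse_quantity("abc")
    except MiddlewareError as err:
        print(err)                   # message includes the offending input
        print(repr(err.__cause__))   # the original ValueError is preserved
```

The design choice is simply to add context rather than replace it: the log keeps the stack trace, and the wrapped error still says what actually failed.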

Next time I'll mention some QA and Project problems that cause similar issues...

Happy Thanksgiving everyone!

Jonathan Chivers

Helping companies transition to AI- and ML-powered, service-centric Operations

9 years ago

And then there is the all-purpose 'Java null pointer exception' error

Daniel Hall

Software Engineer at SHD

9 years ago

I just made the move from support to software engineering. I will somewhat miss the customer interaction, but it's exactly what you've written that pushed me over. I'm looking forward to a happy support team that can pull meaningful error messages out of logs... it's really not that hard to develop :-)


More articles by Tim Wolfe-Barry
