The daily struggles of Data Analysts.
Data analytics might seem easy. You just pull some numbers and create graphs and that’s it, right?
Not necessarily. I shared my insights about data analysis and its connection to process analysis in my previous article and its second part, both published by Digital Hub Warsaw I Bayer, but now I want to dig a bit deeper and explore the issues we, as Data and Process Analysts, come across when trying to conduct an analysis. In the end, this comes down to why we need Data Analysts' superpowers and why the numbers cannot be crunched by the Process Owners themselves as a "side gig" on top of all their daily responsibilities.
What I will show you are the most common issues we encounter on both the process side and the human side.
The process side:
What you have to understand about corporate IT is that a process (as in, inputs, outputs, outcomes and whatever happens in between) is rarely owned by just one team. Often, multiple teams provide inputs, and other teams (plural) use the outcomes as inputs to their processes (plural again). While it might make sense from the corporate standpoint, it makes things just a tad more complicated for Data Analysts.
The first wall you hit when you enter the corporate IT world is the sheer number of applications / systems / databases used to track processes. Usually, one process is tracked in multiple systems (as it is owned by different teams at different stages), and those systems are not necessarily connected to each other...
Why is this an issue?
There are a few factors:
One: Any split process may (and will in the end) suffer from data disconnect if the databases are not “talking” to one another. Someone will fail to put the data everywhere, because they mainly use one system, and they do not think that the other systems are needed. This will, eventually, create a mess. But hey, Data Analysts thrive in messy environments, right? This is one of our superpowers.
Two: Any process tracked in multiple systems means that you need to connect to multiple data sources, which creates more overhead – simply getting connected might be an issue. Every database has its owner, and every one has its own access restrictions and privacy rules. And the owners usually guard their data as if it were a national treasure (even if you just want read access). Don’t get me wrong, some data needs protection, for example anything that falls under GDPR, but let’s be honest, the number of incidents per month is NOT very sensitive information. Analysts have to jump through a lot of hoops to get the access they need. Sometimes this proves to be near-impossible.
This could be solved by a centralized team of Data Engineers, who would safeguard data access and build a complex yet coherent infrastructure of databases that actually sync with one another... but this is just one of those dreams Data Analysts have... A lot of the time, a good Data Engineer is like a mythical creature that everyone knows about but no one has seen in real life.
Three: Mapping the data is often an issue because, contrary to logic, there is not always an obvious connection between the data in various systems (again, owned by different teams with different priorities). This happens even if the data is supposed to describe the same IT asset, for example. In simpler words – the names of a given IT asset might differ between databases. What do you do then?
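One possible first step, shown here only as a minimal sketch: normalize the asset names from both systems to a common key and see what lines up. The normalization rules (lowercase, drop the domain suffix, strip separators), the function names and the sample asset names are all assumptions made for illustration, so real systems will need rules agreed with the Data Owners.

```python
import re


def normalize_asset_name(name: str) -> str:
    """Reduce an asset name to a comparable key (hypothetical rules)."""
    # Keep only the host part before the first dot, in lowercase.
    host = name.strip().lower().split(".")[0]
    # Drop separators such as "-", "_" and spaces.
    return re.sub(r"[^a-z0-9]", "", host)


def match_assets(system_a: list[str], system_b: list[str]):
    """Pair names from system A with names from system B via the normalized key."""
    index_b = {normalize_asset_name(n): n for n in system_b}
    matched: dict[str, str] = {}
    unmatched: list[str] = []
    for name in system_a:
        key = normalize_asset_name(name)
        if key in index_b:
            matched[name] = index_b[key]
        else:
            unmatched.append(name)
    return matched, unmatched


# Made-up sample names, not taken from any real database.
cmdb_names = ["SRV-APP-001", "srv_db_002.corp.example.com", "LAPTOP-77"]
ticketing_names = ["srv-app-001.corp.example.com", "SRV DB 002", "printer-3"]

matched, unmatched = match_assets(cmdb_names, ticketing_names)
print(matched)
print(unmatched)
```

Whatever ends up in the unmatched list still needs a human decision; the sketch only shrinks the pile that has to be reviewed by hand.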
The second issue appears when you try to find out what KPIs the organization has defined. Sometimes it turns out there are none, or they are defined in a way that makes it impossible to measure anything.
Example: “Keep the timeliness of resolving tickets on a good level”. This is not a real-life example, but I have seen KPIs defined this way.
Why is this KPI not good?
“Good level” can mean 80% to me and 95% to someone else. Timeliness, as I learned, is also not a self-explanatory term. I have been told that “30 days after deadline is NOT LATE YET” – so I should treat it as timely. I disagreed, but in the end I am just a simple Data Analyst, and the Process Owner owns the KPI.
There should be no wiggle room in KPIs that are defined for the whole company, or even for a division or a team. Everyone must understand the KPIs in the same way. And every KPI must have a target.
Best case scenario – write them down as mathematical equations. There is nothing to interpret in KPI1 = X / Y, where X = the number of (whatever you measure) completed within the timeline given by the process and Y = the number of all (whatever you measure). Target: KPI1 >= 95% (0.95).
This will give you pretty much what the “Example” wanted to say, but in a way that leaves no one room to argue or to interpret the KPI their own way.
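To show how little room for interpretation such a definition leaves, here is a minimal sketch of KPI1 = X / Y applied to tickets. The Ticket fields, the sample dates and the choice to count still-open tickets in Y (and as not on time) are assumptions made for this illustration; the Process Owner would have to confirm them.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional

TARGET = 0.95  # target taken from the KPI definition: KPI1 >= 95%


@dataclass
class Ticket:
    deadline: date
    resolved_on: Optional[date]  # None means the ticket is still open


def kpi1(tickets: list[Ticket]) -> float:
    """KPI1 = X / Y: share of all tickets resolved on or before their deadline."""
    if not tickets:
        raise ValueError("KPI1 is undefined when there are no tickets")
    # X: tickets resolved within the timeline given by the process.
    on_time = sum(
        1 for t in tickets
        if t.resolved_on is not None and t.resolved_on <= t.deadline
    )
    # Y: all tickets, including open ones (an assumption of this sketch).
    return on_time / len(tickets)


# Made-up sample data for illustration only.
tickets = [
    Ticket(date(2024, 1, 10), date(2024, 1, 9)),   # on time
    Ticket(date(2024, 1, 10), date(2024, 1, 10)),  # on time (on the deadline)
    Ticket(date(2024, 1, 10), date(2024, 2, 9)),   # 30 days late
    Ticket(date(2024, 1, 10), date(2024, 1, 5)),   # on time
]

value = kpi1(tickets)
meets_target = value >= TARGET
print(f"KPI1 = {value:.2%}, target met: {meets_target}")
```

Note that a ticket resolved 30 days after its deadline simply lands outside X here; there is nothing left to debate.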
There are, of course, more issues that we struggle with… but let’s just not talk about them. If you solve the ones above, we will thank you!
Once you have realized all this is more complex than you thought, you might actually call a Data Analyst to fix it for you. You figure out that you need help, because it has suddenly become more complicated than you imagined and it is no longer just a simple COUNT or VLOOKUP in Excel. Now you need the superpowers only Data Analysts possess.
Can we, as Data Analysts, work around all this?
Yes. This is why you hired us. We create order where there is only chaos, we make it clear where clarity is non-existent. We do it day-in, day-out and usually, we do not complain that much ;). We are magic.
Now, the human side:
Some say there are no stupid questions. That may be true... These are, however, my favorites.
NOTE: I assume the calculations and logic of the data analysis are correct, as this is the first thing I would check.
“WHY is my report not in line with the other report that another team created?”
If all calculations are okay, but the other team reports from the other system this process is tracked in, then… well, most likely, we cannot help you. We will of course cross-check it and try to find the root cause. From my experience, most of the time, the answer in the end comes down to: "You created the mess yourself (or your Data Owner / Data Engineer did it for you)" :) You have multiple sources of information for this process, the other team uses the other data source, and… surprise-surprise – they are NOT synced. The Data Analyst is NOT to blame for the data quality. Data quality is on the Data Owner, so please talk to them. Don’t shoot the messenger. My former manager had a saying - "s*** in, s*** out", which loosely translates to “if your data is not clean, your report will tell you nothing”.
“WHY is it all red?!”
You know how you mark everything that is not in line with the process standards / KPIs with the color red? Managers tend to react to red like bulls do – they get frustrated / irritated. Some (I hope it's most of them) go and ask the Process Owners and their teams the above question. They do it in a calm manner and motivate the team to do better next time. The rest get angry with the poor analyst. Once again – do NOT shoot the messenger. We just grouped / analyzed the data for you. The data is telling a story, not us. If you do not like the narrative, go yell at the data ;).
“Can you make it green?”
The premise being: red – wrong, green – good. The short answer is: “No, we cannot. You can.” You are the Manager / Data Owner / Process Owner – you have the power to change the process, motivate your team or clean up the data – whichever is needed. We can help you analyze what to do so we go from red to green as quickly as possible, but we should not manipulate the data to make anyone look better. We have a code.
And now for the "smart" questions that are actionable:
“How do I make it better?”
You guessed it – this is the one that opens the doors to process analysis and the PDCA cycle. This is where we spread our wings and fly. This is what most of us like and where we thrive. This is where root cause analysis (RCA) comes into play. I often do the RCA just for fun, to be prepared to answer a smart question like this. And any Process Owner who asks me this question quickly becomes one of my favorites, even if they were very demanding and fussy when we created the dashboards. This tells me they know why I am here and they want me to help them.
“How do I make my report speak the same language as the other team's report?”
This can be translated to – “how do I fix data disconnect?” – well, either get rid of some systems and integrate all data / info about a process in one system, or at least create an API so they can update each other. The last thing you can do, which is both the quickest and easiest to implement and the least likely to fix the issue permanently, is to implement unique identifiers for the data across the various databases, so we can at least compare the records, identify the disconnect and then take action. Manual data correction, however, is not perfect and it is a very tiresome process, especially when we are talking about live data that changes every second.
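As a minimal sketch of that last option, assume both systems already store a shared unique identifier for each record. With that in place, finding the disconnect becomes a plain comparison. The function name, the record fields and the "INC-…" identifiers below are hypothetical and only illustrate the idea.

```python
def find_disconnects(
    system_a: dict[str, dict],
    system_b: dict[str, dict],
    fields: list[str],
):
    """Compare two systems keyed by a shared unique identifier."""
    # Records that exist in only one of the two systems.
    only_in_a = sorted(system_a.keys() - system_b.keys())
    only_in_b = sorted(system_b.keys() - system_a.keys())

    # Records present in both systems whose compared fields disagree.
    mismatches: dict[str, dict] = {}
    for key in sorted(system_a.keys() & system_b.keys()):
        diff = {
            f: (system_a[key].get(f), system_b[key].get(f))
            for f in fields
            if system_a[key].get(f) != system_b[key].get(f)
        }
        if diff:
            mismatches[key] = diff
    return only_in_a, only_in_b, mismatches


# Made-up sample records for illustration only.
ticketing = {
    "INC-1": {"status": "closed", "owner": "team-a"},
    "INC-2": {"status": "open", "owner": "team-b"},
    "INC-3": {"status": "open", "owner": "team-a"},
}
reporting = {
    "INC-1": {"status": "closed", "owner": "team-a"},
    "INC-2": {"status": "closed", "owner": "team-b"},
    "INC-4": {"status": "open", "owner": "team-c"},
}

only_in_a, only_in_b, mismatches = find_disconnects(
    ticketing, reporting, fields=["status", "owner"]
)
print(only_in_a, only_in_b, mismatches)
```

This only surfaces the differences; deciding which system is right, and correcting it, remains the manual and tiresome part.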
And, yes, I know corporate IT is trickier than that and you probably can't fix all of it by yourself. But, please, just try...
The outcome of what you do with the analysis we provide is up to you. Your attitude will determine whether you get angry at the analyst (or the data) or whether you follow our direction, start asking the right questions and maybe then make your process green.
Do you want to learn more about data analytics, BI or process analysis? Let me know in the comments.
New Functions and General Services Operations Manager at Bayer sp. z o.o.
4 months ago: Super interesting!!!
Senior Expert Visual Designer developing in Data Visualization
4 months ago: Ewa Krupa thanks for these insights! I enjoyed this article, I'd like to learn more :)
Expert in Enterprise Coaching, Change Agent, Experienced Leader
4 months ago: I would like to read more :-)