My analytics toolkit
Choosing the tools to use when performing analytics can be as daunting as picking hair care products at the supermarket. An astoundingly overwhelming variety. Features upon features upon features. Fancy products with catchy buzzwords solving problems that I had no idea existed, but which sound like things I MUST resolve imminently or risk doom and destruction.
A wise, grumpy old Englishman once taught me to “do three things well, four things badly”. And, in keeping with this mantra, I used to have three tools that were my go-to tools when I was trying to plug into a business through its data. I say “used to”, because recently I had to break my rule and add a fourth to the list.
So cue some soulful, jazzy music by the Roots, and pretend you’re on The Tonight Show with Jimmy Fallon, as we run through the exciting world of my top 4 recommendations for your analytics toolkit!
1. KNIME - The Swiss Army Knife of Workflows - to prep it
If you don’t know what KNIME is, and you want to start doing more with data, Google it!
Their website says it all: “KNIME Analytics Platform is a free and open source low-code/no-code software that allows anyone, irrespective of experience levels, to make sense of data.” I’ll say it again. Free. No code. Makes sense of data. If Alteryx is George Clooney in Ocean's Eleven (all suave and sophisticated and fancy suits), KNIME is Matt Damon in Good Will Hunting (kind of nerdy, takes a while to get going, has some perception issues, but is actually incredibly smart and powerful).
It lets you build pipelines that will allow you to transform your data in the same way over and over again. You can handle exceptions. You can handle bad data (usually). The old adage of “garbage in, garbage out” can be a thing of the past, if you invest the time to build a workflow that takes your garbage and upcycles it into clean, usable data. If you want to, you can sprinkle some machine learning in there, maybe a classifier or two to speed up some manual tags that you would normally add yourself. All by dragging in some nodes and doing a bit of config.
If I wore a hat, I would take it off to the community that put their time, effort and enormous brains into contributing to KNIME. They genuinely do the world a service by contributing to this outstanding platform. There are so many nodes, for so many use cases, that have all been prepared by the network of geniuses that contribute to the platform (both from KNIME and the community). I can’t rate it highly enough.
2. SQL – in any flavour – to store your data, mash it up and curate it
MySQL / Microsoft SQL Server / some sexy serverless SQL in the cloud. It doesn’t matter, just get your data in SQL. I would personally love to see a day where Excel is replaced by a SQL server running remotely on every business machine, and everyone stores their business data in SQL tables that sync to the cloud.
If you can’t read and write SQL queries but want to improve your data literacy, learn. Now. While everything else in the world is transforming, and paradigms are shifting, and splines are reticulating, the basic syntax of SQL has not changed. In my view, when you know how to competently extract data out of a SQL database, you can surf your way through mountains of data, draw comparisons between datasets, and do so in a way that is repeatable and efficient.
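To make that concrete, here's a minimal sketch of the kind of repeatable join-and-aggregate query I mean. It uses Python's built-in sqlite3 module as a stand-in for a real SQL server, and the `sales` and `regions` tables are made up for illustration:

```python
import sqlite3

# In-memory database with two hypothetical tables: regions and sales.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE regions (region_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE sales (sale_id INTEGER PRIMARY KEY, region_id INTEGER, amount REAL);
    INSERT INTO regions VALUES (1, 'North'), (2, 'South');
    INSERT INTO sales VALUES (1, 1, 100.0), (2, 1, 250.0), (3, 2, 75.0);
""")

# The same query runs identically every time -- repeatable and efficient.
rows = conn.execute("""
    SELECT r.name, SUM(s.amount) AS total
    FROM sales s
    JOIN regions r ON r.region_id = s.region_id
    GROUP BY r.name
    ORDER BY total DESC
""").fetchall()

for name, total in rows:
    print(name, total)
# North 350.0
# South 75.0
```

Swap the data for a real table with millions of rows and the query doesn't change – that's the surfing-through-mountains-of-data part.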
The effort to learn it is worthwhile, and there are some great courses online / at universities that you can do in a day to give you a jump start. If you use tools like Power BI and Tableau, the data manipulation concepts of joining data become so much easier. You will think differently about user interfaces and data capture controls. Seeing it in the database helps you understand your data much more deeply.
Organisationally, I also think it’s important to make writing to SQL databases more widely available, and this might mean spinning up more databases or schemas to protect your organisational data. Everyone wants “one source of the truth”, and of course you don’t want every analyst in a large company coming up with a different definition of what “sales” means. But you also want to encourage creative thought, and ensure that you are not forcing controls on writing new tables until it’s been tested by seven different sub-committees, draining the life and enthusiasm from your analysts.

What if I’m an analyst working on exploratory work, or analysis for my own understanding, and I don’t know what I want when I start looking, but I know I want to structure the data to ensure an accurate result? I might spin up, edit and delete tables many times an hour in an exploratory phase of an engagement. I don’t want to be locked into a fixed format, going to change request boards, when I’m just exploring the relationships in my data. So a locked-down, read-only restricted access level isn’t going to work for me. In an age of super low-cost storage, we should make spaces like OLAP dev environments available to allow the right people to push the boundaries of what they see in the data, without compromising the important OLTP databases or the organisation-wide OLAP data repositories.
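The exploratory loop I'm describing looks something like this in practice. This is a throwaway sketch – sqlite3 standing in for a proper dev schema, with hypothetical table names – but the point is how cheap it is to create, reshape and drop tables when nobody makes you file a change request first:

```python
import sqlite3

# A scratch database standing in for a personal dev schema --
# somewhere an analyst can create, reshape and drop tables freely.
scratch = sqlite3.connect(":memory:")

# First attempt at structuring the data...
scratch.execute("CREATE TABLE exploration_v1 (customer TEXT, spend REAL)")
scratch.execute("INSERT INTO exploration_v1 VALUES ('Acme', 120.0)")

# ...turns out the shape was wrong, so throw it away and try again.
# No change request board required.
scratch.execute("DROP TABLE exploration_v1")
scratch.execute(
    "CREATE TABLE exploration_v2 (customer TEXT, month TEXT, spend REAL)")
scratch.execute("INSERT INTO exploration_v2 VALUES ('Acme', '2024-01', 120.0)")

tables = [r[0] for r in scratch.execute(
    "SELECT name FROM sqlite_master WHERE type='table'")]
print(tables)  # ['exploration_v2']
```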
3. Power BI – to see it and share it
I started as a Tableau guy. I loved the clean lines, the simple UI, the speed of analysis, and the many datapoints. But that was a fair while ago, in a time when SAP were still selling Lumira.
Now, unfortunately, the price point and integration of Power BI are too hard to walk past. If your company has Tableau, great, use it. If you are starting out, and you want to be able to see your data in a replicable format, I would go with Power BI.
Sure, you can also knock up stunning, amazing data visualisations with more datapoints in code-based tools like Python or R. But why spend the time? In my experience, you will spend 80% of your time getting the last 20% of refinement, or doing what is already automated in Power BI, when in reality most business units don’t need perfect, pretty visualisations – they need insights and they need them fast.
With any visualisation tool, my caveat would be: keep your data transformations out of your data visualisation tool. For example, if you have a method for combining weather data with your sales data, you want that data source in SQL, where you can access it at any time. You don’t want it in a workbook or workflow or workspace in which only you can see it. There is tremendous value in curating your data sources, so having them in your most robust data store, in a format that allows you to re-purpose and reuse them, is best. Given that, I don't stress too much about the data prep functionality of data visualisation tools. I do all the prep work before it gets there, because I like my SQL databases to be complete.
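As a sketch of that weather-plus-sales idea (hypothetical `sales` and `weather` tables, sqlite3 standing in for a real SQL server): the combination lives in the database as a view, so anyone with access – not just the owner of one workbook – can point their visualisation tool at it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sales   (day TEXT, amount REAL);
    CREATE TABLE weather (day TEXT, max_temp_c REAL);
    INSERT INTO sales   VALUES ('2024-01-01', 500.0), ('2024-01-02', 320.0);
    INSERT INTO weather VALUES ('2024-01-01', 38.5),  ('2024-01-02', 24.0);

    -- The transformation lives in the database, not in a BI workbook:
    CREATE VIEW sales_with_weather AS
    SELECT s.day, s.amount, w.max_temp_c
    FROM sales s
    JOIN weather w ON w.day = s.day;
""")

rows = conn.execute(
    "SELECT * FROM sales_with_weather ORDER BY day").fetchall()
print(rows)
# [('2024-01-01', 500.0, 38.5), ('2024-01-02', 320.0, 24.0)]
```

Power BI or Tableau then just reads `sales_with_weather` like any other table, and the join logic stays curated in one place.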
And, last but not least, the fourth addition to my toolset...
4. All the Generative AIs – to make you go faster
I use all the AIs. I love them. Since ChatGPT exploded, the growth in this space has been astounding, and I find myself using generative AI tools daily now. I don't stress about which one in particular (noting that, as a rule, I keep any sensitive information away from these tools). In my experience, the big generative AI tools are all good enough. I wouldn’t agonise about which one. I would agonise about how much human effort is wasted on stupid tasks for every second that I didn’t have an organisationally endorsed generative AI tool.
Personally, I use generative AI most often as a crutch when I’m coding something and I can’t remember how to say it in a given language, so the built-in Visual Studio Code AI tools like Amazon Q are my favourites. I switch between writing stuff in Power BI and SQL, I build apps in Flutter, and I occasionally use a bit of Python. And I’m not as young as I used to be. So rather than committing every command in every language to memory, I just ask whatever AI buddy is built into the tool that I’m using to point me in the right direction. I don't feel shame in admitting that I don't remember every command for every language I use. What has changed is that, rather than searching through reams of reference specs for languages, I find a solution to the task in a fraction of the time. The auto-complete functions that predict and complete your code for you are also AMAZING these days.
If I’m stuck on a complex task, I might throw a “I need to iterate through this list and find x, y and z. How might I do that in [insert language here]” at it and see how it goes. They aren’t perfect, but if they don’t nail it (which is rare), at least they might give you an idea of a method that might work. It's important to know the fundamentals, to be able to call BS on a solution that won't work, but it will usually take only a nudge to get them back on track. Back in the day, when we got stuck we would stop at an impasse and have a coffee (or a smoke, if that's your thing) and ponder how we might solve a tricky task. We might even sleep on finding the solution if a coffee break wasn't long enough. Now I just get the machine to tell me how, go and enjoy my coffee, and then continue on my merry way.
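For what it's worth, the answer to that kind of "iterate through this list and find x, y and z" prompt usually comes back as something like this. This is my own made-up Python example, not actual AI output, with invented conditions standing in for x, y and z:

```python
# Find the first item matching each of several conditions,
# using next() with a generator expression (None if no match).
readings = [3, 8, 15, 22, 7, 40]

first_even = next((r for r in readings if r % 2 == 0), None)
first_over_20 = next((r for r in readings if r > 20), None)
first_multiple_of_5 = next((r for r in readings if r % 5 == 0), None)

print(first_even, first_over_20, first_multiple_of_5)  # 8 22 15
```

Nothing ground-breaking, but it beats digging through reference docs mid-task to remember the `next(..., default)` idiom.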
I also use it when I find that I’m going to be slow in reaching a solution. We’re human; we get tired and distracted and have personal lives, but supplementing our brainpower with AI can help us maintain a baseline performance, rather than ebbing and flowing. Saving a few minutes on something that isn’t core is just good business. Like, for instance, generating an image of a supermarket with too many haircare products.
Also remember, these super-smart generative AI robot overlords might be our rulers one day, so be polite, and drop a “please and thank you” in there.