Do you know your data? Traps you can hit in the fast-growing AI world.
It has been a real pleasure to host Prof. Paolo Dell’Olmo, Director of the Master in Data Intelligence and Decision Strategy (DISD), at Cisco Italy to talk about Big Data, Artificial Intelligence, the relevance of human experience and, finally, the result we achieved in our Master Project: identifying a predictive and prescriptive model for employment based on past training and hiring data (more info to follow about this in a future post with Vincenzo and Giovanni, my Master-Project mates).
Prof. Dell’Olmo highlighted some key points that I strongly believe in as well:
- First, data. It’s very rare to have a clear, clean and meaningful set of data ready to give insight into your decision problem. In real-life projects, it’s very likely you’ll need to spend the largest share of the project time cleansing data of all sorts of issues (e.g. nulls, inconsistencies, etc.). In broader terms, that means you NEED TO KNOW YOUR DATA. This is necessary not only to have a clean set to start from, but also to understand which (set of) algorithm(s) is best suited to transform your data into the information and knowledge you’re looking for.
- Algorithms. It should be common sense to understand what you are using to solve a problem, shouldn’t it? Well, it is not. Lots of people I have met follow a specific path just because of the common idea that Machine Learning or Deep Learning can be trained to do everything. Even if that were the case, and it is not, what would you expect when the data changes? What if different patterns appear? The result is, often, a very bad output from your model and, possibly, a big issue in your project. Moreover, several powerful algorithms work as black boxes (e.g. neural networks), so they do not necessarily tell you which variables are the most relevant in your analysis, and this is something you definitely want to know.
- Visualization: an image is worth 10,000 tables. Something I learnt during the master is to use live data visualization techniques (e.g. dynamic alluvial flows). They can be so powerful that I often consider using them instead of standard images and graphs in slides during presentations. Data storytelling is a must!
- Human experience. This is the real enabler that ties all of the above into meaningful and valuable results. There is no artificial or machine-driven substitute for the human capacity to understand the domain and the context of a problem, to explore and compare multiple paths to a solution, and to seek the best solution in practice, which may not necessarily be the most mathematically optimized one. At the end of the day, there is no one with a bigger vested interest than ourselves in solving our problems and satisfying our own needs.
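To make the first point a bit more concrete, here is a minimal sketch of what a first "know your data" pass could look like in Python. Everything in it is an assumption made for illustration: the records, the column names and the two helper functions (`profile`, `inconsistent_spellings`) are hypothetical and are not taken from our Master Project. It only counts missing values per column and flags values that differ just by case or surrounding whitespace.

```python
from collections import Counter

# Hypothetical records, invented for illustration only.
records = [
    {"candidate_id": 1, "course": "Data Analysis", "hired": "yes"},
    {"candidate_id": 2, "course": "data analysis ", "hired": "Yes"},
    {"candidate_id": 3, "course": None, "hired": "no"},
    {"candidate_id": 4, "course": "Networking", "hired": ""},
]

# Assumption: None and the empty string both count as "missing".
MISSING = (None, "")


def profile(rows):
    """Return, per column, the number of missing values and a count of raw values."""
    columns = sorted({key for row in rows for key in row})
    report = {}
    for col in columns:
        values = [row.get(col) for row in rows]
        present = [v for v in values if v not in MISSING]
        report[col] = {
            "missing": len(values) - len(present),
            "distinct": Counter(present),
        }
    return report


def inconsistent_spellings(counter):
    """Group string values that differ only by case or surrounding whitespace."""
    groups = {}
    for value in counter:
        if isinstance(value, str):
            groups.setdefault(value.strip().lower(), set()).add(value)
    # Keep only the groups that really contain more than one spelling.
    return {key: variants for key, variants in groups.items() if len(variants) > 1}


report = profile(records)
for col, stats in report.items():
    print(col, "| missing:", stats["missing"],
          "| inconsistent:", inconsistent_spellings(stats["distinct"]))
```

A pass like this does not clean anything by itself. It just shows where the problems are, so that a human who knows the domain can decide, for example, whether "Yes" and "yes" really mean the same thing.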
Thanks again Prof. Dell’Olmo for your time, your presentation and, personally, your guidance in the past couple of years.
Digital innovation and analytics consultant
I totally agree with your points, especially the last one. In my brief experience, though, I'm seeing that much of a project's time is spent on setting expectations about what Machine Learning, Deep Learning, and AI in general can and cannot do. Their outputs are informative support, not actionable solutions. The latter still fall within the competence of human decision-makers.