What makes a data problem interesting, unique and challenging?
A "complex data challenge" sounds really cool and intense, but let's take a step back and understand what complex really means. Here is my take on what makes a data problem "Complex". My thoughts are shaped by my experiences, by learnings from multiple sources and articles, and by some personal challenges/roadblocks that have served as learning lessons.
In my opinion, complexity is what makes data problems Unique!
- Complex is DIVERSIFIED: Multiple data sources and formats, where each piece of the puzzle speaks a different language. This requires strategic data transformation, migration and consolidation.
- Complex is CHANGING: Changes are good, but not when they stem from wavering mindsets or rapidly changing priorities. This creates challenges when the data preprocessing needs to be redone to accommodate changes in analytics scope and vision.
- Complex is AMBIGUOUS: Unclear goals, unclear asks, unknown destinations or rocky communications are extremely common signs of a complex data problem, and this part needs the most time and effort. Leadership often has a blurry idea and needs someone to solidify the vision based on data-driven recommendations and insights. Such problems are not only challenging, but also extremely interesting, and they provide a base for learning, development and professional growth! One of my personal FAVs :)
- Complex is LARGE: We are in an era where collecting information has been simplified beyond bounds. This makes it harder to filter the signal from the noise, i.e. to focus only on relevant data. Large volumes also introduce computational challenges, increase processing times and add to the operational cost.
- Complex is MESSY: Classic messy data examples include incomplete data, duplicate or missing values, unclear column definitions and incorrect values. Messiness can also come from incorrect updates, broken data pipelines, etc.
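To make the MESSY point a bit more concrete, here is a minimal sketch of a first-pass check for missing and duplicate values. It uses only the Python standard library. The function name `profile_records`, the field names and the sample rows are all made up for illustration, so treat it as a starting point rather than a complete data-quality tool.

```python
from collections import Counter


def profile_records(records, required_fields):
    """Return simple messiness indicators for a list of dict records.

    Hypothetical helper for illustration only.
    """
    # Count, per field, how many rows have a missing or blank value.
    missing = Counter()
    for row in records:
        for field in required_fields:
            value = row.get(field)
            if value is None or (isinstance(value, str) and not value.strip()):
                missing[field] += 1

    # A row counts as a duplicate when the same combination of
    # required-field values has already been seen.
    seen = set()
    duplicates = 0
    for row in records:
        key = tuple(row.get(field) for field in required_fields)
        if key in seen:
            duplicates += 1
        else:
            seen.add(key)

    return {
        "rows": len(records),
        "missing": dict(missing),
        "duplicates": duplicates,
    }


# Made-up sample data: one blank city repeated twice and one missing sales value.
sample = [
    {"id": 1, "city": "Austin", "sales": 120},
    {"id": 2, "city": "", "sales": 95},
    {"id": 2, "city": "", "sales": 95},
    {"id": 3, "city": "Denver", "sales": None},
]

report = profile_records(sample, ["id", "city", "sales"])
print(report)
```

Running a check like this before any analysis gives a quick sense of how much cleanup a dataset is likely to need. Incorrect values and unclear column definitions still need domain knowledge and cannot be caught by counting alone.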
I hope this helps you understand data problems from a holistic perspective and identify challenges with the data proactively, which in turn gives you a clearer thought process at the start of the data analytics process.