Data Preparation is Like Sharpening the Axe
Photo by Matthew DeVries

Data analysis is a systematic process that seeks to interpret the information conveyed in data. From data analysis, we can decipher trends, make intuitive observations and statistical inferences, and, with good models, make near-accurate predictions.

Contrary to popular belief, data analysis is an ancient discipline. In the past, it depended heavily on manual procedures for collecting and analyzing data. Technological advances have since provided more efficient and powerful tools that have revolutionized how we interact with data.

Yet data analysis is still comparable to the age-old tradition of felling trees with an axe. Here is how.

Data analysis involves several steps, with data preparation at the forefront. As budding analysts will inevitably discover, the preparation stage accounts for the bulk of a data analyst’s work: it can take up to 60% of a data scientist’s time. Despite being time-consuming, tedious, and downright unenjoyable, data preparation is the backbone of the entire process.

We can liken data preparation to the sharpening of the axe. A lumberjack was once asked what he would do if given 5 minutes to chop down a tree. “I would spend 3 of those sharpening the axe,” he said.

A haphazard preparation of data will lead to a sweaty analysis full of cursing and regret.

Do not rush the preparation process. Get all the data you need and organize it properly before jumping into the analysis. While we cannot spend our entire lifetime sharpening the axe, a sharper axe will save us a great deal of swinging.

This is the best way to look at the preparation process: not as a step we hope to get out of the way quickly, but as the one ultimately responsible for the kind of results we get. So sharpen your axe adequately. Plan your data collection and be thorough in making sure you have the right data in the right place. The benefits of proper preparation far outweigh the sweat of organizing the data.

Bazil Masabo


