The Effort Behind an Algorithm
Polly Von Dollen
Advancing skills in parenting while figuring out my next career move
We all like the idea of magic: some powerful spell that can solve our (business) problems for us. Unfortunately, this just doesn’t exist. Almost always there is a person behind that magic, doing a lot of manual work. This is as true in data science as anywhere else.
You’ve likely heard about algorithms and machine learning. What you may not be fully aware of is how much manual effort is involved in building a great algorithm. By definition, an algorithm is simply a process or set of rules to follow in a problem-solving operation.
We often forget how unintelligent computers really are. An algorithm is only as effective as the manual effort put in up front to determine the best process or set of rules.
As an exercise, let’s examine a decision that we make on a fairly regular basis:
Should I fill my car with gas today or tomorrow?
Take two minutes and write down the steps involved in making this decision. I’ll guess you wrote something along these lines:
Even with just these three seemingly simple questions, there is now a complex web of information needed to make this decision. Check it out:
What’s interesting here is that there are only two outcomes to this decision: Either I get gas today or tomorrow. How I arrive at that outcome can look very different. The scenarios above are just a few of the possible pathways to reach one of two outcomes.
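To make the idea of "many pathways, only two outcomes" concrete, here is a minimal Python sketch. The three inputs (`range_left_miles`, `miles_planned_tomorrow`, `can_pay_today`) and the order of the rules are my assumptions for illustration only; they are not the exact questions or scenarios from the exercise above.

```python
from dataclasses import dataclass


@dataclass
class Situation:
    # All three fields are illustrative assumptions, not a definitive list.
    range_left_miles: float        # how far the car can go on the current tank
    miles_planned_tomorrow: float  # driving expected before a fill-up tomorrow
    can_pay_today: bool            # whether there is money for gas today


def when_to_get_gas(s: Situation) -> str:
    """Return 'today' or 'tomorrow'; several different paths lead to each."""
    if not s.can_pay_today:
        return "tomorrow"  # path 1: no money today, so the choice is forced
    if s.range_left_miles < s.miles_planned_tomorrow:
        return "today"     # path 2: the tank will not cover tomorrow's driving
    return "tomorrow"      # path 3: enough fuel, so there is no need to stop today


print(when_to_get_gas(Situation(30, 50, True)))
```

Even in this tiny version, two quite different situations (no money today, or plenty of fuel) end at the same answer of "tomorrow", which is the point about pathways versus outcomes.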
When data scientists are tasked with developing an algorithm to systematically make a decision, i.e. automate it, a ton of effort goes into understanding how that decision gets made. The better we understand how the decision is made, the better we can implement an algorithm that produces the optimal outcomes (assuming the data is available).
There are often many underlying assumptions being made that also need to be addressed. In our example, I’m assuming:
If you’re reading this and metaphorically yelling: “You’re forgetting so many other factors that influence getting gas today or tomorrow!” I’m not forgetting. Even decisions that call for a simple yes or no require a lot of information. We hope to have all the necessary information, but realistically we often don’t.
If a small amount of information is missing, we can likely still make the decision and be satisfied with the outcome. When a large amount of information is missing, we might as well flip a coin to determine the outcome.
In addition, some pieces of information are more important than others when making a decision. Knowing how much I will drive tomorrow may not be as important as knowing whether I have enough money to buy gas.
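One rough way to picture both ideas together (missing information, and some information mattering more than the rest) is to give each piece of information an importance weight and check how much of the total importance we actually have. This is only a sketch: the weights, the 0.7 cutoff, and the `gas_price_trend` factor are hypothetical values I picked for the example, not measured ones.

```python
# Hypothetical importance weights (whole numbers, higher = matters more).
IMPORTANCE = {
    "can_pay_today": 6,
    "miles_planned_tomorrow": 3,
    "gas_price_trend": 1,
}


def information_coverage(known: dict) -> float:
    """Share of the total importance covered by the facts we actually have."""
    total = sum(IMPORTANCE.values())
    covered = sum(
        weight
        for name, weight in IMPORTANCE.items()
        if known.get(name) is not None  # a value of False still counts as known
    )
    return covered / total


def decide_or_flip(known: dict, minimum_coverage: float = 0.7) -> str:
    """Say whether we know enough to decide, or might as well flip a coin."""
    if information_coverage(known) >= minimum_coverage:
        return "decide"
    return "flip a coin"


mostly_known = {"can_pay_today": True, "miles_planned_tomorrow": 40}
barely_known = {"gas_price_trend": "rising"}
print(decide_or_flip(mostly_known), "/", decide_or_flip(barely_known))
```

Missing only the low-weight factor still leaves enough to decide, while knowing only the low-weight factor does not.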
Knowing the possible answers and mapping out the different pathways to an outcome produces a better algorithm than skipping that work.
For data scientists, the work is not only understanding how a decision is made but also connecting the information the decision needs to the available data that represents it. When there aren’t enough data points to represent that information, the algorithm will be sub-optimal.
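A simple way to sketch that connection is a table that maps each piece of information the decision needs to the data field that stands in for it, and then lists whatever has no data behind it. The field names below (`bank_balance`, `calendar_trip_miles`) are hypothetical placeholders, not real data sources.

```python
# Hypothetical mapping from each piece of information the decision needs
# to the data field that represents it (None means no data is available).
INFO_TO_DATA = {
    "money available for gas": "bank_balance",
    "driving planned tomorrow": "calendar_trip_miles",
    "fuel left in the tank": None,
}


def unrepresented_information(mapping: dict) -> list:
    """Return the information needs that no available data field covers."""
    return sorted(name for name, field in mapping.items() if field is None)


print(unrepresented_information(INFO_TO_DATA))
```

Every item in that list is a place where the algorithm would have to guess, which is where the sub-optimal outcomes come from.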
Agree? Disagree? Let me know in the comments.
Love this. Agree with your point about data scientists needing to understand how the decision is made. Data-driven solutions need to be built with a clear understanding of the business problem and, preferably, related aspects of the business itself.