The Importance of Small Early Wins (SEWs) in Data Science Problems

Data science problems can be difficult because they often depend on three things: the availability of data, the applicability of that data, and the end game, that is, what can be done with the conclusions drawn from the data. Availability concerns whether data is readily accessible in a database, can be gathered through surveys, or can be constructed in some other manner. Applicability concerns whether the data truly fits the situation or can at least serve as a proxy for it. Take, for example, a data science problem that seeks to determine the best running back in the National Football League (NFL).

The data is readily accessible from many websites and other sources. Applicability comes into play when asking whether the best running back is simply the back who has rushed for the most yards. Total rushing yards could be the measure, or a different measure could be used: total yards from scrimmage, which combines rushing yards and receiving yards. In the 90s, Detroit Lions running back Barry Sanders would have been considered the best running back by rushing yards, yet Buffalo Bills running back Thurman Thomas would have been considered the best by total yards from scrimmage.
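The effect of the chosen measure on the answer can be sketched in a few lines of Python. This is only an illustration: the two backs and their season totals are made-up placeholders, not real NFL statistics, and the function names are hypothetical.

```python
# Hypothetical season totals for two made-up backs (NOT real NFL statistics).
backs = {
    "Back A": {"rushing_yards": 1500, "receiving_yards": 200},
    "Back B": {"rushing_yards": 1300, "receiving_yards": 600},
}


def rushing_yards(stats):
    """Measure 1: rushing yards only."""
    return stats["rushing_yards"]


def yards_from_scrimmage(stats):
    """Measure 2: rushing yards plus receiving yards."""
    return stats["rushing_yards"] + stats["receiving_yards"]


def best_back(backs, measure):
    """Return the name of the back with the highest value of the measure."""
    return max(backs, key=lambda name: measure(backs[name]))


best_by_rushing = best_back(backs, rushing_yards)
best_by_scrimmage = best_back(backs, yards_from_scrimmage)
print("Best by rushing yards:", best_by_rushing)
print("Best by yards from scrimmage:", best_by_scrimmage)
```

With these placeholder numbers the two measures name different backs, which is the applicability question in miniature: the data is the same, but the "best" player depends on which measure is treated as the proxy.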

The end game, in the running back comparison, is the decision itself: which back would a coach take if both were available, say in free agency? That decision would also take into consideration external factors such as offensive strategy and overall team strategy, as well as individual attributes such as rushes for negative yards and fumbles, and the other players on the roster. A team such as the Bills may have favored Thomas because of their pass-heavy attack, which required running backs to catch many passes out of the backfield, whereas the Lions, who featured less capable quarterbacks, may have preferred the generally more elusive Sanders. In the SEWs scenario, a team just starting out as an expansion team would likely prefer Sanders, an elusive player who would likely be able to win more games on his running ability alone. This essay looks at the effectiveness of SEWs in data science problems.

Karl E. Weick examined SEWs, stating that “a small win is a concrete, complete, implemented outcome of moderate importance...one small win may seem unimportant. A series of wins at small but significant tasks, however, reveals a pattern that may attract allies, deter opponents, and lower resistance to subsequent proposals” (Weick, 1984).

Data science decisions can seem counterintuitive, even illogical, to a reasonable employee, manager, or senior executive. In the running back example, it may seem illogical to pass on both Sanders and Thomas and instead take a larger back who fumbles less and has no rushes for negative yards but provides little in the passing game. However, if that back fits a power rushing scheme and an overall strategy of ball control, tough defense, and low-scoring affairs, he may be the better choice.

Further, Weick mentions that William Ruckelshaus, an administrator of the Environmental Protection Agency (EPA), “identified quick, opportunistic, tangible first steps only modestly related to an outcome. The first steps were driven less by logical decision trees, grand strategy, or noble rhetoric than by action that could be built upon, action that signaled intent as well as competence” (Weick, 1984).

Relating Weick and Ruckelshaus to data science, a problem should first be framed in terms of actions whose outcome can be identified. In the football example, the outcome is winning football games. What are the variables that lead to this outcome? In my research, turnovers and overall offensive yards are the variables most associated with wins. Thus, we must find the players who best drive these variables. In this case, we need players who produce the most yards without causing turnovers, i.e., running backs who gain the most yards per carry and do not fumble, and quarterbacks who throw for the most yards per attempt and do not throw interceptions.
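As a rough sketch of how the running back half of that selection rule might be expressed, the Python below ranks backs by yards per carry after excluding those who fumble too often. The players, their numbers, and the 1% fumble-rate cutoff are all assumptions made for illustration; none of them come from the research mentioned above.

```python
# Hypothetical running backs with made-up numbers (NOT real statistics).
backs = [
    {"name": "Back A", "carries": 300, "rushing_yards": 1500, "fumbles": 6},
    {"name": "Back B", "carries": 250, "rushing_yards": 1000, "fumbles": 1},
    {"name": "Back C", "carries": 200, "rushing_yards": 900, "fumbles": 1},
]

# Assumed cutoff: at most one fumble per 100 carries.
MAX_FUMBLE_RATE = 0.01


def yards_per_carry(back):
    """Average rushing yards gained per carry."""
    return back["rushing_yards"] / back["carries"]


def fumble_rate(back):
    """Fumbles per carry."""
    return back["fumbles"] / back["carries"]


def rank_backs(backs, max_fumble_rate=MAX_FUMBLE_RATE):
    """Drop backs who fumble too often, then sort by yards per carry."""
    eligible = [b for b in backs if fumble_rate(b) <= max_fumble_rate]
    return sorted(eligible, key=yards_per_carry, reverse=True)


ranked_names = [b["name"] for b in rank_backs(backs)]
print(ranked_names)
```

In this made-up data, Back A has the highest yards per carry but is excluded for fumbling, so the ranking favors the backs who gain yards without turning the ball over. The same pattern would apply to quarterbacks, with yards per attempt and interceptions in place of yards per carry and fumbles.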

In conclusion, to utilize SEWs in data science problems, we need to isolate the factors or variables that are necessary for success and potentially allow for tangential failures along the way. Just as a “three and out” without a fumble or an interception may not lead to immediate success or points in a football game, a comprehensive offensive attack that consistently gains yards without turnovers is paramount to winning the game, which is the outcome that matters.

Works Cited

Weick, K. E. (1984, January). Small Wins: Redefining the Scale of Social Problems. American Psychologist, pp. 40-49.

Matthew Reese