Why we do what we do at smarterscout.com
Daniel Altman
Economist | Writer | Early-stage investor | Executive producer | Founder | Soccer guy
I get a lot of questions about the metrics we use at smarterscout.com, in part because they're pretty different from the ones you might see on other soccer/football stats sites. We have a lot of good reasons for this, so in our spirit of openness I thought I'd share the thinking behind some of our unique practices:
(1) Our most important metrics are based on mathematical models of winning. The team that scores more goals will win a match. For our models of attacking output and defending quality, we use two models that try to connect every action – both attacking and defending – to that outcome. Other types of metrics can be descriptive, but unless they're directly and mathematically connected to winning, we can't say how important they are to that objective.
(2) We think young players are different. We use a completely separate algorithm to flag the young players with the greatest chances of going on to star at a higher level (our smarterscout young prospects). Looking for markers at each position among actions that are persistent over time offers a better indicator than performance in our models. Moreover, there may be more than one potentially successful profile at each position.
(3) We mainly use counting stats to quantify style. Frequencies of actions can tell you about a player's tendencies. But only a mathematical model can tell you each action's contribution to winning. So while the frequency of a player's tackles can tell you a bit about a style of play, it can't necessarily tell you anything about winning until you assign each tackle a value that's directly connected to scoring or conceding goals.
(4) We rarely use per 90' stats. Getting denominators right is crucial in sports analytics. It's hard to attack if your team doesn't have the ball. It's unusual to defend when it does. So why count attacking and defending actions per 90'? We prefer to use time in possession as the denominator for attacking actions and time out of possession for defending actions.
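To see why the denominator in point (4) matters, here is a minimal sketch in Python. The players and numbers are made up, the function names are hypothetical, and the "per 30 minutes out of possession" scale is just a convenient choice for illustration, not necessarily the one smarterscout uses. It also assumes, for simplicity, that every minute on the pitch is spent either in or out of possession.

```python
def per_90(actions, minutes_played):
    """Actions per 90 minutes on the pitch."""
    return 90.0 * actions / minutes_played


def per_30_out_of_possession(actions, minutes_out_of_possession):
    """Defending actions per 30 minutes in which the opponent had the ball."""
    return 30.0 * actions / minutes_out_of_possession


# Hypothetical players: (tackles, minutes played, team's share of possession)
players = {
    "Player A": (27, 900, 0.60),  # plays for a team that dominates the ball
    "Player B": (27, 900, 0.40),  # plays for a team that rarely has it
}

rates = {}
for name, (tackles, minutes, possession) in players.items():
    # Simplifying assumption: time not in possession is time spent defending.
    minutes_defending = minutes * (1.0 - possession)
    rates[name] = (
        per_90(tackles, minutes),
        per_30_out_of_possession(tackles, minutes_defending),
    )
    print(f"{name}: {rates[name][0]:.2f} per 90', "
          f"{rates[name][1]:.2f} per 30' out of possession")
```

Both players post the same tackles per 90', but Player A had far fewer minutes in which to make them, so the possession-based rate separates the two.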
(5) We use league adjustments. Winning a tackle in the Brasileirão is not as hard as winning one in the Premier League. A great tackler in Brazil might be an average one in England's top tier. So we use the enormous network created by a decade of player transfers between leagues to calibrate a league adjustment for tackling. But dribbling might be different, and that's why we adjust each metric individually.
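The idea in point (5) can be illustrated with a toy example. This is not smarterscout's actual algorithm: the leagues, the transfer data, the function name league_difficulty, and the model itself (a player's rate for one metric shrinks by a fixed factor when moving to a harder league) are all assumptions made for the sketch. It fits one difficulty value per league by least squares, using simple coordinate-wise averaging over the transfer network.

```python
import math


def league_difficulty(transfers, reference, sweeps=200):
    """Estimate a log-difficulty per league for ONE metric.

    transfers: list of (from_league, to_league, rate_before, rate_after).
    Assumed model: log(rate_before / rate_after) = d[to] - d[from].
    The reference league is pinned at 0; moves within a league are not handled.
    """
    leagues = sorted({league for t in transfers for league in t[:2]})
    d = {league: 0.0 for league in leagues}
    for _ in range(sweeps):
        for league in leagues:
            if league == reference:
                continue
            implied = []
            for src, dst, before, after in transfers:
                drop = math.log(before / after)  # positive if dst is harder
                if dst == league:
                    implied.append(d[src] + drop)
                elif src == league:
                    implied.append(d[dst] - drop)
            if implied:
                # Least-squares update for this league, holding the others fixed.
                d[league] = sum(implied) / len(implied)
    return d


# Made-up tackling rates for players who moved between hypothetical leagues.
transfers = [
    ("League B", "League A", 3.0, 2.0),
    ("League C", "League A", 4.0, 2.0),
    ("League C", "League B", 4.0, 3.0),
]
difficulty = league_difficulty(transfers, reference="League A")

# Express a League C tackling rate in League A terms.
adjusted = 4.0 * math.exp(difficulty["League C"])
print(difficulty, round(adjusted, 2))
```

Running the same fit separately on dribbling data would produce its own set of league values, which is the point of adjusting each metric individually.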
(6) We don't use radar plots. Radar plots sure are popular, but they can be misleading. Changing the order of the axes can greatly alter the shape and area of the polygon created by the data, and the eye is naturally drawn to bigger areas. Also, equal changes in the data on one axis can change the area by different amounts. So instead, we use spider (or flower) plots to show players' styles.
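Both problems in point (6) are easy to check numerically. The sketch below uses made-up scores on six equally spaced axes and computes the area of the radar polygon as the sum of the triangles between neighboring axes. The function names and the numbers are illustrative only.

```python
import math
from itertools import permutations


def radar_area(values):
    """Area of a radar polygon with equally spaced axes (needs 3+ axes)."""
    n = len(values)
    wedge = math.sin(2 * math.pi / n)
    # Each pair of neighboring axes spans a triangle of area 0.5 * r1 * r2 * sin(angle).
    return 0.5 * wedge * sum(values[i] * values[(i + 1) % n] for i in range(n))


def area_gain(values, axis, delta):
    """Change in polygon area when one axis value rises by delta."""
    bumped = list(values)
    bumped[axis] += delta
    return radar_area(bumped) - radar_area(values)


# Same six (made-up) scores, every possible ordering of the axes.
scores = [0.9, 0.2, 0.8, 0.3, 0.7, 0.1]
areas = [radar_area(order) for order in permutations(scores)]
print(f"smallest area: {min(areas):.3f}, largest area: {max(areas):.3f}")

# Same +0.1 improvement, applied to two different axes of one profile.
profile = [0.9, 0.8, 0.2, 0.1, 0.5, 0.5]
gain_axis_0 = area_gain(profile, 0, 0.1)  # neighbors are 0.5 and 0.8
gain_axis_3 = area_gain(profile, 3, 0.1)  # neighbors are 0.2 and 0.5
print(f"gain on axis 0: {gain_axis_0:.3f}, gain on axis 3: {gain_axis_3:.3f}")
```

The data never change in the first comparison, only the order of the axes, yet the area does. In the second, an identical improvement adds more area when the neighboring axes happen to have high values.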
(7) We try not to give a false sense of precision. Most of our metrics are listed with two significant figures, because they're based on comparisons of hundreds of players. Because they're not estimates but rather statistics generated algorithmically, they don't have standard errors in the strictest sense. But we display our metrics with different font colors depending on how confident we are; the metrics based on the most minutes played have the darkest font.
(8) We don't have any black boxes. We don't think black-box metrics are easy for our users to trust, nor do we think they're particularly trustworthy. Without a model that makes intuitive sense at its core, a metric is less likely to be consistent and robust from season to season. So our lengthy FAQ describes how we calculate our metrics, and we discuss them on social media as well.
(9) We spend a lot of time on our visuals. Making our metrics beautiful takes more time, but it makes them more memorable, too. We're also always looking for ways to engage our users by adding interactive elements to our graphics.
(10) We don't claim to have the whole picture. If our eyes could tell us everything about a player, we wouldn't need statistics, and vice versa. Even if we prefer our eyes, statistics can still broaden our pool from a few dozen players to a few thousand, then help to filter them down.
So the next time you see someone comparing tackles won per 90' by a defensive midfielder in the Belgian First Division to tackles won by a defensive midfielder in the Bundesliga, take a moment to think about what it means. Would the player from Belgium be able to replicate his performance in Germany? Are these figures even comparable?
We think that the best way to push this conversation forward is to talk about our methods. You can sign up and read our FAQ to find out more about all of our metrics. And the best part is that it's free, as are our metrics from 27 top-tier leagues around the world. We hope you enjoy!