Wisdom of Crowds

A few years ago, I was in a math tutorial class with my son. The teacher asked each child in the class to estimate the number of pebbles in a glass jar and wrote the estimates on a screen. The children’s estimates ranged between 10 and 20. In the end, the teacher counted the pebbles in the jar and announced the actual number. I was curious about the exercise and asked about its purpose. The teacher explained it was intended to develop a sense of numbers. I was a bit disappointed with the answer, as I thought the exercise was meant to demonstrate the wisdom of the crowd: the mean of all the children’s estimates would be more accurate than a typical individual estimate.

This phenomenon was first documented in 1907, when Francis Galton analyzed 787 villagers’ estimates of the weight of an ox. The mean of all estimates was 1,200 pounds, just 2 pounds off the actual weight. Similar results have been found for many other questions, such as predicting the temperature a week ahead or estimating the distance between two cities. Except for questions requiring special skills or expertise, averaging the estimates from a crowd generally improves accuracy.

The process of making an estimate or judgement involves looking for cues in the evidence, matching those cues with past experiences, and proposing and selecting the estimate that forms the most coherent story. The process is subject to cognitive biases, such as substituting a difficult question with an easier, more accessible one, estimating under the influence of moods, and confirmation bias. These factors lead to variation in estimates among different people and in repeated estimates from the same person.

From a statistical perspective, the error of an estimate has two parts: bias and random variation. Random variation is also called noise: the inconsistency of estimates made under the same conditions. When the random variation is removed, the remaining difference between the estimate and the true value is the bias. When multiple estimates are made independently, taking their mean reduces the error variance in inverse proportion to the number of estimates. For example, the mean of 100 independent estimates has 1% of the variance of a single estimate; its standard deviation is therefore 10% of a single estimate’s, because the standard deviation is the square root of the variance.
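As a quick illustration of this square-root rule (the numbers here are made up for the example, not taken from any study), a minimal Python simulation might look like:

```python
import numpy as np

rng = np.random.default_rng(0)

true_value = 50.0     # the quantity being estimated (illustrative)
sigma = 10.0          # standard deviation of a single estimate
n_estimates = 100     # number of independent estimates averaged
n_trials = 10_000     # trials used to measure the spread of the mean

# Each trial averages 100 independent, unbiased estimates.
estimates = rng.normal(true_value, sigma, size=(n_trials, n_estimates))
means = estimates.mean(axis=1)

print(f"std of a single estimate: {sigma:.2f}")
print(f"std of the mean of 100  : {means.std():.2f} "
      f"(theory: {sigma / np.sqrt(n_estimates):.2f})")
```

The measured spread of the mean comes out close to 1.0, i.e. 10% of a single estimate’s spread, matching the reasoning above.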

Because people with diverse backgrounds tend to make independent estimates, a diverse group produces a less noisy result when its estimates are averaged. If such a group is not available, there is an alternative approach: after you make your first estimate, ask yourself what factors could make it wrong, take some time, and then make a second estimate. The mean of the two estimates improves accuracy as well [1].

Note that averaging estimates does not reduce the bias. However, once the variation is reduced, it becomes easier to see the bias, investigate its cause, and adjust future estimates to compensate for it.
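A small follow-up to the simulation above makes this concrete (again with made-up numbers): if every estimate overshoots by the same amount, averaging shrinks the noise but the systematic error stays.

```python
import numpy as np

rng = np.random.default_rng(1)

true_value = 50.0
bias = 5.0      # every estimate systematically overshoots by 5 (illustrative)
sigma = 10.0

# Averaging more and more biased estimates: noise shrinks, bias does not.
for n in (1, 10, 10_000):
    estimates = rng.normal(true_value + bias, sigma, size=n)
    error = estimates.mean() - true_value
    print(f"n={n:6d}  mean={estimates.mean():6.2f}  error={error:+.2f}")
```

As n grows, the error settles near +5: the bias survives aggregation, but it also becomes clearly visible once the noise is averaged away.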

In today’s age of data analytics and machine learning, the wisdom of crowds resembles ensemble methods, which aggregate multiple forecast models to achieve a better result. Two other factors should be considered:

  1. Although the simple average (mean) is the easiest aggregation method, the general form is a weighted sum in which each model output is multiplied by a weight (a minimal sketch follows this list).
  2. Forecast models can be of different types, such as linear regression models, decision trees, etc. Within each type, multiple models can be created using different parameter sets or data subsets. Generally, models of the same type will have more correlated outputs than models of different types.
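To make the first point concrete, here is a minimal sketch of the two aggregation forms; the model outputs and weights are hypothetical, chosen only for illustration.

```python
import numpy as np

# Hypothetical outputs of three forecast models for the same target.
model_outputs = np.array([102.0, 98.5, 110.0])

# Simple average: every model gets equal weight.
simple_average = model_outputs.mean()

# General form: a weighted sum with weights that sum to 1.
weights = np.array([0.5, 0.3, 0.2])   # arbitrary example weights
weighted_sum = weights @ model_outputs

print(f"simple average: {simple_average:.2f}")
print(f"weighted sum  : {weighted_sum:.2f}")
```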

When forecast models are assumed to be unbiased, it was found that [2]:

  1. The optimal weights are inversely proportional to the models’ error variances; intuitively, more weight should go to more accurate models (see the sketch after this list).
  2. When the number of models is small, each model’s accuracy matters more than the covariance among model outputs within each type. As the number grows, the reverse holds. This indicates that model diversity is more important than accuracy when aggregating a large number of model outputs.
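A minimal sketch of finding 1, assuming independent, unbiased models with known (here, made-up) error variances:

```python
import numpy as np

# Hypothetical error variances of three unbiased forecast models.
error_vars = np.array([1.0, 4.0, 9.0])

# Optimal weights are inversely proportional to the error variances,
# normalized to sum to 1; the most accurate model gets the most weight.
weights = (1.0 / error_vars) / (1.0 / error_vars).sum()

print("weights:", np.round(weights, 3))   # -> [0.735 0.184 0.082]
```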

When forecast models are biased, the two findings above still hold. In addition, it was found that [3]:

  1. Individual biases of forecast models always remain important regardless of the number of models being aggregated.

These results are derived by modeling each forecast as a random variable and calculating the expected error of the aggregated forecast. The covariance between the random variables captures how correlated the forecasts are: when covariance is small, the forecasts are less correlated and the group is more diverse.
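A minimal simulation of this setup, assuming unbiased forecasts with equal variance σ² and equal pairwise correlation ρ (a standard simplification for illustration, not the exact model of the cited papers): the variance of the average of n such forecasts is σ²(1 + (n − 1)ρ)/n, which tends to ρσ² as n grows, so correlation, not the number of models, eventually sets the floor.

```python
import numpy as np

rng = np.random.default_rng(2)
sigma2 = 1.0          # variance of each forecast (illustrative)
n_trials = 20_000

for rho in (0.0, 0.5):            # pairwise correlation between forecasts
    for n in (10, 100):           # number of forecasts averaged
        # Equicorrelated covariance matrix for n unbiased forecasts.
        cov = sigma2 * ((1 - rho) * np.eye(n) + rho * np.ones((n, n)))
        forecasts = rng.multivariate_normal(np.zeros(n), cov, size=n_trials)
        var_of_mean = forecasts.mean(axis=1).var()
        theory = sigma2 * (1 + (n - 1) * rho) / n
        print(f"rho={rho}  n={n:3d}  var of mean: {var_of_mean:.3f} "
              f"(theory {theory:.3f})")
```

With rho = 0 the variance keeps shrinking as 1/n, but with rho = 0.5 it stalls near 0.5 no matter how many forecasts are added, which is why diversity dominates accuracy for large groups.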

These results provide guidelines for aggregating a group’s judgements or forecasts. Regardless of the group’s size, aggregation can reduce variation but not bias; as the group size increases, it becomes more important to focus on the diversity of the group and the independence of the forecasts than on the accuracy of each individual forecast.

References

[1] Daniel Kahneman, Olivier Sibony, and Cass R. Sunstein, Noise: A Flaw in Human Judgement, chap. 21.

[2] PJ Lamberson and Scott E. Page, “Optimal Forecasting Groups,” Management Science, 58(4): 805–810. https://www.researchgate.net/publication/261974197_Optimal_Forecasting_Groups

[3] Clintin P. Davis-Stober, David V. Budescu, Stephen B. Broomell, and Jason Dana, “The Composition of Optimally Wise Crowds,” Decision Analysis, 12(3): 130–143. https://www.researchgate.net/publication/276269014_The_Composition_of_Optimally_Wise_Crowds


