Poor political predictions deliver analytics lessons for marketers

Failure to predict Trump and Brexit doesn't mean that 'data died'. But there are steps marketers can take to avoid similar black-swan events.

The 9th of November, 2016, is likely to remain etched in the memory of anyone involved in research and analytics. Not only did Donald Trump win the US presidential election, contrary to virtually all predictions; from an analyst's perspective, the situation was also a painful case of déjà vu, echoing the failure of surveys to predict the Brexit vote a few months earlier.

Both of these apparent failures of data hold valuable lessons for data-led marketing efforts.

In both cases, most pollsters had foreseen a victory for the political establishment. Nate Silver's FiveThirtyEight website, for example, put Donald Trump's probability of winning at only about 30 percent before the vote (a figure already much higher than most other forecasters'). The misses were particularly severe in the states Trump won, where predictions were off by 7.4 points on average.
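It is worth remembering that a probability is not a verdict: a 30 percent chance means the upset should still occur roughly three times in ten. The minimal simulation below is not FiveThirtyEight's model; it is a plain-Python sketch that only borrows the 30 percent figure quoted above (everything else is made up) to illustrate why such a forecast was never a guarantee of a Clinton win.

```python
# Minimal sketch (illustrative only): how often does a 30-percent event happen?
import random

random.seed(42)
win_probability = 0.30          # probability quoted in the paragraph above
trials = 100_000                # hypothetical repeated "elections"

upsets = sum(random.random() < win_probability for _ in range(trials))
print(f"Upset frequency: {upsets / trials:.1%}")   # roughly 30% of the time
```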

These failures led a number of critics to question the usefulness of analytics in general. Republican strategist Mike Murphy, for instance, lamented live on MSNBC: “Tonight, data died”. Other experts wondered why we were still “relying on polls to predict election results at all” or, transposing the issue into the business realm, asked whether Trump’s victory signaled “the end of data-driven decision making”.

Indeed, in the face of such fiascoes, it is legitimate to challenge the reliability of data science. It may work for physics or chemistry, but that does not mean it can be trusted in, or applied to, areas such as politics, marketing, advertising or sports. With the advent of so-called big data, we were inclined to believe that human behaviour was predictable, a notion that is now likely to come under closer scrutiny than ever before.

As we try to understand the reasons for the disconnect between forecasts and actual outcomes, one of the first objections that comes to mind is that the methods employed are perceived as unscientific. While purists may argue that “social science” does not qualify as a science in the first place, the element that makes political or business forecasting most questionable from an academic perspective is its lack of transparency and openness. Whereas in the traditional sciences the work of each member of the community is peer reviewed, the models used to produce such predictions are rarely examined by other experts before being published or used in decision making.

Furthermore, as these models are often proprietary or “closed” (think of a black box), it is difficult to build on one another’s contributions, as is the case in other disciplines.[6] 

Regardless of the scientific domain, however, the results yielded by any method can only ever be as good as the data they are based on. If the quality of the data is poor, it would be unfair to blame everything on the technique. Given the involvement of people in the collection and processing of the data, it is possible that the problems were due to human error (during the transcription of subjects’ responses into a spreadsheet, the integration of various tables and files, etc.).

Setting these factors aside, though, another cause of the incorrect predictions could be defective sample selection. Indeed, it seems that the polls may not have reached all likely voters, leading to distributions that did not match those of the entire population. When samples do not reflect the true makeup of the electorate in terms of demographics (gender, age, race, education, income, etc.), models based on them can hardly be applied to the rest of that population. Similarly, problems can emerge when samples are not large enough to stand in for the actual citizenry, for example in small cities, which, under certain circumstances, can still make a difference in the outcome of a vote.
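How much a skewed sample can matter is easy to see with a toy calculation. The sketch below is not any pollster's actual methodology; it is a minimal post-stratification example with hypothetical group sizes, support rates and census shares, showing how reweighting a sample to known population demographics can shift the estimate by several points.

```python
# Toy post-stratification example with made-up numbers: the raw sample
# over-represents college-educated respondents, so the unweighted average
# overstates support; reweighting to assumed population shares corrects it.

sample = {                      # group: (respondents, share supporting candidate A)
    "college":     (700, 0.55),
    "non_college": (300, 0.42),
}
population_share = {"college": 0.40, "non_college": 0.60}  # assumed census shares

total = sum(n for n, _ in sample.values())
unweighted = sum(n * p for n, p in sample.values()) / total
weighted = sum(population_share[g] * p for g, (_, p) in sample.items())

print(f"Unweighted estimate: {unweighted:.1%}")   # 51.1% -- skewed by the sample mix
print(f"Weighted estimate:   {weighted:.1%}")     # 47.2% -- matched to the population
```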

Yet even representative samples can prove insufficient when respondents lie, or are too embarrassed or shy, to express their true intentions in front of a pollster. This can happen when the answer is deemed “unacceptable” by the general public, such as preferring an antiestablishment candidate, or a man, or a white man. People who choose to give a “socially desirable” reply in a survey may behave differently in the voting booth, where privacy is guaranteed. This leads analysts to build erroneous models on biased data that depict an embellished, politically correct world, while the reality is quite different.
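The key point about such “shy respondent” effects is that they are systematic rather than random, so collecting more responses does not make them go away. The minimal simulation below uses made-up figures (50 percent true support, 6 percent of those supporters hiding their preference) purely to illustrate that the understatement persists whether the sample has a thousand respondents or a million.

```python
# Minimal sketch (illustrative numbers only): a systematic reporting bias
# does not shrink as the sample grows, unlike ordinary sampling error.
import random

random.seed(1)
true_support = 0.50   # assumed true share backing the "unacceptable" choice
shy_rate = 0.06       # assumed share of those supporters who hide it from pollsters

for n in (1_000, 1_000_000):
    reported = sum(
        (random.random() < true_support) and (random.random() > shy_rate)
        for _ in range(n)
    )
    print(f"n={n:>9}: reported support = {reported / n:.1%} (true: 50.0%)")
```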

That said, bias does not necessarily originate in the data. In the present case, it appears that the researchers themselves may have been the culprits. The Huffington Post, for instance, had forecast a 98 percent chance of victory for Hillary Clinton.[7] Given their fervent support for her campaign, it would not be surprising if they had (inadvertently) refused to question the results of their models, simply because they wanted to see Clinton win. Such confirmation bias, i.e., the “tendency to search for, interpret, favor, and recall information in a way that confirms one’s preexisting beliefs or hypotheses, while giving disproportionately less consideration to alternative possibilities”[8], may indeed have led analysts to ignore some of the signal in the available data, which may have been correct after all.

To be continued...

---------------------------

Article originally published in Campaign Asia on 18 April 2017
