Predictive Analytics for Preparedness

This brief article shares some preliminary analysis and reflections on whether data science can improve operational forecasting for humanitarians responding to people displaced by armed conflict in protracted emergencies.

A more in-depth article in a publication will likely follow with the full technical background and finalised conclusions. Still, I am hoping that this can be helpful for the broader humanitarian community. (Also, as I am relatively new to data science and programming, I would be grateful for any constructive feedback, and for your patience!)

---

Intro: Time for decision sciences for emergency response preparedness?

Despite over a billion dollars of humanitarian assistance delivered in only the last two years of Afghanistan's four-decade-long conflict, our use of quantitative tools for estimation is still minimal. New guidelines for projecting humanitarian needs for overall planning are being released, but these are unlikely to be helpful for operational decision-making.

How many people might we need to respond to in this province next month, given the change in the conflict there? Are we expecting these ongoing skirmishes to trigger a larger-scale displacement? Predictive models for these types of operational issues remain elusive. And yet, teams I managed were exhausted responding to hundreds of thousands of conflict-displaced Afghans in many of the darker corners of the country, frequently with too few staff and resources when it mattered.

Last year, I had the pleasure of working with Alcis, a leading GIS services provider, who had helped develop a machine learning 'image classification' model that could detect and plot tents in the informal camps where we were responding (their story: https://storymaps.arcgis.com/stories/d85e5cca27464d97ad4c1bad3da7f140). It prompted me to reconsider whether machine learning could support other operational aspects. I had previously considered conflict 'too messy' and complex a phenomenon, with data too biased, incomplete, and slow, to build effective statistical models.

However, as I found, machine learning tends to prioritise optimising the predictive power of models over gaining a deeper understanding of the statistical relationships and causes behind what is happening. Maybe it would be useful to evaluate the possibilities of decision sciences further?

We tend to receive information about the conflict very quickly, but identifying displaced people can take much longer. Even if a predictive model were only a steer that marginally improved our operational performance, helping us move an appropriate number of staff or supplies to an area, we should feel obliged to explore it further. It could improve the timeliness of our responses and the overall efficiency of our operations.

Evaluating different predictive models for province-level forced displacement

I decided to explore for myself whether predictive analytics could improve operations. Pulling, merging, and aggregating observational data from the Armed Conflict Location & Event Data Project (ACLED) and the UN Office for the Coordination of Humanitarian Affairs (OCHA) covering Jan 2017 to Dec 2019, I was able to develop and evaluate several models that might:

  1. Estimate the scale of displacement this month, based upon conflict conditions of the same month
  2. Forecast the level of displacement next month, based upon conflict conditions this and previous months
  3. Pre-empt large-scale displacements (for this, any displacement of over 1,000 people), particularly with simpler heuristics that could help field or emergency response managers
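To make the second and third tasks concrete, here is a minimal sketch of how monthly province records could be arranged for them. It is not the code used for this analysis: the record fields (`province`, `month`, `events`, `fatalities`, `displaced`), the function name, and the sample figures are all hypothetical, and it assumes every province has one record per month with no gaps. Only the 1,000-person threshold comes from the list above.

```python
from collections import defaultdict

# From the article: "any displacement of over 1,000 people".
LARGE_SCALE_THRESHOLD = 1000


def build_forecast_rows(records, n_lags=2):
    """Pair this and previous months' conflict features with NEXT month's displacement.

    Assumes one record per province per consecutive month (no gaps),
    with months written as "YYYY-MM" so that they sort chronologically.
    """
    by_province = defaultdict(list)
    for rec in records:
        by_province[rec["province"]].append(rec)

    rows = []
    for province, recs in by_province.items():
        recs.sort(key=lambda r: r["month"])
        # Each row needs n_lags earlier months and one following month.
        for i in range(n_lags, len(recs) - 1):
            features = {}
            for lag in range(n_lags + 1):
                features[f"events_lag{lag}"] = recs[i - lag]["events"]
                features[f"fatalities_lag{lag}"] = recs[i - lag]["fatalities"]
            target = recs[i + 1]["displaced"]
            rows.append({
                "province": province,
                "month": recs[i]["month"],
                "features": features,
                "displaced_next_month": target,
                "large_scale_next_month": target > LARGE_SCALE_THRESHOLD,
            })
    return rows


# Made-up figures for a single hypothetical province.
records = [
    {"province": "A", "month": "2019-01", "events": 5, "fatalities": 2, "displaced": 100},
    {"province": "A", "month": "2019-02", "events": 9, "fatalities": 7, "displaced": 400},
    {"province": "A", "month": "2019-03", "events": 20, "fatalities": 15, "displaced": 900},
    {"province": "A", "month": "2019-04", "events": 12, "fatalities": 6, "displaced": 1500},
]
rows = build_forecast_rows(records, n_lags=2)
print(rows[0])
```

The same rows can feed a regression (using `displaced_next_month`) or a classifier (using `large_scale_next_month`), which is roughly how the forecasting and pre-emption tasks differ.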

The data itself was highly skewed, and most predictive features (e.g. aspects of the conflict, such as number of IEDs, artillery/mortar fire, fatalities) were only weakly correlated with the primary outcome variable (number of displaced people).

Because the data were highly skewed, yet the outliers (which represented large-scale displacement events) were crucial to the predictive purpose of the project, statistical analysis was somewhat challenging. Further, missing a larger-scale displacement carries higher 'costs' (human, business). As small-scale displacement incidents outnumbered large-scale ones by about 10:1, performance metrics other than accuracy were also considered. For example, a model could predict that there would never be large-scale displacement and be correct about 9 times in 10, but this would miss the point.
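The point about accuracy can be shown with a few lines of arithmetic. The sketch below uses synthetic labels at roughly the 10:1 ratio mentioned above (not the real data) and a "model" that always predicts small-scale displacement; the function names are my own.

```python
def accuracy(y_true, y_pred):
    """Share of province-months labelled correctly."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def recall(y_true, y_pred):
    """Share of actual large-scale displacements that were flagged."""
    actual_positives = sum(y_true)
    if actual_positives == 0:
        return 0.0
    caught = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    return caught / actual_positives


# Synthetic labels at roughly a 10:1 ratio (1 = large-scale, 0 = small-scale).
y_true = [0] * 100 + [1] * 10

# A "model" that always predicts small-scale displacement.
always_small = [0] * len(y_true)

baseline_accuracy = accuracy(y_true, always_small)
baseline_recall = recall(y_true, always_small)

print(f"accuracy: {baseline_accuracy:.2f}")  # about 0.91
print(f"recall:   {baseline_recall:.2f}")    # 0.00
```

The accuracy looks respectable, yet the recall shows that every large-scale displacement was missed, which is why accuracy alone is a poor yardstick here.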

In particular, I developed predictive models using:

  • Supervised machine learning, using multiple linear regression (for estimating overall numbers of people displaced)
  • Time-series analysis, also utilising elements of linear regression, to forecast next month's figures of displaced people
  • Logistic regression analysis, to establish probabilities of large-scale displacement
  • Classification-based decision trees, to provide easy-to-follow 'fast and frugal' trees for field-based decision-makers
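For readers unfamiliar with 'fast and frugal' trees, the sketch below shows the general shape: a short sequence of yes/no cues, each of which can end the decision immediately. The cues are drawn from the kinds of conflict features mentioned earlier, but the thresholds, field names, and ordering are hypothetical placeholders, not the conditions found in this analysis.

```python
# Hypothetical placeholder thresholds, for illustration only.
FATALITIES_CUE = 50   # monthly fatalities in the province
ARTILLERY_CUE = 10    # monthly artillery/mortar incidents
IED_CUE = 5           # monthly IED incidents


def fast_frugal_tree(month):
    """Walk a short chain of cues; each cue can exit with a decision.

    Returns a (label, reason) pair, where label is "elevated" or "low"
    risk of large-scale displacement.
    """
    if month["fatalities"] >= FATALITIES_CUE:
        return "elevated", "fatalities at or above cue"
    if month["artillery_events"] >= ARTILLERY_CUE:
        return "elevated", "artillery/mortar incidents at or above cue"
    # The final cue has two exits, as in a standard fast-and-frugal tree.
    if month["ied_events"] < IED_CUE:
        return "low", "no cue triggered"
    return "elevated", "IED incidents at or above cue"


example = {"fatalities": 12, "artillery_events": 14, "ied_events": 1}
print(fast_frugal_tree(example))
```

The appeal for field managers is that such a tree can be read off a single card and checked against a situation report without any software.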

Findings Summary

In short, I found:

  • It is possible to improve the accuracy of estimates of the overall number of Afghans displaced by armed conflict in the same month by about 11-16 per cent against a null model (the mean), and the predictive power improves with larger displacements.
  • The fairly simplistic time-series analysis was insufficient to predict next month's displacement caseloads on the basis of current and previous conflict dynamics. This model type's potential might need to be revisited.
  • A 'tuned' logistic regression provided a relatively weak predictive model, although it was better than its null or randomised alternatives: it correctly identified two-thirds of the actual large-scale displacements (and missed a third), but only a third of its predictions of large-scale displacement were correct.
  • The classification decision trees, when the probability/odds outputs were taken, were quite interpretable and user-friendly. They could spell out certain conditions about the conflict that, if met, would indicate the chance of large-scale displacement.
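The two figures quoted for the logistic regression correspond to what are usually called recall (two-thirds of actual large-scale displacements were caught) and precision (one-third of the alarms were right). A minimal sketch of how both fall out of a confusion matrix follows; the counts are illustrative numbers chosen only to reproduce those ratios, not the counts from this analysis.

```python
def precision_recall(true_pos, false_pos, false_neg):
    """Compute precision and recall from confusion-matrix counts."""
    predicted_positive = true_pos + false_pos
    actual_positive = true_pos + false_neg
    precision = true_pos / predicted_positive if predicted_positive else 0.0
    recall = true_pos / actual_positive if actual_positive else 0.0
    return precision, recall


# Illustrative counts only (chosen to match the ratios in the findings).
true_pos = 20    # large-scale displacements correctly flagged
false_neg = 10   # large-scale displacements that were missed
false_pos = 40   # alarms raised where displacement stayed small

precision, recall = precision_recall(true_pos, false_pos, false_neg)
print(f"precision: {precision:.2f}")  # 0.33
print(f"recall:    {recall:.2f}")     # 0.67
```

In operational terms, a model like this would raise roughly two false alarms for every correct one, while still missing about one in three large-scale displacements.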

Although the data was split into training and testing sets for the modelling described above, I have also reviewed some preliminary information from January 2020 using the same models. Over this month, only one province observed a large-scale displacement: Nangarhar. The linear regression models predicted 2,410 - 5,700 people displaced given the conflict dynamics (depending on whether population density is adjusted for), whereas a null/mean estimate would only offer 690 people; the actual number of people displaced was about 3,710 during January (although the number had subsequently risen to 5,600 by early February). The classification decision tree would have offered a two-in-five likelihood of large-scale displacement.
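As a quick worked comparison, the snippet below takes the January 2020 Nangarhar figures quoted above and computes how far each estimate was from the roughly 3,710 people actually recorded. The labels and variable names are my own; this is a single province in a single month, so it illustrates the comparison and does not validate the models.

```python
# Figures quoted in the paragraph above (January 2020, Nangarhar).
actual = 3710
estimates = {
    "null (mean)": 690,
    "regression, lower figure": 2410,
    "regression, upper figure": 5700,
}

# Absolute error of each estimate against the recorded displacement.
abs_errors = {name: abs(value - actual) for name, value in estimates.items()}

for name, err in abs_errors.items():
    print(f"{name}: off by {err:,} people")
```

Both regression figures land closer to the recorded number than the mean-based estimate does, and the recorded number falls inside the range the two regressions span.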

Conclusion: Worthwhile for further exploration

There may be some promise in predictive modelling for improving operational decision-making in rapid / emergency response operations in protracted armed conflicts.

Although the models displayed here would need to be periodically re-fitted, they could enhance our overall efficiency, timeliness, and rational objectivity if incorporated into our operational cycles and contextualised procedures (although they would not replace current forms of needs analysis).

I would recommend further exploration, testing, and, potentially, careful piloting in one or more contexts.

Georges RADJOU

BIRD CEO - CONFERENCE CONSULTANT (BIRD IS A UNITED NATIONS REPRESENTATIVE WITH A SPECIAL CONSULTATIVE STATUS)

1 year ago

Hi, it is very good! I hope that world leaders can read your work more often to give a better future to Agenda for action 2030 in time of several crises. Thanks

Tom Keunen

Enabling integrated security risk management.

4 years ago

Certainly something to follow up!
