Notes from Dr. Gene, No. 3: Model to predict reopening and to conduct follow-up monitoring


On March 25th, I published my first comments and predictions on the coronavirus outbreak. On April 5th, I followed them up with my second notes. Since then, many people have asked me to publish my third notes on the outbreak. Many were especially puzzled, since my original predictions about the outbreak containment in the New York Metro area were confirmed. They thought it a bit strange that I would not want to “celebrate” such an outcome. I felt differently, as in April and May much bigger questions arose on when to reopen and how to monitor afterwards. Anyway, you asked for it, you got it, so here we go with a quite lengthy Notes No. 3.

During my two months of silence I was busy assembling an amazing multi-disciplinary team of researchers, physicians, managers, and students. Together we developed a breakthrough data science approach to forecast reopening and conduct follow-up monitoring of COVID-19 fatalities on the federal, state, county, and city levels. This work is our collaborative and volunteer-driven contribution to the ongoing efforts to protect lives and minimize fatalities while steps are taken to restart the American economy.

Given the variations in model-derived predictions, and despite striving towards a consensus, experts often disagree about the likely evolution of the pandemic. Debates on reopening the country continue to intensify and polarize public, business, and governmental decision makers in an already intense election year. It is therefore essential to develop new complementary approaches to pandemic modeling to ease tensions and find an evidence-based, rather than trial-and-error-based, pathway to recovery. This is why we propose democratizing the policy response through a user-centric, robust, SIR-driven (Susceptible, Infectious, Recovered) approach using Monte Carlo simulations of COVID-19 fatalities, one of the most significant policy determinants.

Two questions loom: “When will things return to a new ‘norm’?” and “What is the best way to monitor the rapidly changing situation at regional levels?” To address both, we propose a Monte Carlo model that simulates possible ranges of four key customizable parameters: basic reproduction number R0, infection fatality rate, weeks from infection to recovery/fatality, and weekly fatality threshold. The values for the first three parameters are obtained from the SIR disease spread models for New York State, the hardest-hit state. In other words, the outputs from other SIR models are used as the inputs for our Monte Carlo model. The fourth parameter is obtained from the Centers for Disease Control and Prevention (CDC) and is recalculated for the population sizes of different regions, including countries, provinces, states, counties, and cities. As a result, our model enables robust predictions of fatalities and can be used to monitor weekly outcomes.
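To make the idea concrete, here is a minimal Python sketch of such a Monte Carlo loop. It is an illustration under stated assumptions, not the model from our Excel file: the sampling ranges, the uniform distributions, the simple weekly SIR step, the proportional rescaling of the threshold, and all function names are hypothetical placeholders chosen for this example.

```python
import random
import statistics


def scale_threshold(reference_threshold, reference_population, population):
    """Rescale a weekly fatality threshold to another population size.

    Proportional scaling is an assumption made for this sketch.
    """
    return reference_threshold * population / reference_population


def simulate_weekly_fatalities(population, initial_infected, r0, ifr,
                               weeks_to_outcome, n_weeks):
    """Run a simple discrete-time SIR model with one-week steps.

    Returns the list of weekly fatalities. Assumes weeks_to_outcome >= 1,
    so that the weekly removal rate stays at or below 1.
    """
    gamma = 1.0 / weeks_to_outcome   # weekly removal (recovery/fatality) rate
    beta = r0 * gamma                # weekly transmission rate
    susceptible = float(population - initial_infected)
    infected = float(initial_infected)
    fatalities = []
    for _ in range(n_weeks):
        new_infections = min(susceptible,
                             beta * susceptible * infected / population)
        removed = gamma * infected
        susceptible -= new_infections
        infected += new_infections - removed
        fatalities.append(ifr * removed)  # a fraction of removals are fatal
    return fatalities


def first_week_below(fatalities, threshold):
    """First week index after the peak with fatalities below the threshold."""
    peak = max(range(len(fatalities)), key=fatalities.__getitem__)
    for week in range(peak + 1, len(fatalities)):
        if fatalities[week] < threshold:
            return week
    return None  # the threshold was not reached within the simulated horizon


def monte_carlo_reopening_weeks(population, initial_infected, threshold,
                                n_runs=1000, n_weeks=260, seed=0):
    """Sample the three epidemic parameters and collect candidate weeks."""
    rng = random.Random(seed)
    weeks = []
    for _ in range(n_runs):
        # Hypothetical ranges, for illustration only.
        r0 = rng.uniform(1.5, 3.0)
        ifr = rng.uniform(0.005, 0.015)
        weeks_to_outcome = rng.uniform(2.0, 4.0)
        series = simulate_weekly_fatalities(population, initial_infected,
                                            r0, ifr, weeks_to_outcome, n_weeks)
        week = first_week_below(series, threshold)
        if week is not None:
            weeks.append(week)
    return weeks


if __name__ == "__main__":
    # Illustrative inputs only; none of these numbers come from the article.
    threshold = scale_threshold(10.0, 1_000_000, 19_500_000)
    weeks = monte_carlo_reopening_weeks(19_500_000, 10_000, threshold)
    if weeks:
        print("median week below threshold:", statistics.median(weeks))
        print("range of weeks:", min(weeks), "to", max(weeks))
```

Changing the sampling ranges or the threshold and rerunning shows how sensitive the resulting range of weeks is to each assumption, which is the kind of experimentation the four customizable parameters are meant to support.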

To mitigate the effects of noise in individual data sources, we build our simulations on raw(!) daily data from three distinct sources: CovidTracking, University of Washington’s IHME, and, of course, Worldometers. We modeled data from the 10 U.S. states comprising the Northeast and West Coast Pacts. We then tested the predictions on Washington State’s King County and New York State’s Westchester County. Finally, our Monte Carlo model was tested, validated, and implemented by New York University students (taking my course) for all 50 U.S. states and 50 different countries, producing conservative estimates. For example, the predicted weeks for reopening were the week of May 25 for Washington State and the week of June 15 for New York State. For details, please visit our New York University COVID-19 site.

Our approach enables not only the end users (policy and decision makers, and healthcare, governmental, and business professionals) but also the public at large to perform three important activities. The first is to test different assumptions by modifying the four customizable parameters; the second is to robustly predict reopening; and the third is to reliably monitor weekly fatalities. Together, these three activities help mitigate both avoidable fatalities and the potential overburdening of healthcare systems.
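The third activity, weekly monitoring, can be sketched in a few lines of Python. This is a hedged illustration only: taking the per-day median across data sources, the 7-day grouping from the first day of the series, and the function names are assumptions made for this example, not a description of how our Excel file works.

```python
from statistics import median


def weekly_totals(daily_by_source):
    """Combine per-source daily fatality counts into weekly totals.

    daily_by_source maps a source name to a list of daily counts with
    aligned dates. Using the per-day median across sources is an
    assumption of this sketch, meant to damp noise in any single source.
    """
    series = list(daily_by_source.values())
    n_days = min(len(s) for s in series)
    combined = [median(s[day] for s in series) for day in range(n_days)]
    # Keep only complete 7-day weeks.
    last_full_day = n_days - n_days % 7
    return [sum(combined[start:start + 7])
            for start in range(0, last_full_day, 7)]


def flag_weeks(totals, threshold):
    """Mark each week whose total exceeds the weekly fatality threshold."""
    return [total > threshold for total in totals]


if __name__ == "__main__":
    # Made-up numbers, for illustration only.
    daily = {
        "source_a": [3, 4, 2, 5, 3, 4, 2, 1, 1, 2, 0, 1, 1, 0],
        "source_b": [4, 4, 3, 5, 2, 4, 3, 1, 2, 2, 1, 1, 0, 0],
        "source_c": [3, 5, 2, 6, 3, 3, 2, 2, 1, 1, 0, 1, 1, 1],
    }
    totals = weekly_totals(daily)
    for week, (total, flagged) in enumerate(zip(totals, flag_weeks(totals, 10))):
        status = "above" if flagged else "at or below"
        print(f"week {week}: {total} fatalities, {status} threshold")
```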

Per George Box and Norman Draper, “Essentially, all models are wrong, but some are useful.” We did strive to develop a relatively simple, easy-to-use, and hopefully not-so-wrong but useful model. We hope our approach can underscore the importance of obtaining more accurate and diverse data on the key parameters for future, more advanced and accurate models. Please feel free to download the Excel file from our site and play with it. The site includes a formal (and long) article, its supplementary materials, guidelines for reopening from different states, the Excel file, and instructions on how to use it.

Today, I am happy to report that a more concise version of our paper will be published in the coming August issue of the super-popular Genetic Engineering & Biotechnology News (GEN), per the decision of its one-and-only Editor-in-Chief John Sterling. Finally, let me list all the members of our superb team, all my students, colleagues, and friends:

Alexander Huber (who first suggested using Monte Carlo), Gurkirat Singh Sekhon, Sritham Thyagaraju, Minghao Fu (these top four, Alex, Gurkirat, Sri, and Duke, spent gobs of time developing, modifying, and improving the model and addressing the numerous challenges that the data, the model, and I posed), Isaac M. Krasnopolsky, Andrea Davidovich (Isaac and Anya were totally instrumental in numerous aspects of our work, concluding the Magnificent Seven of our initial team), Marita Acheson, Dmitri Adler, Anthony M. Avellino, Philip A. Bernstein, Paul E. Buehrens, Patrick J. Boyd, Drexel DeFord, Yakov Grinberg, Rose Guerrero, Dawn Josephson, Raif Khassanov, Evelyne Kolker, Eugene Luskin, Aliona Rudys, Irine Vaiman, and Aleksandr Zhuk (Marita, Phil, Paul, Patrick, Rose, Dawn, Evelyne, and Alex, who superbly contributed to this work, and Dmitri, Tony, Yakov, Drex, Raif, Eugene, Aliona, and Irene, who significantly improved our team). Of course, we welcome actionable(!) feedback from the users of our Monte Carlo model. In my next (and, hopefully, much shorter) notes I hope to share some great news.

I am very grateful to have been a part of this team! As Gurkirat said, it was truly rewarding to work on a tool that is and will be used on a pressing problem that the world is currently facing.

Dawn Josephson

the Master Writing Coach

4 years ago

This is such important information that needs to get shared and implemented far and wide. Thank you and the team for your work on this.

Gurkirat Sekhon

Senior Product Analyst, ICE

4 years ago

This paper gives policymakers an excellent opportunity for data-driven decision making and can help democratize fatality forecasting, as well as the monitoring of outbreak trends, using a model that is easy to work with yet closely emulates reality. It was immensely rewarding at the academic, intellectual, and emotional levels to work on a pressing problem that the world is facing right now. More so because of the incredible team led by Eugene Kolker, PhD, that I got to work with, which was driven to bring the world a useful tool for dealing with the pandemic.

Duke (Minghao) Fu

Advanced Commercial Data & Consumer Behavior Analyst at Disney

4 years ago

Thank you for bringing me on board this fabulous team. It was a great time working with you and other team members!

Sritham Thyagaraju

NYU’20 | Business Intelligence Engineer III @ Applied Materials | Data Science, Analytics

4 years ago

Thank you for the Article, Professor. It was wonderful working with you.
