More and more academic and industry centers and research groups are offering forecasts for the pandemic. The Centers for Disease Control and Prevention features forecasts from nearly 40 such teams on its website. Our own research team, funded by the CDC, draws on these to assemble an aggregate “ensemble” forecast, also posted on the agency’s site, that combines these predictions to look four weeks into the future. The individual forecasts differ because they use different data sources and methodologies. The ensemble, by combining predictions made using disparate approaches, creates a forecast greater than the sum of its parts.
Currently the ensemble model predicts that there will be about 212,000 covid-19 deaths in the United States by Columbus Day weekend next month, with a 10 percent chance of there being fewer than 208,500 total deaths observed by then, and a 10 percent chance of more than 216,100. Overall, the collective forecast predicts that we will continue to see between 3,400 and 6,000 weekly deaths for the next month. (Deaths now stand at about 195,000, according to Johns Hopkins.)
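As a rough check on how these figures fit together, the short sketch below works through the arithmetic. The four-week horizon is an assumption taken from the ensemble's stated look-ahead, and the Johns Hopkins count may not match the baseline the forecast itself used, so the result is only approximate.

```python
# Figures quoted in the paragraph above.
current_deaths = 195_000   # Johns Hopkins count at time of writing
point_forecast = 212_000   # ensemble's central prediction
lower_10 = 208_500         # 10 percent chance of fewer deaths than this
upper_90 = 216_100         # 10 percent chance of more deaths than this

# Assumption: the forecast target is roughly four weeks away.
weeks_ahead = 4

# Average weekly deaths implied by the central prediction.
implied_weekly = (point_forecast - current_deaths) / weeks_ahead

# The two 10 percent tails leave an 80 percent chance in between.
interval_coverage = 1.0 - 0.10 - 0.10
interval_width = upper_90 - lower_10

# Does the implied weekly pace fall inside the quoted weekly range?
within_weekly_range = 3_400 <= implied_weekly <= 6_000

print(f"Implied average weekly deaths: {implied_weekly:,.0f}")
print(f"{interval_coverage:.0%} interval width: {interval_width:,} deaths")
print(f"Within the 3,400 to 6,000 weekly range: {within_weekly_range}")
```

Under these assumptions, the central prediction implies an average weekly toll that sits inside the range the ensemble gives for weekly deaths.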
Precisely how many weeks into the future coronavirus forecast models are reliable is an open scientific question, yet it’s one we’re tackling. Since 2019, our researchers — at the University of Massachusetts at Amherst and Johns Hopkins University — have worked together to study the accuracy of forecasts first of influenza and subsequently of the coronavirus. Our conclusion so far is that forecasts beyond four weeks are either so inaccurate, or have such a wide range of possible outcomes, that they are useless for personal decision-making or policymaking. Just as you wouldn’t use a prediction of rain six weeks out to determine plans for a trip to the beach, neither ordinary citizens nor lawmakers should be looking at pandemic forecasts past four weeks into the future as they make personal decisions (about taking a vacation, say), let alone policy decisions (such as opening schools).
It is important to distinguish between different types of outbreak models. “Forecast models” — the ones we aggregate on the CDC site — are typically designed with a single goal in mind: to make a specific, quantitative prediction about an event that will be observed in the future. If these models are reliable, they can be used to help decision-makers plan for staffing needs at health-care clinics, to inform vaccine manufacturers’ decisions about where to run vaccine trials and to guide officials on where to send additional diagnostic testing supplies.
Other infectious-disease “scenario models” are designed to explore multiple “what if” hypothetical futures. These models can answer questions such as: How would the case rates change if nursing home staffers were tested twice a week compared with once a week, all else remaining equal? In such a model, the total number of cases is less important than the changes from a baseline. They can help determine what countermeasures might be effective at controlling the spread of a disease; this type of long-term forecast is less problematic.
In the case of forecast models, however, accurate numbers, not just relative comparisons, are paramount. And in achieving the goal of maximum accuracy, research — involving forecasts for real-time outbreaks of influenza, dengue fever, Zika, Ebola and other diseases — has shown that aggregating multiple models is the right approach. We’re seeing that’s true for the coronavirus as well.
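To make the idea of aggregation concrete, here is a minimal sketch of one common way to combine forecasts: taking the median across models at each quantile level. The model names and numbers are invented for illustration, and this is not presented as the exact procedure behind the ensemble described here.

```python
from statistics import median

# Hypothetical four-week-ahead forecasts of cumulative deaths from three
# made-up models. Each maps a quantile level to a predicted value.
forecasts = {
    "model_a": {0.1: 207_000, 0.5: 211_000, 0.9: 215_000},
    "model_b": {0.1: 209_000, 0.5: 212_500, 0.9: 218_000},
    "model_c": {0.1: 208_500, 0.5: 212_000, 0.9: 216_000},
}


def median_ensemble(model_forecasts):
    """Combine forecasts by taking the median across models at each quantile."""
    # Assumes every model reports the same set of quantile levels.
    levels = sorted(next(iter(model_forecasts.values())))
    return {
        level: median(f[level] for f in model_forecasts.values())
        for level in levels
    }


ensemble = median_ensemble(forecasts)
for level, value in ensemble.items():
    print(f"quantile {level}: {value:,}")
```

Using the median rather than the mean keeps a single outlying model from dragging the combined forecast far from the others, which is one reason this kind of pooling tends to be robust.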
By now we should know that long-term coronavirus forecasts leave much to be desired. In early April, for example, the United States had seen fewer than 5,000 deaths from covid-19. It was not clear to many just how disruptive the pandemic would become. The University of Washington’s IHME model, the same one that made news last week, became avidly watched. Depending on the day you checked in April, IHME said the total number of deaths would reach 60,000 to 80,000 in the United States. Other models cited by the White House Coronavirus Task Force showed much larger numbers: up to 240,000 total deaths. These fluctuations and variations in the models led top decision-makers, including President Trump, to offer alternately optimistic and pessimistic responses. After the IHME model’s number fell at one point, Surgeon General Jerome Adams said he “absolutely” expected the death total to be lower than anticipated. After the 240,000 figure came to light in late March, Trump said the United States would be in for “a hell of a bad two weeks,” dropping his optimistic outlook, however briefly. Notably, the early IHME forecasts also said that essentially no deaths would occur after early June, failing to capture the protracted nature of the crisis.
Earlier, a “what if?” modeling study from Imperial College London, released on March 16, suggested that the pandemic would be likely to cause more than 2 million deaths in the United States if zero preventive measures were taken. (Trump has declared his efforts to be triumphant in part because we are not on that trajectory.) Decision-makers and the general public were not equipped to effectively interpret these widely disparate results from different kinds of models.
The aggregate forecasts posted by the CDC each week are useful, especially when capped at four weeks out, in offering a sense of whether public policy is succeeding or failing at curbing the coronavirus. Policymakers can use the data to help them decide whether to gently continue reopening or to tap the brakes, banning some activities.
Longer-term projections from hypothetical-scenario models also serve a purpose: They can offer guidance about which policy tools might be most effective. When talking about hypotheticals, stretching the timeline by a few weeks or months can make sense.
The problem comes when the public, press and policymakers interpret predictive models as saying something definitive about what the world will look like in December or January. The pandemic is shaped by human behavior, and too much can change between now and then to be knowable. Problems also ensue when hypothetical-scenario forecasts are confused with predictive forecasts.
Educating people about the benefits and limitations of coronavirus forecasts can help us fight the pandemic. If a scenario model tells us that we’ll see a rise in cases if we open bars and gyms, we should pay heed. If a pooled set of forecast models shows that viral spread will probably decline in the next two weeks, then we can be somewhat confident about our short-term public policy. But let’s not scare ourselves with predictions about what’s going to happen in three months — because nobody knows.