COVID-19 infection growth rates, lagged mortality rates, and other interesting statistics

Update 3/11/2020: I’ve posted a follow up that uses the background laid out in this article to create a statistical model that estimates the future number of coronavirus cases in the United States.
Update 3/18/2020: I’ve added a second follow up using more recent data to project growth globally.
Update 4/10/2020: Recent data indicates that the US is on a tragic path toward experiencing more infections and mortalities from coronavirus than any other developed country on the planet.

The Center for Systems Science and Engineering at Johns Hopkins University recently released a useful dashboard that tracks COVID-19’s spread around the world. The service is built using data from multiple sources, including the WHO, CDC, ECDC, NHC, and DXY. Thankfully, they’ve also made that data publicly available on GitHub. This analysis dives into that data to see what we can learn about the virus’s growth and development.

But first, let’s be clear — if you’re looking for the latest guidance and news on the novel coronavirus, please refer to the Centers for Disease Control or your national public health institute. The following is a moment-in-time analysis of the coronavirus as of March 4th, 2020. This research is provided to the public strictly for educational and academic research purposes. More than anything, I wanted to look at the data myself rather than relying on the too-often hyperbolic news cycle. I also wanted to see how the mortality rate differs between countries, and if I could use some simple math to estimate the total number of unconfirmed cases of COVID-19 in the United States.


So where are we?

At the time of this writing, there are just over 95k confirmed COVID-19 cases globally, with 153 here in the United States. Over 51k people have recovered from the illness, and about 40k are in treatment or otherwise have an existing but unknown case status, according to the data shared by Johns Hopkins.

The WHO Director-General reports that the latest global mortality rate is 3.4 percent of reported cases. It’s not clear exactly how they’re measuring mortality rates, but it sounds like they’re taking the crude approach of dividing the number of deaths by the total number of confirmed cases. Doing just that corroborates the 3.4 percent mortality rate reported by the WHO.

COVID-19 Global Summary Statistics

Mortality Rate:       3.42%
Already Recovered:   53.79%
Still in Treatment:  42.78%

I find it rather interesting to see how that value has changed over time. If you’re not frequently dealing in the world of probability, it’s probably weird to think about a mortality rate changing — that’s largely due to how it’s being measured and how many individuals are being tested and confirmed as having the virus. The mortality rate for COVID-19 may be lower if there are many people with mild symptoms who are just staying home and not getting tested, or if certain segments of the population haven’t had the opportunity to get tested — we’ll see how this plays out further below.

[Figure: global crude mortality rate over time]

If we think about the lagged effects of infection, that is, the time and steps between getting infected, showing symptoms, getting treatment, and the eventual health outcome — it’s reasonable to wonder if building in some type of lag to the denominator is a better way of approximating mortality rates.

To do this properly we would need to have the [anonymized] underlying data about specific individuals to track start and end dates, which unfortunately we don’t have. But if we improvise by lagging the denominator to evaluate the death rate by the number of people that have been confirmed in the previous 1–7 days, then we may get a decent enough approximation of what that lagged mortality rate looks like.
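As a rough sketch of that improvisation, assuming cumulative daily series of confirmed cases and deaths, one reading is to divide deaths on day t by the cumulative confirmed count from `lag` days earlier. The function name `lagged_mortality_rate` and the sample numbers are illustrative assumptions; they are not taken from the Johns Hopkins data, and this may not match the exact calculation behind the plot.

```python
from typing import List, Optional


def lagged_mortality_rate(
    confirmed: List[int], deaths: List[int], lag: int
) -> List[Optional[float]]:
    """Deaths on day t divided by cumulative confirmed cases on day t - lag.

    Both inputs are cumulative daily counts of equal length. Days without a
    lagged denominator (the first `lag` days) or with a zero denominator
    yield None.
    """
    if len(confirmed) != len(deaths):
        raise ValueError("series must have the same length")
    if lag < 0:
        raise ValueError("lag must be non-negative")
    rates: List[Optional[float]] = []
    for t in range(len(deaths)):
        if t < lag or confirmed[t - lag] == 0:
            rates.append(None)
        else:
            rates.append(deaths[t] / confirmed[t - lag])
    return rates


# Made-up cumulative counts, for illustration only (not real data).
confirmed = [100, 200, 400, 800, 1000]
deaths = [2, 4, 10, 20, 30]

crude = lagged_mortality_rate(confirmed, deaths, lag=0)  # same-day denominator
lag_2 = lagged_mortality_rate(confirmed, deaths, lag=2)  # denominator from 2 days earlier
```

While confirmed cases are still growing, a longer lag means a smaller denominator and therefore a higher rate, so lagged rates sit above the crude rate early in an outbreak.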

[Figure: lagged mortality rates over time, with the denominator lagged by 1 to 7 days]

The plot above suggests that the effective mortality rates were considerably higher when the disease started getting tracked, likely fueling anxiety around the world. Thankfully, however, the lagged mortality rates have started to converge as time passes. Hopefully, it stays that way.

Either way, the problem with a global mortality rate is that it masks differences in how each country tackles the disease, particularly in testing and containment, as well as the general population’s access to health services. If we rank countries by crude mortality rate, we see that as of today the United States has the highest, at 7.19%. This is likely because the CDC restricted testing and not enough people have been tested for the virus, though to their credit they’ve since lifted those restrictions. The figure is concerning regardless, particularly given the state of healthcare in the United States.

[Figure: crude mortality rate by country]

From what I’ve seen, I expect confirmed cases of coronavirus in the US to increase considerably with expanded testing. That would bring the mortality rate down, but unfortunately only because more people would be known to have the virus. In fact, if we apply the global average mortality rate of 3.4 percent to the 11 US deaths to date, we get an estimated 324 COVID-19 cases in the US, of which only 153 have been confirmed, leaving an additional 171 unconfirmed cases in the general population. Sadly, this approach implies that as deaths increase, so will the estimated number of unconfirmed cases.

How to calculate these values:

0.0719 = 11/153, where 0.0719 is the current mortality rate in the US, with 11 deaths and 153 confirmed cases.
0.034 = 11/x, substituting 0.0719 with 0.034, the global average mortality rate, where x is the unknown number of cases.
x = 11/0.034 ≈ 324

Author's note: The above values are purely speculative estimations using simple mathematical modeling and are not confirmed by the CDC nor any national public authority.
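The same arithmetic can be written as a few lines of Python. This is only a sketch of the back-of-the-envelope estimate above; the function and variable names are made up for illustration, and it assumes the global crude rate applies to the US, which is the article's simplifying assumption rather than an established fact.

```python
def estimate_total_cases(deaths: int, assumed_mortality_rate: float) -> int:
    """Back out a case count from deaths and an assumed crude mortality rate."""
    if assumed_mortality_rate <= 0:
        raise ValueError("mortality rate must be positive")
    return round(deaths / assumed_mortality_rate)


us_deaths = 11        # from the article, as of March 4th, 2020
us_confirmed = 153    # from the article
global_rate = 0.034   # WHO-reported global crude mortality rate

us_crude_rate = us_deaths / us_confirmed                        # about 0.0719
estimated_cases = estimate_total_cases(us_deaths, global_rate)  # 324
estimated_unconfirmed = estimated_cases - us_confirmed          # 171
```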

When the CDC says that they “expect more cases to be detected across the country”, it’s probably because someone at the CDC has done this type of modeling to estimate the possible number of unknown cases in the general population. That said, I hope their version is more rigorous and precise than the simple math above.

I would also note that many countries haven’t had a single death from the virus. South Korea, which has the second-highest number of confirmed cases (5,621) after China, has the lowest crude mortality rate of all the countries with at least one death. This is likely a function of more people in South Korea with mild symptoms getting tested and confirmed, which may in turn be a result of the country’s free universal healthcare. We don’t have the contextual data or the experimental design needed to claim causality at this stage, of course, but it’s an interesting observation nonetheless.

Individual Country Mortality Rates

Country           Rate    Cases   Deaths
US                7.19%     153       11
Iraq              5.71%      35        2
Australia         3.85%      52        2
Mainland China    3.71%   80271     2981
Italy             3.46%    3089      107
Iran              3.15%    2922       92
Thailand          2.33%      43        1
Japan             1.81%     331        6
France            1.40%     285        4
Spain             0.90%     222        2
South Korea       0.62%    5621       35
Note: The Philippines with 3 confirmed cases and one confirmed death is an outlier and excluded from the table and chart above.
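For anyone who wants to reproduce the ranking, here is a minimal sketch that recomputes the crude rate (deaths divided by confirmed cases) from the counts in the table. The dictionary layout and function name are illustrative choices, not the code used to produce the original table.

```python
# (confirmed cases, deaths) per country, copied from the table above.
counts = {
    "US": (153, 11),
    "Iraq": (35, 2),
    "Australia": (52, 2),
    "Mainland China": (80271, 2981),
    "Italy": (3089, 107),
    "Iran": (2922, 92),
    "Thailand": (43, 1),
    "Japan": (331, 6),
    "France": (285, 4),
    "Spain": (222, 2),
    "South Korea": (5621, 35),
}


def crude_mortality_rate(cases: int, deaths: int) -> float:
    """Deaths divided by confirmed cases, with no lag or other adjustment."""
    return deaths / cases


# Rank countries from highest to lowest crude mortality rate.
ranked = sorted(
    ((country, crude_mortality_rate(cases, deaths))
     for country, (cases, deaths) in counts.items()),
    key=lambda item: item[1],
    reverse=True,
)

for country, rate in ranked:
    print(f"{country:<16}{rate:.2%}")
```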

No longer in a low-growth world

The growth of COVID-19 is staggering. From January 22nd to the time of this writing, the median daily growth rate in confirmed cases is 5.6 percent. More startling is the median daily growth in deaths, at 9.3 percent. That said, it’s comforting to see that the median recovery growth rate of 16.2 percent is higher than the new case and mortality growth rates combined.

COVID-19 Median Daily Growth Rates

Case Growth Rate            5.6%
In Treatment Growth Rate    4.1%
Recovery Growth Rate       16.2%
Mortality Growth Rate       9.3%
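A median daily growth rate like the ones above can be computed roughly as follows. This sketch assumes growth is measured as the day-over-day percentage change in a cumulative count, which is my reading of the figures rather than a confirmed description of the original calculation. The sample series is made up for illustration.

```python
from statistics import median
from typing import List


def daily_growth_rates(cumulative: List[int]) -> List[float]:
    """Day-over-day fractional change of a cumulative series.

    Days that follow a zero count are skipped to avoid dividing by zero.
    """
    return [
        (today - yesterday) / yesterday
        for yesterday, today in zip(cumulative, cumulative[1:])
        if yesterday > 0
    ]


# Made-up cumulative case counts, for illustration only (not real data).
cases = [100, 110, 121, 145, 150]

rates = daily_growth_rates(cases)
median_rate = median(rates)
print(f"Median daily growth rate: {median_rate:.1%}")
```

The median is less sensitive than the mean to one-off jumps, such as a day when a reporting backlog is cleared.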

When we think about these values, what should we be looking for? Intuitively, the spread of the disease is being contained when the in-treatment population recovers much faster than new cases appear. In other words, we want to see increases in the green line and decreases in the orange/yellow lines below.

[Figure: daily growth rates over time]
Dotted horizontal lines are medians.

It’s impossible to predict how this may unfold.

Stay safe out there.

The analysis provided in this article is strictly for educational and academic research purposes. The work relies upon multiple publicly available data sources aggregated by the Center for Systems Science and Engineering at Johns Hopkins University, which they indicate “do not always agree”. The author hereby disclaims any and all representations and warranties with respect to the analysis, including accuracy, fitness for use, and merchantability. Please refer to the Centers for Disease Control and your local health experts for the latest guidance and updates on COVID-19.

Written by

Founder, CEO and Chief Scientist @ Invariant Studios
