The Actual Highest Number of Daily COVID-19 Cases in the US Is Estimated to Be About 400,000 in April

The 8.3 million COVID infections reported in the US as of October 21, 2020, are the tip of an iceberg of an estimated 34.7 million total infections. In the UK, 0.8 million cases have been confirmed out of an estimated 7.0 million total infections, a figure that includes people who were infected but never tested.

Jungsik Noh
Oct 27, 2020
Source: https://github.com/JungsikNoh/COVID19_Estimated-Size-of-Infectious-Population

According to the COVID-19 data repository of Johns Hopkins University, the COVID-19 pandemic had caused 1.1 million deaths worldwide out of 41.2 million confirmed infections as of October 21, 2020. However, we know that many people have been infected without ever being confirmed by a diagnostic test. Such cases are called undocumented infections. Here, I explain how undocumented infections confuse the interpretation of the daily counts of reported COVID cases, and I present my estimates of actual daily cases.

Nationwide Antibody Testing in the US

In July 2020, researchers at Stanford University collected blood samples from 28,503 patients receiving dialysis across the US and performed antibody testing to identify people who had previously been infected with the coronavirus (SARS-CoV-2). This large-scale survey gave a realistic picture of how far the virus had spread in the US, including undocumented infections. The study, published in The Lancet, reported, “during the first wave of the COVID-19 pandemic, fewer than 10% of the US adult population formed antibodies against SARS-CoV-2, and fewer than 10% of those with antibodies were diagnosed.” In other words, if we apply the seroprevalence rate of 9.3% to the whole US population (328.2 million), total actual infections through July are estimated at 30.5 million, far more than the 3.5 million cases reported as of July 15, 2020.
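The extrapolation above is a single multiplication, sketched here in Python. The function and constant names are illustrative only, and the sketch inherits the paragraph's simplifying assumption that a seroprevalence measured in dialysis patients can be applied uniformly to the whole population.

```python
# Figures quoted in the paragraph above.
US_POPULATION = 328.2e6   # total US population
SEROPREVALENCE = 0.093    # share of people with antibodies (9.3%)


def extrapolate_infections(seroprevalence, population):
    """Scale a measured seroprevalence up to a whole population."""
    return seroprevalence * population


estimated = extrapolate_infections(SEROPREVALENCE, US_POPULATION)
print(f"{estimated / 1e6:.1f} million")  # 30.5 million
```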

The study is one of many pieces of empirical evidence that many countries have missed a considerable number of COVID infections. Now let’s take a computational approach and look at COVID case data.

Daily Reported COVID-19 Cases

The following chart shows daily reported cases and deaths in the US since March (the left and right Y-axes are for cases and deaths, respectively). The numbers are 7-day rolling averages, which removes weekend effects. Let’s compare the numbers of new cases and deaths early and later in the pandemic. Around April 13, about 30,000 cases were reported daily. To roughly gauge the relation between reported cases and deaths, we can look at the daily deaths three weeks later, which were 1,879 on May 4. Since we don’t intend to compute case-fatality rates accurately but only to compare such ratios at different times, the three-week lag could be replaced by any other reasonable delay between the report of an infection and the report of a death. We can then roughly relate 16 cases on April 13 to one death three weeks later.

Source: https://github.com/JungsikNoh/COVID19_Estimated-Size-of-Infectious-Population
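The lagged case-to-death ratio described above can be sketched in a few lines of Python. The helper name `cases_per_death` and the dictionary layout are assumptions made for illustration; the two numbers are the 7-day averages quoted in the text.

```python
from datetime import date, timedelta

# Assumed delay between a reported case and a reported death.
LAG = timedelta(weeks=3)


def cases_per_death(daily_cases, daily_deaths, day, lag=LAG):
    """Reported cases on `day` divided by reported deaths `lag` later."""
    return daily_cases[day] / daily_deaths[day + lag]


# 7-day averages quoted in the article for the US.
cases = {date(2020, 4, 13): 30_000}
deaths = {date(2020, 5, 4): 1_879}

ratio = cases_per_death(cases, deaths, date(2020, 4, 13))
print(round(ratio))  # 16
```

Changing `LAG` to two or four weeks changes the individual ratios, but the comparison between dates is what matters here, as long as the same lag is used throughout.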

Doing the same calculation, 57 reported cases on September 28 can be related to one death three weeks later, far fewer deaths per case than on April 13. I want to point out the dramatic difference between the case-to-death ratios in April and September. On September 28, 3.6 (= 57/16) times more cases were related to one death. Why? Was the virus 3.6 times deadlier in April? Not likely. Is it medical treatment? Since no therapeutic had been proven to reduce COVID-19 deaths, the 3.6-fold difference is unlikely to be due to medical improvement. It is plausible, however, that there were at least a few times more undocumented infections in April than in September because of the low testing capacity early in the pandemic. Let me provide further evidence.

The next chart shows the same data for the UK, where the difference between the case-to-death ratios in April and September is even more dramatic. In September, 6.7 (= 47/7) times more reported cases were related to one death than in April. I argue that this is mainly because there were far more undocumented infections in the UK in April than in September. Now let me show two other countries with different patterns.

Source: https://github.com/JungsikNoh/COVID19_Estimated-Size-of-Infectious-Population

The third chart shows the same daily cases and deaths in Israel. Strikingly, 96 reported cases on April 13 were related to one death three weeks later. First, recall that the reported cases related to one death on September 28 in the US and the UK were 57 and 47, respectively. It is unlikely that the medical capability for COVID patients in Israel in April was twice as good as in the US and the UK in September. It is possible, however, that Israel ran far more tests and detected infections more effectively in April than the US and the UK did in September. Other factors, such as the age composition of the population, the weather, or cultural differences across countries, would not be able to explain this. Second, the 96 cases related to one death in April became 173 cases per death in September. Again, this increase is likely the result of an even higher COVID testing capacity in Israel in September. Let’s look at one last example, a country that seems to have done massive COVID diagnostic testing.

Source: https://github.com/JungsikNoh/COVID19_Estimated-Size-of-Infectious-Population

The last example is Qatar, which previously experienced the Middle East respiratory syndrome (MERS) coronavirus. The same calculation relates 531 reported cases on May 11 to one death three weeks later. Later, 407 cases on September 28 were related to one death. It is unclear why the ratios are so high compared to the other three countries. But in this extreme example, we can see that the ratio of reported cases to deaths doesn’t change much over time when testing capacity has been high enough from the beginning.

Source: https://github.com/JungsikNoh/COVID19_Estimated-Size-of-Infectious-Population
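To put the four countries side by side, the short Python sketch below collects the case-to-death ratios quoted in the text and computes how much each changed between the early and late dates. The table layout and the name `fold_change` are illustrative choices, not part of the original analysis.

```python
# Reported cases related to one death three weeks later, as quoted above.
# Early date is April 13, except Qatar (May 11); late date is September 28.
CASES_PER_DEATH = {
    "US": (16, 57),
    "UK": (7, 47),
    "Israel": (96, 173),
    "Qatar": (531, 407),
}


def fold_change(early, late):
    """How many times more reported cases relate to one death later on."""
    return late / early


for country, (early, late) in CASES_PER_DEATH.items():
    change = fold_change(early, late)
    print(f"{country:<7} {early:>4} -> {late:>4}  x{change:.1f}")
```

The US and UK ratios grow by factors of about 3.6 and 6.7, Israel’s by about 1.8, and Qatar’s stays close to 1 (about 0.8), which is the pattern the argument relies on.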

The COVID data of Israel and Qatar suggest that the daily reported cases in the US have been the tip of the iceberg. Based on the comparison of the reported cases to deaths in April and September, it is likely that a higher portion of actual infections was detected in September than in April, as the testing capacity in the US has increased.

The ratio of identified cases to actual cases of an epidemic disease is called an ascertainment rate. I argue that because the ascertainment rates early in the pandemic were very low compared to recent ascertainment rates in many countries, we cannot directly compare daily reported cases back in April with the recently reported counts.

Given that COVID cases have been under-reported and that the ascertainment rates themselves have changed over time, the reported case counts carry limited information about the spread of the virus. The reported counts are useful only for reading an increasing or decreasing trend over a few weeks or a month, because over such a short period the ascertainment rate would not change much, so the trend in reported cases matches the trend in actual cases. To get a real picture of the spread of the virus over the whole period of the pandemic, we have to account for under-ascertainment and undocumented infections.

Infection-Fatality-Rate

Knowing how many COVID infections cause one death on average is useful for estimating the actual number of infections, under the assumption that the reported numbers of deaths are accurate. Although one study estimated 28% more deaths due to COVID-19 than were reported from March to May in the US, the reported death counts should be more reliable than the reported case counts, which could be only 10% of the actual counts.

The infection-fatality-rate of COVID-19 has been extensively studied, yet researchers haven’t reached a consensus estimate. Still, many studies have estimated country-level infection-fatality-rates to be around 1% (a Nature News article). In particular, a study published in The Lancet reported, “our estimated overall infection fatality ratio for China was 0.66% (0.39–1.33), with an increasing profile with age.” In other words, the estimate suggests that, on average, 152 (75–256, 95% confidence interval) COVID infections would cause one death in a country. The wide interval of estimation uncertainty is expected to cover the true infection-to-death ratios of countries with different age distributions and medical capacities. The study also reported estimates of the mean duration from symptom onset to death and to recovery (17.8 and 24.7 days, respectively) based on individually tracked data.
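The conversion from an infection-fatality ratio to “infections per death” is a simple inversion, sketched below in Python with the figures quoted from The Lancet. The names are illustrative. Note that the interval flips: the upper bound of the fatality ratio gives the lower bound of infections per death.

```python
# Infection-fatality ratio in percent, as quoted above: 0.66% (0.39-1.33).
IFR_PERCENT = {"estimate": 0.66, "lower": 0.39, "upper": 1.33}


def infections_per_death(ifr_percent):
    """Average number of infections corresponding to one death."""
    if ifr_percent <= 0:
        raise ValueError("the fatality ratio must be positive")
    return 100.0 / ifr_percent


central = round(infections_per_death(IFR_PERCENT["estimate"]))
# A higher fatality ratio means fewer infections per death, and vice versa.
fewest = round(infections_per_death(IFR_PERCENT["upper"]))
most = round(infections_per_death(IFR_PERCENT["lower"]))
print(central, fewest, most)  # 152 75 256
```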

Under-reporting Adjusted Number of Daily COVID-19 Cases

Using the above epidemiological characteristics of COVID-19 and the daily reported cases and deaths, I estimated the time courses of actual daily COVID-19 cases in countries and US states. I developed a machine learning framework that finds the most plausible curve of daily ascertainment rates and the time course of actual infections. The data science pipeline also estimates how many individuals are currently infected in countries and US states (a preprint posted at medrxiv.org). An online repository presents COVID-19 data visualization and daily updated estimates of actual cases and currently infected cases.

Based on the framework, daily ascertainment rates in the US in March and April are estimated to be ~5–10%, suggesting that, unfortunately, the US had ~10–20 times more COVID infections than reported during March and April. In September and October, the ascertainment rates are estimated to be about 40%, showing that more than half of actual infections are still not laboratory-confirmed. The concerning part is that the daily ascertainment rates seem to have stayed around 40% since July, rather than continuing to increase.
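The step from an ascertainment rate to “how many times more infections than reported” is a reciprocal. The following Python sketch, with illustrative function names, reproduces the figures in the paragraph above; it is arithmetic on the quoted rates, not the estimation framework itself.

```python
def actual_per_reported(rate):
    """Actual infections per reported case at a given ascertainment rate."""
    if not 0 < rate <= 1:
        raise ValueError("the ascertainment rate must be in (0, 1]")
    return 1.0 / rate


def undetected_share(rate):
    """Share of actual infections that were never confirmed."""
    return 1.0 - rate


# Rates quoted above: ~5-10% in March and April, ~40% in September and October.
for rate in (0.05, 0.10, 0.40):
    print(f"{rate:.0%}: x{actual_per_reported(rate):.1f} actual per reported, "
          f"{undetected_share(rate):.0%} undetected")
```

At 5% and 10% the multipliers are 20 and 10, and at 40% the multiplier is 2.5 with 60% of infections undetected.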

The next chart presents the under-ascertainment-adjusted actual counts of daily new cases in the US. Due to the low testing capacity early in the pandemic, the actual highest number of daily cases is estimated to be about 400,000 during April 6–13. On October 21, ~60,000 confirmed COVID cases (7-day average) were reported. After under-reporting adjustment, actual daily new cases on October 21 are estimated to be 150,473 with large estimation uncertainty (74,671–254,457, 95% confidence interval).

Source: https://github.com/JungsikNoh/COVID19_Estimated-Size-of-Infectious-Population
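As a consistency check on the October 21 figures, the Python sketch below divides the reported count by the estimated actual count to recover the implied ascertainment rate. It treats the reported count of ~60,000 as exact, which is an assumption made only for illustration, and the names are hypothetical.

```python
# October 21 figures quoted above (7-day average for reported cases).
REPORTED = 60_000
ESTIMATED = 150_473
ESTIMATED_LOW, ESTIMATED_HIGH = 74_671, 254_457


def ascertainment_rate(reported, actual):
    """Share of actual infections that were reported."""
    return reported / actual


point = ascertainment_rate(REPORTED, ESTIMATED)
# A larger actual count implies a smaller ascertainment rate.
low = ascertainment_rate(REPORTED, ESTIMATED_HIGH)
high = ascertainment_rate(REPORTED, ESTIMATED_LOW)
print(f"{point:.0%} ({low:.0%} to {high:.0%})")  # 40% (24% to 80%)
```

The point value agrees with the ascertainment rate of about 40% estimated for September and October.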

As for the other countries discussed above, the daily ascertainment rates in the UK are estimated to be ~4% in April and ~60% in September. The ascertainment rates in Israel are estimated to be ~45% in April and ~77% in September.

The final charts present the under-reporting-adjusted total COVID cases (as a percentage of the population) for the 50 most infected countries and the US states. Antibody testing estimated that 9.3% of the US adult population had been infected by July; my estimates suggest that several US states had experienced >15% cumulative incidence as of October 21, 2020. For more charts and daily updated estimates, please visit my GitHub repository. A preprint manuscript presents detailed data and methods.

Source: https://github.com/JungsikNoh/COVID19_Estimated-Size-of-Infectious-Population
Source: https://github.com/JungsikNoh/COVID19_Estimated-Size-of-Infectious-Population

A tremendous number of people have died from COVID-19. Future infections and deaths are coming. Case counts and death tolls vary hugely across countries and US states, which tells us that the numbers of COVID infections and deaths are entirely a function of our collective response to the spread of the virus.

Jungsik Noh

computational biologist, statistician, time series analyst.