
What-if Models (COVID-19): Models Explained

As a data and business analyst/scientist (what’s my title this week?), a geek, and a caring citizen of the world, I felt compelled to track the spread of the virus and its impact on the community and the world, and to try to make some sense of it all, specifically in the interest of keeping ourselves and our loved ones prepared.

In some of my earlier blog posts, I shared some trends of the spread, and I hope to continue tracking its progress globally as much as my time allows. However, I was intrigued to find out whether, with the data on hand, we could simulate a scenario: one in which just one infected person goes about his daily life without any restrictions or awareness. What impact would he have on the community, the state, and the world on this ever-connected Earth?

This is my first attempt at constructing that model. It is based only on publicly available data from official sources: no propaganda, no fake data. The published data is disputed, and things are fluid and dynamic at this time, but I’m going to use the more realistic and conservative figures whenever possible. For each decision, I explain and justify my selection. I obviously cannot say for sure that this is correct, nor do I claim it is; this is just a personal experiment to get a sense of a situation that has hit home harder than anything before in my lifetime! Take it with a grain of salt, and always refer to official notifications, guidance, and resources for your own purposes.

Okay, let’s set up the scenario. 

A person with a typical lifestyle and in typical health, infected with COVID-19, goes about his daily life.

My key queries are:

  • How many people could he infect?
  • In what span of time?
  • How many of the people he infected will infect others?
  • How many people will be infected in total over a given time-span?
  • How many of them will die?
  • What impact (spread) will the original interaction have on the community, the principality?
  • And in what time-span?

In order to answer any of these, I had to gather some data: not imaginary data, but real-world data. Some of it is trickling down to the public daily (sometimes hourly), and I will have to start with that.


To learn how the virus affects people by age and by region, and to see the trend to date, please see my earlier blog posts. Here, I will focus on my hypothetical models.

Still here? Good…let’s begin. But first, I have to explain the models envisioned and why. I will start with 3 models, but each of them is easily scalable to any changing parameter as the situation requires. Then I’ll explain the parameters used for each model.


Explanation of the models & assumptions

Model 1 – Likely scenario: Interactions per person per day=22*. Interaction up to 3rd degree. Share of new interactions day over day: 2%. Health assumption: Average health. Incubation period: 10 days. Death rate: 3.75% (worldwide average). Infection rate=50% (chance of catching it is 50-50). Starting population=1,000,000 **

Model 2 – Best-case scenario: Interactions per person per day=22*. Interaction up to 3rd degree. Share of new interactions day over day: 2%. Health assumption: Very healthy. Incubation period: 10 days. Death rate: 1.4% (lowest end). Infection rate=50% (chance of catching it is 50-50). Starting population=1,000,000 **

Model 3 – Worst-case scenario: Interactions per person per day=22*. Interaction up to 3rd degree. Share of new interactions day over day: 2%. Health assumption: Unhealthy. Incubation period: 10 days. Death rate: 13% (highest end). Infection rate=50% (chance of catching it is 50-50). Starting population=1,000,000 **
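For readers who prefer code to prose, the three parameter sets above can be written down as a small Python structure. This is only a sketch: the class name `ModelParams` and its field names are made up for illustration and do not come from the Excel file; the values are the ones listed above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelParams:
    """One what-if scenario. Field names are illustrative; values are from the post."""
    name: str
    death_rate: float                  # fraction of infected people who die
    interactions_per_day: int = 22     # contacts per person per day
    degrees: int = 3                   # depth of the contact chain counted per day
    new_contact_share: float = 0.02    # share of daily contacts that are new people
    incubation_days: int = 10          # days before the death rate is applied
    infection_rate: float = 0.50       # chance that a contact catches the virus
    population: int = 1_000_000        # starting population


# The three scenarios differ only in the death rate.
MODELS = {
    "likely": ModelParams("Model 1 - Likely", death_rate=0.0375),
    "best": ModelParams("Model 2 - Best case", death_rate=0.014),
    "worst": ModelParams("Model 3 - Worst case", death_rate=0.13),
}

for params in MODELS.values():
    print(f"{params.name}: death rate {params.death_rate:.2%}")
```

Keeping the shared values as defaults makes it obvious that only the death rate changes between scenarios.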

Fatality rates vary by model. Deaths from other causes are not modeled, and births that would increase the population during the time period are not included.

All models assume normal healthcare as it has been provided, with no change in capacity. They also assume no vaccine and no mitigation, including but not limited to social distancing, closure of bars, restaurants, and events, travel restrictions, and limits on gatherings as currently mandated.

After the number of infected people reaches about 100% of the total population, the death rate is halved. This is because many of the remaining people will have built immunity over time, and because, based on past virus cycles, outbreaks tend to break down after seasonal changes (see my notes in the Excel file). Again, this is a conservative number.
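The halving rule can be sketched as a small function. The function name and the `saturation` threshold parameter are assumptions made for this example; the post only says "about 100%", so the default of 1.0 is one possible reading.

```python
def effective_death_rate(base_rate: float, infected: float, population: float,
                         saturation: float = 1.0) -> float:
    """Return base_rate, halved once infected/population reaches `saturation`.

    `saturation` is a hypothetical knob standing in for "about 100%".
    """
    if population <= 0:
        raise ValueError("population must be positive")
    if infected / population >= saturation:
        return base_rate / 2
    return base_rate


# Model 1 death rate before and after the whole population has been infected.
print(effective_death_rate(0.0375, 500_000, 1_000_000))
print(effective_death_rate(0.0375, 1_000_000, 1_000_000))
```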

* The number of interactions on a normal day includes everyone a person comes in contact with from waking up (at home) through transportation, gas stations, lunch, vending machines, the cafeteria, interactions at work, public restrooms, drive-through orders, etc., and back home before going to sleep. Each of those people in turn has a similar number of interactions, and each of their contacts does again. This grows exponentially, and I stop the interactions for the day after three degrees to be conservative. The next day, a typical person will also meet the assumed number of people; however, the vast majority will be the same people, except for a few new ones (e.g. a new person in a meeting, a new person in the restroom, a new cashier, new people waiting at a bus stop, etc.). I conservatively set that share to only 2%, meaning 98% of the people a person interacts with on a daily basis will be the same people. I use this metric in calculating the number of newly infected people day over day (i.e. re-infecting the same person is NOT counted, only new, previously uninfected people).
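To get a feel for the 2% rule, here is a rough sketch of how many different people one person meets over several days. It assumes that day 1 contributes all 22 contacts and that every later day adds only the 2% who are new; the function name and this exact formula are illustrative and may not match how the Excel file applies the rule.

```python
def distinct_contacts(days: int, daily_contacts: float = 22,
                      new_share: float = 0.02) -> float:
    """Cumulative number of different people met after `days` days.

    Day 1 counts every contact; each later day adds only the `new_share`
    fraction of contacts not met before (an assumed reading of the 2% rule).
    """
    if days < 1:
        return 0.0
    return daily_contacts + (days - 1) * daily_contacts * new_share


for day in (1, 2, 10, 30):
    print(day, round(distinct_contacts(day), 2))
```

Under this reading the circle of direct contacts grows slowly, by less than half a person per day, which is why the same-day chain of second- and third-degree contacts matters so much more.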

** For comparison, here are the populations of selected cities, countries, and states:

Seattle (city) 704,352
Dallas (city) 1,317,929
San Francisco (city) 883,305
Boston (city) 694,583
Minneapolis (city) 413,651
Atlanta (city) 486,290
Miami (city) 463,347
Bahamas (country) 395,361
Alaska (entire state) 739,795
New Orleans (city) 391,495
Vancouver, BC (city) 631,486
Venice, Italy (city) 260,897


Additional notes & sources for justification of parameters

Interaction up to 3rd degree: One person meets about 20 to 25 people per day. I’ll use the rounded-down average, 22, to be conservative. Each of those people also meets another 22 people, and each of them interacts with others at the same rate. I stop there, at 3 degrees, because I want to be conservative and feel that within a 24-hour period that is realistic and not overblown. My model allows this to be changed to other degrees, but I’ll use 3 as the default. I limit it to 3 degrees within 24 hours because I want to track the spread of the virus on a daily basis. [1]
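The size of a three-degree contact chain follows from simple powers of 22. The sketch below assumes no overlap between anyone's contacts, so it is an upper bound; the function name is made up for illustration.

```python
def contacts_by_degree(per_day: int = 22, degrees: int = 3) -> list:
    """People reached at each degree, assuming nobody's contacts overlap."""
    return [per_day ** k for k in range(1, degrees + 1)]


reach = contacts_by_degree()
print(reach, sum(reach))  # [22, 484, 10648] 11154

# Going one degree further shows how quickly the chain grows.
print(sum(contacts_by_degree(degrees=4)))  # 245410
```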

Incubation period: The incubation period ranges from 1 to 14 days; the most common value (mode) is 5 days. The timing of outcomes in the model depends on this period. Within 1-14 days of being hospitalized with a confirmed COVID-19 case, the patient will either recover or die. I am taking the optimistic approach that the person lives at least 10 days after hospitalization before death is even considered, after which I apply the death/survival rate as explained. This is an important (and probably generous) part of the model: instead of calculating deaths each day and reducing the population by that much, I reduce the population only every 10 days, to account for incubation plus recovery or death. [2]
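The "reduce the population only every 10 days" idea can be sketched as a loop. This is an assumed reading, not the spreadsheet's actual formula: new infections pile up day by day, and the death rate is applied to the accumulated batch on every 10th day. The function name and the flat 100-cases-per-day input are invented for the example.

```python
def population_with_batched_deaths(population, daily_new_infections,
                                   death_rate, period=10):
    """Return the population at the end of each day.

    New infections accumulate daily; the death rate is applied to the
    accumulated batch only on every `period`-th day.
    """
    pending = 0.0                    # infections whose outcome is not yet decided
    remaining = float(population)
    history = []
    for day, new_cases in enumerate(daily_new_infections, start=1):
        pending += new_cases
        if day % period == 0:
            remaining -= pending * death_rate
            pending = 0.0
        history.append(remaining)
    return history


# Hypothetical input: a flat 100 new infections per day for 20 days (Model 1 rate).
history = population_with_batched_deaths(1_000_000, [100] * 20, 0.0375)
print(history[8], history[9], history[19])
```

The population stays flat for nine days and then drops in a step on days 10 and 20, which is what makes this approach more generous than subtracting deaths daily.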

Fatality Rate and Health assumption: Fatality is highly dependent on age and health condition. Based on available data, the rate among the healthiest people is 1.4%, and with comorbid conditions it is easily up to 13%. Based on worldwide data, the average rate is 4%. However, it also depends on gender: it averages 4.7% for males and 2.8% for females. Therefore, the overall gender-neutral number I’ll use is the average of the two: 3.75%. [3]
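The 3.75% figure is the plain average of the male and female rates, as the short check below shows. Note that an unweighted average implicitly assumes an even split of cases between men and women; that assumption is spelled out here and is not stated in the sources.

```python
male_rate = 4.7     # percent, from the post
female_rate = 2.8   # percent, from the post

# Unweighted mean: assumes cases are split evenly between the two groups.
gender_neutral = (male_rate + female_rate) / 2
print(f"{gender_neutral:.2f}%")  # 3.75%
```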

Probability of Infection: Whether an infected person recovers is independent of the chance of catching the virus. German Chancellor Angela Merkel fears a 60% to 70% infection rate for Germany. California Governor Gavin Christopher Newsom stated 56%. I think they may be slightly overstating, so I’ll use only a 50% infection rate. That means only half of the people in direct contact with an infected person will actually get it (again, regardless of their recovery/fatality probability). [4]
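Here is a rough sketch of what a 50% infection rate means across a three-degree chain. It takes one possible reading: only people who were actually infected pass the virus on to their own 22 contacts, and nobody's contacts overlap. The function name is illustrative, and the Excel model may combine these numbers differently.

```python
def expected_infections(per_day: int = 22, infection_rate: float = 0.5,
                        degrees: int = 3) -> list:
    """Expected newly infected people at each degree of contact.

    Assumed reading: each infected person exposes `per_day` contacts,
    and a fraction `infection_rate` of those contacts becomes infected.
    """
    sources = 1.0          # start with a single infected person
    per_degree = []
    for _ in range(degrees):
        sources = sources * per_day * infection_rate
        per_degree.append(sources)
    return per_degree


chain = expected_infections()
print(chain, sum(chain))  # [11.0, 121.0, 1331.0] 1463.0
```

A different reading, in which the 50% is applied to every one of the 11,154 people in the three-degree chain, would give 5,577 instead, so the interpretation of this single parameter changes the result by a wide margin.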

Other facts considered but not hard-coded in the models are:

A conservative USA TODAY analysis based on data from the American Hospital Association, U.S. Census, CDC and WHO estimates that 23.8 million Americans could contract COVID-19, leaving almost six seriously ill patients for every existing hospital bed. Another analysis finds that America’s trajectory of community spread is trending toward Italy’s, where circumstances are dire.

One researcher at the Global Center for Health Security estimated last month that as many as 96 million Americans could be infected. The Johns Hopkins Center for Health Security estimated that 38 million Americans will need medical care for COVID-19. The CDC’s worst-case-scenario is that about 160 million to 210 million Americans will be infected by December. Under this forecast, 21 million people would need hospitalization and 200,000 to 1.7 million could die by the end of the year.

Outside the U.S., leaked British documents projected that a coronavirus outbreak could rage until spring 2021.

If I considered the 4th degree, the numbers would get extremely morbid within 24 hours, so much so that I opted out of publishing them (my model scales to that; the results are just not published here).


[1]: Data from

[2]: Data from CDC and hospitals reporting to CDC.

[3]: Univ. of Edinburgh, WHO, CDC.

[4]: USA Today, Healthline

With the models explained, I can share their outcomes next. Please see the next post for the results: What-if Models (COVID-19): Results

Disclaimer: This is for hypothetical, informational purposes. However, you are encouraged to share this with whomever may benefit from this or with any data scientist/professional who may want to refute/confirm or contribute to the models.
