

Three Ways Data and AI Are Helping Against COVID19


By Niki Athanasiadou | April 01, 2020


We are in the midst of a global crisis that epidemiologists have warned us about. As of today, 180 countries and sovereign regions have confirmed cases of patients infected with COVID19 (from here). Even putting aside evidence indicating that the virulence of the disease could be much worse, the fast spread of the virus and the presence of highly vulnerable populations remain valid reasons for serious concern. And while the search for a medical treatment and a vaccine continues at a fast pace, data and forecasting tools are proving to be important allies in managing the crisis. Below I offer examples of three ways AI and data modeling are being used meaningfully as a ‘force for good’ in the global fight against COVID19.

Modeling the growth of the COVID19 epidemic and #flattenthecurve

The first step in planning a reaction to a crisis is understanding it. Epidemiological models, the first line of defense, have helped from the start in understanding the characteristics of the COVID19 infection cycle and the potential magnitude of the problem. These models do not require thousands of observations and are founded on centuries-old established scientific evidence [ ref ].

Exponential [ ref ] or power-law [ ref ] models offer straightforward high-level estimates of the cumulative growth in the number of infected patients as a function of time. More detailed dynamic models account explicitly for the observed numbers of susceptible, infected and recovered (SIR) individuals, and are often extended with a group of exposed individuals (SEIR) [ ref ]. Fitting the parameters of these models offers insight into several characteristics of the infection, such as lag time and infection rate.
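As a rough illustration of the simplest case, the sketch below fits an exponential curve to cumulative case counts by ordinary least squares on the log scale, then derives the doubling time from the fitted growth rate. The function name `fit_exponential` and the case counts are made up for this example; they are not taken from any of the referenced analyses.

```python
import math

def fit_exponential(days, cumulative_cases):
    """Least-squares fit of log(cases) = log(c0) + r * day.

    Returns (c0, r): the fitted initial count and daily growth rate.
    """
    logs = [math.log(c) for c in cumulative_cases]
    n = len(days)
    mean_x = sum(days) / n
    mean_y = sum(logs) / n
    sxx = sum((x - mean_x) ** 2 for x in days)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(days, logs))
    r = sxy / sxx
    c0 = math.exp(mean_y - r * mean_x)
    return c0, r

# Synthetic counts (an assumption for illustration): 100 cases on day 0,
# growing by a factor of exp(0.2) per day.
days = list(range(10))
cases = [100 * math.exp(0.2 * d) for d in days]

c0, r = fit_exponential(days, cases)
doubling_time = math.log(2) / r
print(f"growth rate per day: {r:.3f}, doubling time: {doubling_time:.2f} days")
```

With real, noisy counts the fit would only be approximate, and a pure exponential stops being a reasonable description once the susceptible population starts to shrink, which is where the compartment models come in.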

An advantage of epidemiological models is that they are based on well-understood causal relationships, and ‘what-if’ scenarios can be played out based on factors that are known to be amenable to intervention.  On this basis, it can be shown that reducing the infection rate (by limiting individual exposure) will cause a reduction of the peak number of infected individuals and life-threatening cases [ ref , ref ]. Moreover, the extent of individual distancing needed to achieve any given reduction of the peak of infection can be easily calculated from the model [ ref ]. 
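A minimal sketch of such a ‘what-if’ calculation is shown here, using a plain SIR model integrated with small Euler steps. All numbers (population size, recovery rate `gamma`, transmission rates `beta`) are illustrative assumptions rather than estimates for COVID19, and the helper names are hypothetical.

```python
def simulate_sir(beta, gamma=0.1, population=1_000_000,
                 initial_infected=10, days=730, dt=0.1):
    """Integrate the SIR equations with Euler steps.

    beta  : transmission rate per day (lowered by distancing)
    gamma : recovery rate per day
    Returns the peak number of simultaneously infected individuals.
    The recovered group is implied (population - s - i), so it is not tracked.
    """
    s = float(population - initial_infected)
    i = float(initial_infected)
    peak = i
    for _ in range(int(days / dt)):
        new_infections = beta * s * i / population * dt
        new_recoveries = gamma * i * dt
        s -= new_infections
        i += new_infections - new_recoveries
        peak = max(peak, i)
    return peak

def beta_for_target_peak(target_peak, beta_high, gamma=0.1, tol=1e-4):
    """Bisection search for the largest beta whose peak stays within target."""
    beta_low = gamma  # beta == gamma means no outbreak at all
    while beta_high - beta_low > tol:
        mid = (beta_low + beta_high) / 2
        if simulate_sir(mid, gamma) > target_peak:
            beta_high = mid
        else:
            beta_low = mid
    return beta_low

baseline_beta = 0.3                      # assumed, no distancing
baseline_peak = simulate_sir(baseline_beta)
reduced_peak = simulate_sir(0.15)        # assumed, contacts halved

target = baseline_peak / 2               # what-if: halve the peak
needed_beta = beta_for_target_peak(target, baseline_beta)
reduction = 1 - needed_beta / baseline_beta

print(f"peak without distancing: {baseline_peak:,.0f}")
print(f"peak with contacts halved: {reduced_peak:,.0f}")
print(f"cut in transmission needed to halve the peak: {reduction:.0%}")
```

The bisection works because, in this model, the peak grows steadily with the transmission rate, so the question "how much distancing for a given peak?" has a single answer for a fixed set of assumptions.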

The global campaign for social distancing and #flattenthecurve is now followed by most people in affected countries across the world. One of the latest analyses [ ref ], endorsed by the National Institutes of Health (USA) [ ref ], uses a SEIR model to offer evidence that the strategy of social distancing has indeed been effective in China. The World Health Organization, the Centers for Disease Control and Prevention (USA) and many governments across the world are now recommending (and sometimes enforcing) social distancing as our first line of defense against the fast growth of the COVID19 epidemic.

Forecasting and preventing healthcare shortages

The second step in planning a reaction to a crisis is estimating its potential impact. One of the early realizations with COVID19 has been that the epidemic grows faster than most local healthcare facilities can absorb [ ref ]. Even with the most optimistic projections, the relatively low availability of hospital beds and ICU ventilators for this crisis has been alarming. The purpose of #flattenthecurve is to stretch the duration of the COVID19 epidemic so that the impact on the healthcare system can be absorbed, saving the lives of critically ill patients in the process.

While we are all doing our best to help the healthcare systems weather the storm by staying at home, healthcare providers are closely monitoring the outbreak, aiming to accurately anticipate needs in the near future and prepare accordingly. H2O AI is working closely on such an application with Kaiser Permanente [ ref ], and we have released a time series AI solution as a public example [ ref ]. Other organizations and various university medical facilities are taking similar approaches.
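To give a flavor of what short-term forecasting of healthcare needs can look like, here is a deliberately simple sketch based on Holt's linear-trend exponential smoothing. It is not the solution referenced above; the occupancy figures, the capacity and the smoothing parameters are all made-up assumptions, and `holt_forecast` is a hypothetical helper.

```python
def holt_forecast(series, horizon, alpha=0.5, beta=0.3):
    """Holt's linear-trend exponential smoothing.

    alpha smooths the level, beta smooths the trend.
    Returns a list of `horizon` forecasts beyond the end of `series`.
    """
    level = float(series[0])
    trend = float(series[1] - series[0])
    for value in series[1:]:
        previous_level = level
        level = alpha * value + (1 - alpha) * (level + trend)
        trend = beta * (level - previous_level) + (1 - beta) * trend
    return [level + step * trend for step in range(1, horizon + 1)]

# Made-up daily counts of occupied ICU beds in a single hospital.
occupied_beds = [12, 15, 19, 22, 27, 31, 36, 40, 46, 51]
icu_capacity = 80  # hypothetical capacity

forecast = holt_forecast(occupied_beds, horizon=14)

# First forecast day on which demand is expected to exceed capacity, if any.
days_until_full = next(
    (day for day, beds in enumerate(forecast, start=1) if beds > icu_capacity),
    None,
)
print("forecast for the next 14 days:", [round(b) for b in forecast])
print("days until capacity is exceeded:", days_until_full)
```

A production system would typically add uncertainty intervals, external signals such as local case counts, and validation against held-out data, none of which this sketch attempts.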

A complementary approach to achieving healthcare readiness was taken by the UTHealth School of Public Health [ ref ]. They used previously obtained health information to identify those who are at high risk for severe complications from the infection. Having this information allows better estimates of hospital needs, and can also be a helpful tool in targeting specific individuals for preventative monitoring.

Bringing data together

Most of the epidemiological data for the approaches mentioned above have been readily available through official sources and public platforms since early in the pandemic’s timeline. This is what has allowed experts and data scientists to offer a variety of approaches and models to help understand the pandemic [ ref ]. 

Another essential contribution of data against COVID19 has been the exceptionally quick progress toward preclinical trials for a potential vaccine. Public databases of known viral genomes have been accumulated over decades through international efforts and previous epidemics. As soon as the first COVID19 genome was published by Chinese researchers on January 10th [ ref ], researchers were able to quickly identify the virus using established algorithms for genome similarity search (e.g., the Burrows-Wheeler alignment algorithm, ref), the first essential step toward vaccine development. Moreover, using genomic information and existing work on the virus’s closest relatives, a type of vaccine with much shortened production periods is currently in clinical trials in the US, UK and China [ ref ]. Of course, there is no guarantee that the three vaccines currently in clinical trials will be effective or safe (the two requirements that clinical trials address, alongside dosage), but the vaccine development process has already begun, a mere 60 days after the viral genome was announced.
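The idea behind Burrows-Wheeler based search can be sketched in a few lines. The toy code below builds a Burrows-Wheeler transform of a short sequence and uses backward search to locate exact occurrences of a pattern. Real aligners use far more compact indexes and tolerate mismatches and gaps; the sequence and function names here are invented for illustration.

```python
def build_index(text):
    """Build a toy Burrows-Wheeler index for `text`.

    Returns (suffix_array, first_row_start, occ), where
    first_row_start[c] counts characters in the text smaller than c and
    occ[c][k] counts occurrences of c in the first k characters of the BWT.
    """
    text += "$"  # sentinel that sorts before A, C, G and T
    suffix_array = sorted(range(len(text)), key=lambda k: text[k:])
    # Character preceding each sorted suffix (k == 0 wraps to the sentinel).
    bwt = "".join(text[k - 1] for k in suffix_array)

    alphabet = sorted(set(text))
    first_row_start, total = {}, 0
    for c in alphabet:
        first_row_start[c] = total
        total += text.count(c)

    occ = {c: [0] for c in alphabet}
    for ch in bwt:
        for c in alphabet:
            occ[c].append(occ[c][-1] + (ch == c))
    return suffix_array, first_row_start, occ

def find(pattern, index):
    """Return sorted start positions of exact matches of `pattern`."""
    suffix_array, first_row_start, occ = index
    low, high = 0, len(suffix_array)  # half-open range of matching suffixes
    for c in reversed(pattern):       # backward search, last character first
        if c not in first_row_start:
            return []
        low = first_row_start[c] + occ[c][low]
        high = first_row_start[c] + occ[c][high]
        if low >= high:
            return []
    return sorted(suffix_array[low:high])

genome = "ACGTACGTTACG"  # made-up reference sequence
index = build_index(genome)
print(find("ACG", index))
print(find("TTA", index))
print(find("GGG", index))
```

The index is built once per reference, after which each query takes a number of steps proportional to the length of the pattern rather than the length of the genome, which is part of what makes searching large genome collections practical.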


The COVID19 pandemic has placed unprecedented stress on our societies, an epidemiologist’s bad-case scenario that, unfortunately, we have to get through. Data and algorithmic approaches have for the first time come together in a meaningful way, and at massive scale, to aid informed decision making about our response to this threat. Hopefully, on the morning after COVID19, we will still remember the lessons of how AI has aided healthcare at a time of crisis and trust it more when it comes to saving lives [ ref ].

Disclaimer: The provided examples and references are intended as further reading on the applications discussed. They are not meant as an exhaustive list of all relevant material.

Niki Athanasiadou, MRes, PhD 



Niki is a Customer Data Scientist at H2O AI with a passion for data-driven knowledge. Coming from a PhD on the microscopic universe of biomolecules, Niki is bringing scientific thinking to real-world big data. Niki has experience in healthcare among other sectors and loves to work in interdisciplinary teams. Her proudest moments are winning the Young Biochemist of the Year award from the British Biochemical Society and the Open Data data-science project award from the Office of the Mayor of New York City. Niki is a fan of Sir Arthur Conan Doyle and Agatha Christie and will never turn down an offer to explore a new art exhibit.