Jupyter Notebook Viewer


There are many great visualizations that show the current number of Covid-19 cases in each US county (e.g. Johns Hopkins, the New York Times), but none that I could find show the trend over time in each region.

The plots below show the number of cases over time, broken down at the US county level and at the state level. If the plots aren't showing, make sure you're viewing this notebook on the nbviewer website and not directly on GitHub.

1. Viewing the Plots

In [1]:

from states import plot_data_by_state
from counties import plot_data_by_county

2. Better Understanding the Data

  • Data was taken from the New York Times dataset. More details on how the data was collected can be found in the README for that repo.

  • Localities with fewer than 3 days of 10+ cases were excluded.

  • The data is plotted with a fitted exponential curve, which is how pandemics are expected to evolve in their initial stages. The data in South Dakota as of Apr. 8 closely follows this pattern, with case numbers doubling at a consistent interval.

[Figure: South Dakota case counts with the fitted exponential curve]

  • The data in New York City has a different shape. The fitted exponential predicts a much higher value for the most recent day than was actually recorded. This indicates that growth is slowing: the time it takes for cases to double is getting longer.

[Figure: New York City case counts with the fitted exponential curve]

  • To view the data without the fitted curve, click “model” in the legend on the right.

[Figure: plot legend with the “model” entry]
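The exclusion rule and the exponential fit described above can be sketched in plain Python. This is only an illustration, not the code used in the repo: the function names, the reading of the rule as "at least 3 days with 10 or more cases", and the choice of a least-squares line through the logarithm of the case counts are all assumptions, and the sample series is made up.

```python
import math


def has_enough_data(cases, min_cases=10, min_days=3):
    """Return True if at least `min_days` entries reach `min_cases`.

    Assumed reading of the exclusion rule; the repo may implement it differently.
    """
    return sum(1 for c in cases if c >= min_cases) >= min_days


def fit_exponential(cases):
    """Fit cases[t] ~ a * exp(b * t) by least squares on log(cases).

    `cases` holds one positive count per day (t = 0, 1, 2, ...).
    Returns the pair (a, b).
    """
    n = len(cases)
    xs = range(n)
    ys = [math.log(c) for c in cases]
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    b = sxy / sxx                        # growth rate per day
    a = math.exp(y_mean - b * x_mean)    # fitted value at t = 0
    return a, b


def doubling_time(b):
    """Days needed for the fitted curve to double, given growth rate b."""
    return math.log(2) / b


# Made-up series that doubles every two days (not real data).
cases = [10 * 2 ** (t / 2) for t in range(8)]

if has_enough_data(cases):
    a, b = fit_exponential(cases)
    print(f"a = {a:.1f}, growth rate = {b:.3f}/day, "
          f"doubling time = {doubling_time(b):.1f} days")
```

When the most recent counts fall below a * exp(b * t), as in the New York City plot, the single exponential no longer describes the data well and the doubling time is lengthening.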

3. Customizing Your Own Plots

The source code for producing these plots can be found on GitHub here.

This project was written in Python 3 and depends on the plotly and pandas packages. You can install them using the following:

pip install pandas
pip install plotly

Once you’ve done that, you should be able to clone the repo, make desired changes, and run the notebook on your local machine.

4. Ideas for Improvement

  • Currently the plot is a static HTML page containing whatever data was available when the repo was last pushed. A smarter hosting solution could update the data and plots dynamically when new data becomes available.
  • Add options to toggle the date range and log scale for the charts.
  • Add other data besides the number of cases (e.g. mortality rate, ratio of tests coming back positive).