Spreadsheet of San Francisco Bay Area Covid-19 Data and Charts

docs.google.com

28 points by andfrob 6 years ago · 48 comments

pkaye 6 years ago

California unfortunately has a huge backlog of pending test results. The cause seems to be that the private labs (Quest in particular) kept accepting test samples and built up a huge backlog of the earlier, manually processed ones. Other labs would push back if their queue got too long. The newer samples are run on the Roche high-speed machines.

Jommi 6 years ago

Where is the most important metric? Daily tested vs tested positive stats.

  • colechristensen 6 years ago

    I don't believe this is a reliable metric.

    Who gets tested is a moving target. Stanford a short time ago did a free-for-all testing binge in order to collect data, but finished that and is now restricting tests to people with specific risk factors.

    The first time I tried to get a test from another provider I just wasn't able to; they didn't know of anywhere that would test me without hospitalization-type symptoms.

    So testing is uneven and not very available. Any stats need to include some measure of the criteria for getting a test in the first place.

    In other words, there is likely an enormous population with no symptoms or mild symptoms who couldn't get tested if they tried.

    After two video appointments with separate providers I was able to get tested yesterday and the result came back negative about 22 hours later. It took me about 8 hours of effort and time to get that done, a luxury many people do not have.

    • majormajor 6 years ago

      > After two video appointments with separate providers I was able to get tested yesterday and the result came back negative about 22 hours later. It took me about 8 hours of effort and time to get that done, a luxury many people do not have.

      Is there any value in people self-selecting into personal choice testing? You could get infected tomorrow, for instance...

      If we wanted a full picture of community spread we'd need a top-down random sample, not self-selection, no?

      • colechristensen 6 years ago

        Aren't there ways of turning self-selection populations into random sample populations for statistical purposes? (It has just been a while since I have had to think of these things).

        But really we want more than just accurate statistics, we want to minimize damage. Any increase in testing is good testing, and triaging testing to highest risk individuals makes sense when your capacity is limited.

        The consequence, though, is that reported statistics are often just wrong: they are skewed towards worse outcomes, and comparisons between dates are flawed without much additional information.

    • Jommi 6 years ago

      It's one of the most reliable metrics we have, and a lot better than just tests or just confirmed cases.

      After you have this info, you can compare the rates to what kind of testing policies the areas have, and make some initial conclusions.
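
      As a rough illustration of the metric being discussed, here is a minimal sketch of a daily positivity rate (positive results divided by tests run). The function name and the sample rows are hypothetical and only there to show the arithmetic; real county data would also need the testing criteria recorded alongside it.

```python
def positivity_rate(positives, tests):
    """Return positives / tests, or None when no tests were reported."""
    if tests <= 0:
        return None
    return positives / tests


# Hypothetical (day, tests run, positive results) rows, for illustration only.
daily_counts = [
    ("day 1", 100, 8),
    ("day 2", 250, 15),
    ("day 3", 0, 0),  # no results reported that day
]

for day, tests, positives in daily_counts:
    rate = positivity_rate(positives, tests)
    label = "n/a" if rate is None else f"{rate:.1%}"
    print(day, label)
```

      Days with no reported tests are kept as "n/a" rather than 0%, since a missing denominator says nothing about spread.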

  • corysama 6 years ago

    The one I pay attention to is the daily growth rate of confirmed cases. It can't cover people who aren't tested. But it approximates the velocity of the problem's magnitude. And over time it shows the acceleration, which reflects how we are improving the situation, or not...

    https://paroj.github.io/arewedeadyet/#rate

    The good news is that the US has gone from a 30+% daily growth rate 10 days ago down to a 15% growth rate and falling. We need to keep falling into the negative rates to solve this problem.
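
    For anyone who wants to reproduce this kind of number, here is a small sketch of how a daily growth rate and its day-to-day change could be computed from a cumulative case series. The function names and the case counts are made up for illustration; this is not how the linked site computes its chart.

```python
def daily_growth_rates(cumulative):
    """Day-over-day fractional growth of a cumulative case series."""
    rates = []
    for prev, curr in zip(cumulative, cumulative[1:]):
        # Growth is undefined when the previous day's total is zero.
        rates.append((curr - prev) / prev if prev > 0 else None)
    return rates


def growth_rate_changes(rates):
    """Change in growth rate from one day to the next (the 'acceleration')."""
    changes = []
    for prev, curr in zip(rates, rates[1:]):
        changes.append(None if prev is None or curr is None else curr - prev)
    return changes


# Hypothetical cumulative confirmed cases on four consecutive days.
cumulative_cases = [1000, 1300, 1600, 1840]

rates = daily_growth_rates(cumulative_cases)
changes = growth_rate_changes(rates)

for rate in rates:
    print(f"{rate:.1%}")
```

    One caveat with this sketch: a cumulative series never shrinks, so its growth rate only approaches zero. A negative rate would show up only if the same calculation were run on daily new cases or active cases.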

    • xivzgrev 6 years ago

      me too. it has some noise from variation in testing rate, but it's directionally accurate.

      for example the bay area has recently been seeing some days with single digit growth rates. shelter in place IS WORKING, but it's going to take time / we may need some additional measures. I was just reading it may also be spread in the air from breathing.

  • samcheng 6 years ago

    Honestly, the most important metric is deaths, and from what I can see, the SF Bay Area has done relatively well in that metric. No overcrowded hospitals, for example.

    • calebsurfs 6 years ago

      To me the hospitalization rate is most important.

      * Overcrowded hospitals are what lead to large jumps in fatality rates.

      * It only lags the date of infection by about a week.

      * It also isn't subject to external factors like availability of tests. (Though availability of hospital beds is a factor later on)

      • Jommi 6 years ago

        Yes, this is very important. But one should also not forget the average hospitalization time (which will go down once we have clear procedures for treating COVID at different stages).

      • andfrobOP 6 years ago

        Agreed! Still not available much on a county level in the San Francisco Bay Area :(

    • Jommi 6 years ago

      In the end, yes deaths are most important, but in order to make temporal decisions that affect that number, we need infection-related data first.

    • xivzgrev 6 years ago

      yes, but it lags 2-3 weeks behind confirmed cases

  • andfrobOP 6 years ago

    This is not released by most counties, unfortunately.

    There are some limited stats to the very far right side of the "SF Bay Area Actuals" sheet.

    Anecdotally, Bay Area is seeing <10% positivity rates

    • andfrobOP 6 years ago

      Another important data point to assess testing is Case Fatality Rate (CFR). This is about 2.5% in the SF Bay Area.

      In other places with higher testing, such as Australia, the CFR is 0.6% or less. This implies that the true number of cases is 4-5 times higher... probably a lot more.
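
      The "4-5 times" figure can be checked with a quick calculation. This sketch assumes, purely for illustration, that deaths are counted fully in both places and that the underlying fatality rate is the same, so any gap in CFR comes from missed cases. The function name is hypothetical.

```python
def implied_undercount(observed_cfr, reference_cfr):
    """Ratio of true to confirmed cases implied by two case fatality rates.

    Assumes deaths are fully counted and the true fatality rate matches the
    reference region, which is a simplification.
    """
    if reference_cfr <= 0:
        raise ValueError("reference_cfr must be positive")
    return observed_cfr / reference_cfr


bay_area_cfr = 0.025   # about 2.5%, as quoted above
australia_cfr = 0.006  # 0.6%, as quoted above

factor = implied_undercount(bay_area_cfr, australia_cfr)
print(f"implied undercount: about {factor:.1f}x")
```

      With the quoted rates the ratio comes out a little over 4, and it grows if the reference region is itself missing cases.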

      • colechristensen 6 years ago

        It seems like this disease is so successful because of a significant symptom-free-but-contagious period followed by a small percentage of very serious symptoms.

        That's what a pandemic needs. If it is very deadly very quickly it kills its transmission vectors before they can transmit. If it is entirely symptom free, it is very evolutionarily successful, but no one cares because there aren't any negative effects.

        There is an "optimum" of disease characteristics for maximum damage and we seem to be experiencing one.

        The bottom line is that it seems to be very difficult to prevent a majority of the world population from getting this disease and the result is going to be a global fatality rate of somewhere in the neighborhood of 1%.

      • mcguire 6 years ago

        The Diamond Princess numbers are 11 deaths out of 712 cases, with 82 still outstanding (15 serious or critical).

        The CFR should end up being about 1.5% (or possibly somewhat higher).
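
        To see where "about 1.5% (or possibly somewhat higher)" comes from, here is a rough bounding calculation on the figures above. The two scenarios (no further deaths, or every serious or critical patient dying) are assumptions chosen to bracket the result, not predictions.

```python
def cfr_bounds(deaths, cases, unresolved_serious):
    """Lower and upper CFR bounds for a cohort with unresolved serious cases."""
    lower = deaths / cases                          # no further deaths
    upper = (deaths + unresolved_serious) / cases   # all serious cases die
    return lower, upper


# Diamond Princess figures as quoted in the comment above.
lower, upper = cfr_bounds(deaths=11, cases=712, unresolved_serious=15)
print(f"CFR between {lower:.2%} and {upper:.2%}")
```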

        • lokl 6 years ago

          Cruise ship passenger demographics might not be representative of the general population.

      • wool_gather 6 years ago

        It's currently a little under 5% worldwide. There are on the order of 1 million cumulative cases and a bit under 50 thousand deaths.

        https://www.who.int/emergencies/diseases/novel-coronavirus-2...

        Doesn't account for lack of testing, of course.

        • svachalek 6 years ago

          That also doesn't account for the exponential growth in number of cases; the people dying now are out of a much smaller cohort of confirmed cases in the past.

          Deaths / (Deaths + Recoveries) would be more like it, and that's a scary number.
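
          A minimal sketch of the two estimators side by side. The confirmed and death counts are the round worldwide figures mentioned above; the recoveries figure is a hypothetical placeholder, since no closed-case number is given in this thread.

```python
def naive_cfr(deaths, confirmed):
    """Deaths divided by all confirmed cases, including unresolved ones."""
    return deaths / confirmed


def closed_case_cfr(deaths, recoveries):
    """Deaths divided by cases with a known outcome (deaths + recoveries)."""
    return deaths / (deaths + recoveries)


confirmed = 1_000_000  # order of magnitude quoted above
deaths = 50_000        # order of magnitude quoted above
recoveries = 200_000   # hypothetical placeholder, not a reported figure

naive = naive_cfr(deaths, confirmed)
closed = closed_case_cfr(deaths, recoveries)
print(f"naive: {naive:.1%}, closed-case: {closed:.1%}")
```

          The naive figure is pulled down by recent cases that have not resolved yet, while the closed-case figure can overshoot when recoveries are reported more slowly than deaths, so the two are probably best read as a range.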

          • wool_gather 6 years ago

            That's a very good point; unfortunately the WHO doesn't have a "closed case" statistic that I can see.

danans 6 years ago

The conspicuous lack of realistic infection data from India, coupled with the extreme challenges to containment and control there (just due to the sheer crowding), is frightening, regardless of whether the poor data is intentional or just because India is a hard place to coordinate.

That the published infection and mortality rates are so low strains credulity in the extreme, especially when much smaller-population countries at similar proximity to the equator but greater distance from China have higher case rates (e.g. Brazil, Ecuador, the UAE).

andfrobOP 6 years ago

I developed this for myself but data junkies trying to get a feel for what is happening with the coronavirus spread across the San Francisco Bay Area will appreciate it.

I am updating it regularly.

  • pkaye 6 years ago

    Where are you getting the raw data? I'm extracting it from the New York Times dataset for my own graphing. They have the data for all counties in the US. I've been meaning to automate the graphing, but for now I'm doing it manually.

    I wish you had the new cases per day graphed for all the bay area counties because that is what I monitor.
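
    A sketch of how new cases per day could be derived from cumulative county counts. It assumes CSV text with the column layout of the NYT county file (date, county, state, fips, cases, deaths); the sample rows are made-up numbers and the function name is hypothetical.

```python
import csv
import io
from collections import defaultdict

# Made-up sample rows in the assumed column layout, for illustration only.
SAMPLE_CSV = """date,county,state,fips,cases,deaths
2020-03-30,Alameda,California,06001,100,0
2020-03-31,Alameda,California,06001,120,0
2020-04-01,Alameda,California,06001,150,0
2020-03-30,Marin,California,06041,10,0
2020-03-31,Marin,California,06041,14,0
"""


def new_cases_per_day(csv_text):
    """Map each county to a list of (date, new cases since the previous row)."""
    by_county = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        by_county[row["county"]].append((row["date"], int(row["cases"])))

    daily = {}
    for county, series in by_county.items():
        series.sort()  # ISO dates sort chronologically as plain strings
        daily[county] = [
            (date, cases - prev_cases)
            for (_, prev_cases), (date, cases) in zip(series, series[1:])
        ]
    return daily


daily = new_cases_per_day(SAMPLE_CSV)
for county, series in daily.items():
    print(county, series)
```

    The first date for each county is dropped because there is no earlier total to subtract, and a missing date in the source would show up as one larger jump rather than being spread over the gap.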

    • andfrobOP 6 years ago

      Raw data was originally from SF Chronicle, but they removed their timelapse view so I am now getting it direct from county websites. Stanford Open Data project also has a reasonable historical dataset that comes from the county websites.

      I'll add a new cases graph for each county.

denster 6 years ago

@andfrob, just saw your comment about SF Chronicle removing their timelapse view.

We made one here from the NYT dataset on MintData [1]:

https://nyt-map.covid42.com/

(note: I think we need to update the cumulative counter, we'll be fixing that shortly)

@andfrob happy to get you free/unlimited access to MintData if you're interested in making similar visualizations, please DM me if this would be helpful.

[1] https://mintdata.com

testfoobar 6 years ago

Is anyplace in the Bay Area sharing stats by zipcode?

For example, San Diego has zipcode breakdown here: https://www.sandiegocounty.gov/content/sdc/hhsa/programs/phs...

norifukuoka 6 years ago

Very cool. By the way did you intend for the Y-axis on the "Days since 100 cases" chart to be "Days since 100 cases"? It seems like the Y-axis is "cases" and the X-axis is "Days since 100 cases".

the_crocodile 6 years ago

Very helpful. Thank you for sharing!

Have you been able to find data on # of tests carried out?

  • andfrobOP 6 years ago

    Very, very limited data on the Bay Area. Under the "SF Bay Area Actuals" sheet, if you scroll all the way to the right, you will see what I have been able to find.

    California does report them on aggregate, but the purpose of this sheet was to focus on the Bay Area.

Cactus2018 6 years ago

Is anyplace in the Bay Area sharing stats by age brackets?

starpilot 6 years ago

Design an evacuation plan for San Francisco. You have 15 minutes.
