Stanford study showing widespread infection in Santa Clara County has many problems


You probably saw the headline related to a recent preprint. Many people have concluded from it that COVID-19 is far less deadly than believed. However, the study is flawed in many ways.

The basic flow of the study was that the researchers posted Facebook ads to recruit participants.

Respondents were brought into the lab and tested, with 1.5% testing positive. The researchers then applied some statistical adjustments, which Andrew Gelman discusses in more detail in his analysis of the study.

Here are some of the problems:

Selection bias: recruiting participants through Facebook ads creates many problems, since people who volunteer for an antibody test are not a random sample of the county.

Results are highly sensitive to estimates of the test’s accuracy: This is related to Bayes’ theorem. In short, if you are testing for something that is even a little rare, you need far higher test specificity than intuition would suggest. Even when a highly accurate test says someone has a rare condition, the probability that they actually have it can still be very low.

Borrowing an example from Wikipedia: suppose you’re trying to detect drug use with a test that’s 99% sensitive and 99% specific. If only 0.5% of people are drug users, the probability that someone is a drug user given that they fail the drug test is only 33.2%.
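To make the arithmetic concrete, here is a minimal Python sketch of Bayes’ theorem applied to the drug test example. The function name and variable names are mine, chosen for illustration; the inputs are the numbers from the example above.

```python
def positive_predictive_value(prevalence, sensitivity, specificity):
    """P(has condition | positive test), via Bayes' theorem."""
    # Share of the whole population that is a true positive.
    true_positives = sensitivity * prevalence
    # Share of the whole population that is a false positive.
    false_positives = (1.0 - specificity) * (1.0 - prevalence)
    return true_positives / (true_positives + false_positives)


# Numbers from the drug test example: 0.5% users, 99% sensitive, 99% specific.
ppv = positive_predictive_value(prevalence=0.005, sensitivity=0.99, specificity=0.99)
print(f"{ppv:.1%}")  # 33.2%
```

The false positives (1% of the 99.5% who are not users) outnumber the true positives (99% of the 0.5% who are), which is why a positive result is wrong about two times out of three.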

The authors themselves acknowledge that their results would evaporate pretty readily if their estimate of the test’s accuracy were off by even a bit.
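A rough way to see this sensitivity is to back out the true prevalence from the raw positive rate using the standard Rogan-Gladen correction. This is only a sketch: it is not necessarily the adjustment the authors used, the 80% sensitivity and the three specificity values are hypothetical numbers picked for illustration, and only the 1.5% raw positive rate comes from the study as described above.

```python
def corrected_prevalence(observed_rate, sensitivity, specificity):
    """Rogan-Gladen estimate of true prevalence from a raw positive rate."""
    denominator = sensitivity + specificity - 1.0
    if denominator <= 0:
        raise ValueError("test must be better than chance")
    estimate = (observed_rate + specificity - 1.0) / denominator
    # Clamp to [0, 1]; a negative estimate means false positives alone
    # could account for every observed positive.
    return min(1.0, max(0.0, estimate))


observed = 0.015      # raw positive rate reported in the study
sensitivity = 0.80    # hypothetical value, for illustration only

# Hypothetical specificities, each only half a percentage point apart.
for specificity in (0.995, 0.990, 0.985):
    estimate = corrected_prevalence(observed, sensitivity, specificity)
    print(f"specificity {specificity:.1%}: estimated prevalence {estimate:.2%}")
```

Under these assumptions, dropping the specificity by one percentage point takes the estimated prevalence from a bit over 1% to essentially zero, because at that point the expected false positive rate equals the entire observed positive rate.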

Statistical corrections seem flawed in obvious ways: Since they knew their sample of participants from Facebook wasn’t representative, they attempted to correct for this using a statistical technique. The problem is that their final corrected dataset isn’t corrected for age and is heavily skewed as a result.

A few other problems have been noted elsewhere as well.

It’s worth noting that two of the study’s authors recently penned a WSJ op-ed offering their guess that

current estimates about the Covid-19 fatality rate may be too high by orders of magnitude.

The study did produce a result in line with that guess, but the authors had to make a lot of basic mistakes to get there.

As Gelman concludes:

I think the authors of the above-linked paper owe us all an apology. We wasted time and effort discussing this paper whose main selling point was some numbers that were essentially the product of a statistical error.


