Geeking Out Part 2: Nate Silver Followup - More on Vax Rates, Covid Death Rates, and Correlations
The data were RIGHT THERE!
This is the part 2 I promised in Geeking Out: Replicating Nate Silver's COVID & Partisanship "Work". There may be further sequels, as it were, but there is no schedule for that right now.
I am not quite sure what Nate Silver was aiming at, frankly, mainly because I was distracted so much by so many bad choices made in the simple model he had.
I don’t mind simple models at all; some of my biggest successes came from taking somebody’s overly complicated model and stripping away the frivolities. One of the models I’m proudest of came from looking at somebody’s bizarre speech evaluation approach, throwing it away, and replacing it with a linear model. That worked.
This is all to say, there was some data right in front of Nate Silver in one of his sources, and there is another that is not all that difficult to use… if you know how to use it.
Goal
Before I begin, let me be clear as to what I’m trying to do: look at what connections, if any, there are between vaccination rates and COVID death rates.
I am keeping this simple still by doing this at a state level.
I don’t care about any partisan link - not in this post. Maybe in a future one. The only point here is to get at links between vaccination rates and COVID survivorship.
Data selection
The data sets will differ from Nate Silver’s, obviously.
First, I did grab vaccination rate by state info from that NY Times link he had, which was last updated on October 20, 2022. But not the “total” vaccination rate. They had it split out by age groups:
age 65+
age 18 - 64
age 12 - 17
age 5 - 11
Then, instead of doing a total COVID death rate over all ages, and then trying to “adjust” for old folks via a percentage of the population over age 65, I just grabbed the COVID death rate for people over the age of 65. CDC WONDER is easy to use like that.
By the way, did you know I did some videos on how to use CDC WONDER? Maybe I should update that in my copious free time.
Data limitations
Now, unlike Silver, I didn’t do the “well, I started the clock at Feb 1, 2021… yadda yadda”, because it’s a pain in the ass to do this in WONDER, and I’m lazy. So I simply excluded year 2020 experience.
As a reminder, this is to exclude deaths before vaccines were available. To be sure, some people did not have access to COVID vaccines til later in 2021. Let us not quibble.
But, of course, the vaccination rates were as of a particular date: October 20, 2022. And on the NY Times article itself, they recognized multiple shortcomings of the original data. Such as people getting vaccines in states different from where they lived. I know I traveled 40 miles to get the vaccine myself in April 2021.
I live right next to the NY-CT border.
The vaccination rates were reported to whole percentages, so there’s a precision issue. Given the quality of the original data, I think trying to get a significant figure past the decimal point would have been pushing it, but the coarse rounding can cause problems, as you will see.
With respect to the death data, any time you have fewer than 10 deaths, the CDC WONDER database will not give you the count. It’s censored for privacy reasons. So… basically I cannot do anything with the under-18 crowd. COVID deaths were minimal for minors.
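If you want to see what that censoring does to a data set, here is a minimal sketch. It assumes a tab-delimited export in which censored cells hold the text "Suppressed"; the column names, the state labels, and every number are placeholders I made up for illustration, not real WONDER output.

```python
import csv
import io

# Hypothetical tab-delimited export. Column names and all numbers are
# placeholders for illustration, NOT real CDC WONDER results.
SAMPLE_EXPORT = (
    "State\tDeaths\tPopulation\n"
    "State A\t25\t500000\n"
    "State B\tSuppressed\t120000\n"
    "State C\t12\t800000\n"
)


def parse_deaths(raw):
    """Return an int death count, or None when the cell is censored."""
    text = raw.strip()
    if text.lower() == "suppressed":
        return None
    return int(text)


def load_rows(export_text):
    """Parse the export and add a death rate per 100,000 where a count exists."""
    rows = []
    reader = csv.DictReader(io.StringIO(export_text), delimiter="\t")
    for record in reader:
        deaths = parse_deaths(record["Deaths"])
        population = int(record["Population"])
        rate = None if deaths is None else deaths / population * 100_000
        rows.append(
            {"state": record["State"], "deaths": deaths, "rate_per_100k": rate}
        )
    return rows


rows = load_rows(SAMPLE_EXPORT)
usable = [r for r in rows if r["deaths"] is not None]
print(f"{len(usable)} of {len(rows)} states have a usable count")
```

Keeping the censored cells as None, rather than quietly turning them into zero, is a deliberate choice here: a censored count is somewhere between 1 and 9, not zero, and dropping or zero-filling those states would bias any fit.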
Correlations between the variables
Before I even try a regression model, let’s look at the correlations between the variables:
So nothing too shocking here.
But it is very interesting to see what’s strongly correlated, more strongly than others. I will graph a few of these against each other, and perhaps you will see how they’re related.
We will see the issue with the age 65+ again — you can’t go above 100% vaccination rate, and actually the real saturation point is below that because many people can’t get the shots due to health reasons or just mobility reasons. It seems to max out around 95%.
But look at those other lines — the vaccination rates for the minor age groups are almost 100% correlated with the vaccination rate of the 18-64-year-olds.
Might that be an indication not of any political affiliation, but of actual health access in different states?
Or, and just spitballing here… might it be an artifact of data shortcomings? Because I can believe that, too. A near-100% correlation of data is very suspicious to me.
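For anyone who wants to compute pairwise correlations like the ones discussed in this section, here is a rough sketch using plain Pearson correlation. I am assuming Pearson is the measure of interest; the column names and the five values per column are invented placeholders, not the NY Times or CDC figures.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length, at least 2 long")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant column")
    return cov / sqrt(var_x * var_y)


# Placeholder columns (whole-percent vaccination rates for five made-up
# states). These are illustrative values only, not real data.
columns = {
    "vax_65_plus": [95, 95, 93, 90, 95],
    "vax_18_64": [80, 72, 65, 60, 75],
    "vax_12_17": [70, 60, 50, 44, 64],
}

names = list(columns)
matrix = {a: {b: pearson(columns[a], columns[b]) for b in names} for a in names}

for a in names:
    print(a, [round(matrix[a][b], 2) for b in names])
```

Note the guard for a constant column: if enough states pile up at the same capped whole-percent value, the variance of that column shrinks, and the correlation gets less informative well before it becomes undefined.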
Age 65+ Regression
Let’s take a look at the Age 65+ vaccination rate & death rate:
Hmmm.
Hmmm.
Okay, let’s talk about this crap.
What we’ve got going on here is a saturation on vaccination rate at multiple states (and I haven’t even bothered to distinguish them all), with a huge range in mortality differences re: COVID for age 65+.
Another thing to notice is the horizontal scale in general. It is very restricted.
This is not a great model, but it fits better than so many asset models I’ve seen back in the day. Here are the regression stats if you want to see:
Age 18-64 Regression
This one looks better, but I’ll explain my issue in a moment.
Yes, New Mexico is an outlier. But on the whole, it’s a pretty good fit:
Unlike with the Age 65+ vax rate, there’s no saturation, there’s actually a pretty good range on vax rate as an input, and I want you to look at that vertical scale variation.
Look at the age 65+ vertical scale variation.
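The regressions in these two sections are one-variable linear fits of death rate against vaccination rate. Here is a minimal sketch of that kind of fit, assuming ordinary least squares; the function name and the sample numbers are hypothetical, and this is not necessarily the tool I used to produce the charts.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x.

    Returns (slope, intercept, r_squared).
    """
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length, at least 2 long")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or ss_tot == 0:
        raise ValueError("fit is undefined when x or y is constant")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual sum of squares around the fitted line.
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    r_squared = 1 - ss_res / ss_tot
    return slope, intercept, r_squared


# Made-up inputs: vaccination rate (%) and deaths per 100,000 for five
# imaginary states. Illustrative only, not real data.
vax_rate = [60, 65, 70, 75, 80]
death_rate = [50, 46, 41, 33, 30]

slope, intercept, r_squared = fit_line(vax_rate, death_rate)
print(f"slope={slope:.3f} intercept={intercept:.1f} R^2={r_squared:.3f}")
```

The same function makes the saturation problem easy to see: when the x values are squeezed into a narrow band, sxx is small, and the slope becomes very sensitive to a handful of states.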
Age fragility to COVID
I’ve mentioned this many times before.
It was a key portion of one of my top posts: COVID and Simpson's Paradox: Why So Many Vaccinated People are Among the Current Wave of Hospitalizations
People do not realize just how fragile old people are, compared to young people. I often have to split out mortality trends by sex, but sex differences have got nothing on age. The only other major split that I know of is smoker/non-smoker, and that gap, while much bigger than the sex gap or the racial gap (which is smaller than the sex gap), is nowhere near as big as the age gap.
That’s why we do age-adjusted death rates, partly. But it’s really why we prefer to look at mortality trends where we have at least 10-year age groupings. We never just look at raw death rates, because the age composition of the underlying population will really drive that.
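To make the age-composition point concrete, here is a small sketch of direct age standardization, which is one common way of age-adjusting. The two populations, the two age bands, and the standard weights are all invented for illustration.

```python
def crude_rate(deaths, population):
    """Total deaths per 100,000 across all age groups."""
    return sum(deaths.values()) / sum(population.values()) * 100_000


def age_adjusted_rate(deaths, population, standard_weights):
    """Direct standardization: weight each age-specific rate by a fixed
    standard population share, then express per 100,000."""
    if abs(sum(standard_weights.values()) - 1.0) > 1e-9:
        raise ValueError("standard weights must sum to 1")
    total = 0.0
    for group, weight in standard_weights.items():
        total += weight * deaths[group] / population[group]
    return total * 100_000


# Hypothetical populations with IDENTICAL age-specific death rates
# (10 and 1,000 per 100,000) but different age mixes. Not real data.
pop_a = {"under_65": 900_000, "65_plus": 100_000}
deaths_a = {"under_65": 90, "65_plus": 1_000}

pop_b = {"under_65": 700_000, "65_plus": 300_000}
deaths_b = {"under_65": 70, "65_plus": 3_000}

# Assumed standard population shares, chosen arbitrarily for the example.
standard = {"under_65": 0.8, "65_plus": 0.2}

print("crude A:", round(crude_rate(deaths_a, pop_a), 1))
print("crude B:", round(crude_rate(deaths_b, pop_b), 1))
print("adjusted A:", round(age_adjusted_rate(deaths_a, pop_a, standard), 1))
print("adjusted B:", round(age_adjusted_rate(deaths_b, pop_b, standard), 1))
```

In this toy setup the older population has a crude rate almost three times as high, even though a person of any given age faces exactly the same risk in both places. The adjusted rates come out equal, which is the whole point of adjusting.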
COVID behaves similarly to many other natural causes of death: mortality rates increase at an exponential rate with age, so I generally have to plot them on a logarithmic scale so that you can see the small death rates for the very young.
No Analysis for the Children
And that’s all the regressions you’re going to get from this data.
Because this is what COVID death data by state looks like for 12-17:
And for 5-11:
I can’t do anything with that.
So what? Now what?
The thing is, I restricted this to some very specific data.
Things I didn’t look at:
Total mortality (dead is dead)
Pre-pandemic mortality experience (what if the issue is general public health levels?)
Other specific co-morbidities (diabetes, obesity)
I happen to know that there are other things going on at the same time.
Someone remarked, when I linked to the prior post, that I couldn’t prove causality.
Well, duh. I’m not even trying to.
The most I’m trying to do, right now, is explore potential connections. I might be able to at least strike off some possibilities.
But if I use some simple metrics, I might get misled.
The fact that so many analyses about vaccine/mortality--particularly by state (thereby inferring partisan affiliation)--lump the age groups by such vast age ranges and assume they're adequately controlling for age drives me up a wall. I can't believe that serious publications actually think that a 64 year-old has a similar risk profile to an 18 year-old. Yet in so many analyses, they're treated identically.
By lumping this group together, we completely miss whether or not the "fully vaccinated rate" is being dragged down by the youngest members of that group, who are also at the least risk of severe outcomes. So, we're potentially attributing the deaths of fully-vaccinated 64 year-olds with comorbidities to the decisions of healthy 18 year-olds not to get the vaccine.
This emphasis on driving partisan narratives is causing people to make very dubious analytical decisions. Confirmation bias is a helluva drug.
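The lumping problem described above can be put into a few lines of arithmetic. This is only a sketch: the split of the 18-64 bucket into two sub-bands, and every population, vaccination rate, and death count, are hypothetical, since the source data only report the bucket as a whole.

```python
# Hypothetical sub-bands inside one state's 18-64 bucket. Every number is
# invented for illustration; the published data do not break this out.
sub_bands = {
    "18_29": {"population": 400_000, "vax_rate": 0.55, "deaths": 8},
    "50_64": {"population": 300_000, "vax_rate": 0.85, "deaths": 240},
}

total_population = sum(b["population"] for b in sub_bands.values())
total_deaths = sum(b["deaths"] for b in sub_bands.values())

# The blended rate is a population-weighted average of the sub-band rates.
blended_vax_rate = (
    sum(b["population"] * b["vax_rate"] for b in sub_bands.values())
    / total_population
)

# Share of the bucket's deaths that fall in the older sub-band.
older_death_share = sub_bands["50_64"]["deaths"] / total_deaths

print(f"blended vaccination rate: {blended_vax_rate:.1%}")
print(f"share of deaths in the 50-64 band: {older_death_share:.1%}")
```

In this made-up case the bucket's headline vaccination rate sits well below the rate of the sub-band where nearly all of the deaths occur, so a state-level fit on the blended number says very little about the people actually at risk.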
Should we be eligible for CE credits after reading these posts? :-)