We Don’t Have Accurate and Reliable Data on How Effective the Covid Vaccines Actually Are

Koen Swinkels
Apr 4, 2022

At this point we have huge amounts of data from all over the world that are relevant to determining how effective the vaccines are against infection, hospitalization, death and transmission.

We do not, however, have any really good data.

Having really good data would mean knowing how many Covid infections, hospitalizations, deaths and transmissions occur in:

  • a group of vaccinated people
  • a similar group of unvaccinated people

in similar circumstances.

That is, the people in the two groups would have to be similar in terms of age, sex, comorbidities, health-related behavior and so on, and the circumstances would have to be similar in terms of virus incidence, living situations, government restrictions, healthcare facilities and so on.

If we had such numbers we could straightforwardly compare how often infection, severe disease and death occur in each group, and on that basis calculate vaccine effectiveness, expressed in percentages, using this formula:

vaccine effectiveness = (attack rate in unvaccinated group − attack rate in vaccinated group) ÷ (attack rate in unvaccinated group)
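As a minimal sketch of the formula above, the calculation could look as follows. The function name and the case counts are hypothetical and chosen only for illustration; they are not taken from any real study.

```python
def vaccine_effectiveness(cases_unvax, pop_unvax, cases_vax, pop_vax):
    """Return vaccine effectiveness as a percentage.

    Attack rate = cases / group size. The two groups are assumed to be
    similar in all relevant respects, which is exactly what is hard to
    achieve in practice.
    """
    attack_rate_unvax = cases_unvax / pop_unvax
    attack_rate_vax = cases_vax / pop_vax
    return 100 * (attack_rate_unvax - attack_rate_vax) / attack_rate_unvax


# Hypothetical numbers: 100 cases among 10,000 unvaccinated people,
# 20 cases among 10,000 vaccinated people.
ve = vaccine_effectiveness(100, 10_000, 20, 10_000)
print(f"Vaccine effectiveness: {ve:.1f}%")
```

The arithmetic is trivial; the hard part is obtaining numerators and denominators from two groups that really are comparable.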



But we don’t have such numbers. And the reason is that it is very difficult to find two groups that are similar in all relevant aspects except their vaccination status.

In general, establishing what the causal effects of pharmaceutical interventions are is fraught with difficulties. In the processes of producing, collecting, handling, interpreting, and presenting the data necessary to determine how effective the vaccines are, there are countless steps that introduce potential sources of bias, inaccuracy, or even downright deception.

The best way to avoid most of these problems is through well designed, properly conducted, double blind randomized controlled trials (RCTs).

There are many good reasons to be cautious about effectiveness data that do not come directly from such trials, especially in an atmosphere in which there are strong incentives and tremendous pressure to make the vaccines look as good as possible.

And sadly, we simply don’t have RCTs that provide high quality data to tell us how effective these Covid vaccines actually are against:

  • transmission
  • symptomatic infection in the medium and long-term
  • severe Covid disease
  • Covid mortality
  • all-cause mortality

The absence of medium and long-term RCTs, and of high quality active and passive adverse events surveillance systems, also means that we do not have good evidence about how safe these vaccines are, especially in the long-term.

The only RCT data that we do have, from the RCTs conducted by the pharmaceutical companies that developed the vaccines, provide us only with relatively short-term data on how effective the vaccines are at preventing symptomatic infection, which in the Pfizer trial, for example, was defined as a positive PCR test plus one or more symptoms.

The trials were not designed to demonstrate effectiveness against severe disease or death. The number of trial participants, especially the number of elderly people and people with severe comorbidities who are most vulnerable to severe disease and death, was simply much too small.

Moreover, the control groups were unblinded a few months into the trials because it was deemed unethical to deprive them of what then appeared to be a highly effective vaccine. Unblinding and vaccinating people in the control group meant that there could not be any meaningful medium and long-term safety and effectiveness data.

Lastly, there have been credible accusations, for example in the British Medical Journal, that the Pfizer trial was not actually properly conducted, nor reliably double-blinded even before the official unblinding took place. This would introduce potential bias even in the most basic data gathering points of the trial: For example, the determination whether something should count and be recorded as a symptom is to some extent a subjective judgement made by a trial investigator.

Most of the other evidence we have about vaccine effectiveness comes from:

  • observational studies
  • cases/hospitalizations/deaths numbers released by public health institutions
  • experimental data showing various types of immune responses
  • governments and media reporting on such data

All of these can be highly problematic in a variety of ways, although of course some non-RCT data are of higher quality than others, and some institutions and media are more capable or honest than others in how they interpret and represent those data.

Biases in the Processing and Presentation of Data

When comparing the number of cases, hospitalizations and deaths that occur in vaccinated and unvaccinated populations, scientists, public health institutions and media will ideally:

  • present results as daily rates relative to population size instead of as absolute numbers and/or totals over a longer period: When comparisons are made over an extended period in which at least some people are getting vaccinated, the comparison should involve calculating for each person how many days of that period they were in the vaccinated category and how many in the unvaccinated category. For the two denominators of the relative to population rate calculations all vaccinated person-days should be combined in one group and all unvaccinated person-days in another group. For the numerator all cases (or hospitalizations or deaths) that occurred in people who were vaccinated at the time of getting infected should be combined in one group, and all cases that occurred in unvaccinated people combined in the other.
  • use reliable numbers for total population sizes of the groups of vaccinated and unvaccinated people, and for the number of cases, hospitalizations or deaths that occurred in each group: Health insurance databases are ideal for this purpose as they offer complete and known population data sets that also have — or can be used in combination with other datasets that have — reliable data on the number of cases/hospitalizations/deaths in each group.
  • continually adjust the population sizes of the group of vaccinated and unvaccinated: The population of vaccinated and unvaccinated people constantly changes as a result not just of people getting newly vaccinated, but also because people die or move in or out of an area. The denominators in the calculations should be continually adjusted to account for these changes.
  • match people in one group to those people in the other group who are similar in terms of age, comorbidities and (absence of) confirmed prior infection: Vaccinated and unvaccinated populations may differ from each other in respects that are relevant for meaningful comparisons. If the elderly are overrepresented in the vaccinated populations, then this will bias the data against vaccination. To adjust for this a selection needs to be made so that the two groups being compared are and remain similar.
  • make adjustments for variability in incidence throughout a period: If, for example, in Canada the period under observation is January — August 2021 then there will be an overrepresentation of cases, hospitalizations and deaths among the unvaccinated because as the vaccination campaign progressed the percentage of unvaccinated people was 1) high during the winter and early spring period in which respiratory virus activity is typically high, 2) low in the summer period in which such virus activity is typically low.
  • not add people who are within two weeks of a dose to the group who has not yet had that dose: It can take up to two weeks before a dose will start to have a protective effect. This is why cases that occur in people within two weeks of e.g. their first dose are typically not included in the cases for the group of people who have been vaccinated with that first dose. Sometimes they are put in a category of their own (‘vaccinated but not yet protected’) and sometimes they are added to the cases in the ‘unvaccinated’ category. If they are added to the cases in the unvaccinated category and the population size of that group is not similarly adjusted, this will bias the data in favor of vaccine effectiveness. The distortive effect can be surprisingly large, as Professor Norman Fenton explains in this short clip (full version, research paper, accessible explanation).
  • account for the temporarily increased susceptibility to infection in the period directly following vaccination: The distortive effect just mentioned is significantly amplified as a result of the empirical fact that in those first two weeks after vaccination people are not just not yet protected but actually more likely to get infected (and hence to be hospitalized or die subsequently) than unvaccinated people are. So adding those people to the group of unvaccinated instead of to the vaccinated group or to a group of their own further increases the case/hospitalization/death rate in the unvaccinated group and decreases it in the vaccinated group.
  • account for the effect of reporting delays: As Norman Fenton explains in the clip above, failing to account for reporting delays can produce a similar distortive effect.
  • adjust for differences in testing willingness and testing requirements: Vaccinated and unvaccinated people may be subject to different testing requirements in society, or they may differ in their willingness or readiness to get tested. All else being equal, if one group gets tested more often than the other, there will be more cases in that group. Studies should take this into account.
  • account for differences in risk-seeking behavior between vaccinated and unvaccinated people: Getting vaccinated may embolden people to engage in riskier behavior than they did when they were still unvaccinated. Alternatively, unvaccinated people may be less concerned about the virus and on average engage in riskier behavior than vaccinated people. These differences need to be taken into account.
  • look not just at Covid hospitalizations and deaths but all-cause hospitalizations and deaths as well: If the vaccines reduce Covid hospitalizations and deaths but themselves cause adverse events that result in hospitalizations and deaths then this is important information when evaluating the effectiveness of the vaccines. Moreover, it is also at least a theoretical possibility that the vaccine reduces the likelihood of testing positive for Covid when experiencing Covid-like disease but not to the same extent the likelihood of Covid-like disease itself, or the hospitalizations or deaths that result from it. If all-cause hospitalization and death data are not taken into account, then the vaccines may appear more effective at preventing hospitalization or death than they in fact are.
  • correctly interpret differences in all-cause hospitalizations and deaths: When observational studies match people in the vaccinated group with people in the unvaccinated group who are relevantly similar with regard to age, sex, comorbidities and other factors, and there are significant differences not just in Covid hospitalization and death rates but in all-cause hospitalization and death rates as well, this likely does not indicate that the Covid vaccine protects against all-cause hospitalization and death. Instead it suggests there are relevant behavioral differences that independently explain 1) the willingness to get vaccinated, 2) Covid infection, hospitalization and death, 3) all-cause hospitalization and death. One such possible explanation is how responsibly people behave with regard to their own healthcare, such as taking a vaccine that is promoted as safe and effective, avoiding situations that are high-risk for SARS-CoV-2 transmission and seeking medical care if sick, and reliably taking the medication prescribed for an existing condition.
  • distinguish between hospitalizations and deaths ‘with Covid’ and ‘due to Covid’: If, for example, the vaccines are effective at preventing severe Covid disease but not Covid infection, then failing to distinguish between hospitalizations and deaths that were due to other causes but accompanied by a Covid infection on the one hand, and hospitalizations and deaths that were due to Covid on the other, will make the vaccines appear less effective against Covid hospitalization and death than they in fact are. Or if fully vaccinated people without Covid symptoms are not routinely tested upon hospital admission for non-Covid reasons while unvaccinated people are, then failing to distinguish between hospitalizations for Covid and hospitalizations with Covid will make the vaccines seem more effective at preventing hospitalization than they in fact are, because incidental hospitalizations of unvaccinated people will be counted but incidental hospitalizations of fully vaccinated people will not be.
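The person-days approach described in the first item of this list can be sketched roughly as follows. This is a simplified illustration, not a full method: the function names and the three example people are hypothetical, a person is counted as vaccinated from the day of the dose (the two-week window discussed above is ignored), and deaths or moves during the period are not modelled.

```python
from datetime import date


def split_person_days(vax_date, start, end):
    """Return (unvaccinated_days, vaccinated_days) for one person in [start, end]."""
    total = (end - start).days + 1
    if vax_date is None or vax_date > end:
        return total, 0
    vaccinated = (end - max(vax_date, start)).days + 1
    return total - vaccinated, vaccinated


def rates_per_100k_person_days(people, start, end):
    """people: iterable of (vax_date, case_date) tuples; either may be None."""
    days = {"unvaccinated": 0, "vaccinated": 0}
    cases = {"unvaccinated": 0, "vaccinated": 0}
    for vax_date, case_date in people:
        unvax_days, vax_days = split_person_days(vax_date, start, end)
        days["unvaccinated"] += unvax_days
        days["vaccinated"] += vax_days
        if case_date is not None and start <= case_date <= end:
            # A case is assigned to the status the person had on the case date.
            vaccinated_at_case = vax_date is not None and vax_date <= case_date
            cases["vaccinated" if vaccinated_at_case else "unvaccinated"] += 1
    return {g: 100_000 * cases[g] / days[g] for g in days if days[g] > 0}


# Hypothetical ten-day observation period with three people.
start, end = date(2021, 1, 1), date(2021, 1, 10)
people = [
    (None, date(2021, 1, 5)),              # never vaccinated, case on Jan 5
    (date(2021, 1, 6), date(2021, 1, 8)),  # vaccinated mid-period, case afterwards
    (date(2021, 1, 1), None),              # vaccinated throughout, no case
]
print(rates_per_100k_person_days(people, start, end))
```

The point of the design is that the second person contributes days to both denominators, instead of being assigned wholesale to one group.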

This is by no means an exhaustive list of the criteria that studies that compare results in vaccinated and unvaccinated people should meet. But it is enough to give an impression of just how difficult it is in the absence of well designed and properly conducted double blind RCTs to make such comparisons in meaningful and accurate ways.
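To make one of the items above concrete, here is a small arithmetic sketch of the misclassification effect: cases in recently vaccinated people are counted as ‘unvaccinated’ while the group sizes are left unchanged. All numbers are hypothetical, and the model is a deliberately crude one-period illustration, not the model used in the research referred to above. The imaginary vaccine has no effect at all, so the true effectiveness is 0%.

```python
def apparent_ve(n_unvax, n_recent, n_established, risk):
    """Apparent vaccine effectiveness (%) when a vaccine with zero effect
    is evaluated with misclassified cases.

    n_recent: people within two weeks of their dose.
    risk: infection risk over the period, identical for everyone.
    """
    cases_unvax = n_unvax * risk
    cases_recent = n_recent * risk
    cases_established = n_established * risk

    # Cases in recently vaccinated people are added to the unvaccinated
    # numerator, but the denominators are not adjusted to match.
    rate_unvax = (cases_unvax + cases_recent) / n_unvax
    rate_vax = cases_established / (n_recent + n_established)
    return 100 * (rate_unvax - rate_vax) / rate_unvax


# Hypothetical: 50,000 unvaccinated, 10,000 recently vaccinated,
# 40,000 vaccinated more than two weeks ago, 1% risk for everyone.
print(f"Apparent effectiveness: {apparent_ve(50_000, 10_000, 40_000, 0.01):.1f}%")
```

With these made-up inputs a vaccine that does nothing appears to be roughly one-third effective, purely as a bookkeeping artifact. Setting `n_recent` to zero brings the apparent effectiveness back to 0%.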

Biases in the Generation and Collection of Data

When it comes to vaccine effectiveness data biases creep in not just in how data are handled but in how data are generated and collected as well. For example, case rates may be influenced by differences in testing behaviors:

  • Demands from employers or family could mean that unvaccinated people are tested more frequently than vaccinated people.
  • Alternatively, unvaccinated people may be more opposed to the restrictions, and as a result less willing to get tested than vaccinated people.
  • Similarly, vaccinated people may be less likely to seek medical attention because they assume that their vaccine protects them. Or people who are more anxious may be both more likely to get vaccinated and more likely to seek medical attention if they experience symptoms than unvaccinated people are.

And when it comes to hospitalization and deaths data, hospitals and doctors may have different Covid testing and coding rules and practices for vaccinated and unvaccinated people:

  • Hospitals may routinely test unvaccinated people without Covid symptoms but not fully vaccinated people without symptoms, under the assumption that fully vaccinated people are much less likely to have and transmit Covid.
  • To the extent that in their reporting hospitals distinguish between hospitalizations with Covid and hospitalizations for Covid, doctors may be more inclined to think a vaccinated person’s condition is not due to the Covid infection but the result of other factors, simply because they assume the vaccine is protecting the patient against severe Covid disease.
  • Similarly, when patients present to the ER with Covid-like symptoms, hospitals may be more likely to admit them if they are unvaccinated than if they are vaccinated because they assume unvaccinated people are much more likely to become severely ill than vaccinated people are.
  • For a variety of reasons hospitals may sometimes code patients for whom upon admission vaccination status cannot be determined as ‘unvaccinated’ rather than ‘unknown’, let alone as ‘vaccinated’.
  • When doctors determine the cause of death of a patient they may be more likely to put ‘Covid’ as that cause if the person was unvaccinated than if they were vaccinated because they assume the vaccine protected against severe Covid disease.

These are just some of the ways in which slight biases and differential treatments can distort the raw data that will be used by researchers to compare differences in case, hospitalization and death rates between vaccinated and unvaccinated people.

By no means are all these factors present everywhere and all of the time. Some hospitals and doctors will have stricter rules and practices in place that reduce the effect such biases could have than others do.

These decisions are also typically made on the margin: If a fully vaccinated person is hospitalized with obvious signs of severe Covid disease then doctors will be as unlikely to disregard the patient’s positive Covid test as they would in the case of a similar Covid patient who is unvaccinated. But in less clear-cut, more ambiguous situations doctors might make different reporting and treatment decisions for unvaccinated and vaccinated people.

Such different decisions do not imply that the doctors and hospitals intend to bias the results to make vaccination seem more effective. Many of the examples mentioned involve reasonable decisions, based on reasonable assumptions, that healthcare professionals have to make in situations in which time, resources and information are scarce. But the net effect may be as distortive as if there had been an intention to deceive.

It is, however, also not completely out of the realm of possibility that some doctors may subconsciously or consciously provide their vaccinated patients with better treatment than their unvaccinated patients, as a result of their personal antipathy or their sense that scarce resources are better spent on people who ‘did their part’. An example is what this influential doctor wrote (and apparently felt comfortable enough to write) in a discussion of how Ontario hospitals should use the scarce resource of lifesaving monoclonal antibodies.

Such examples, one would hope, are rare.

Biases Add Up

The individual net effect of each or most of the biases just mentioned will be small but when several such biases are at play they can add up. This article, for example, discusses several ways in which murky record keeping and distorted ways of selecting and presenting data resulted in significantly distorted conclusions about vaccine effectiveness.

Moreover, such biases may also cause a self-fulfilling prophecy effect. For example, the assumption that vaccines are highly effective at preventing severe disease can result in several biases that in turn make the vaccines seem more effective at preventing severe disease. Over time the realization that the vaccines are not as effective as previously thought may also result in changes in the protocols and practices that then in turn result in data that further confirm this. It is possible that part of the appearance of waning vaccine effectiveness in the data is in fact the result of protocols and practices being changed in response to indications that the vaccines are less effective than previously thought.

What the total effect of all these potential biases is — how it will all shake out — is impossible to determine. The best thing to do is simply avoid them. And the best way to do that is through well designed, properly conducted and double blind RCTs. Blinding the participants, for example, removes many of the potential biases that could result from vaccination causing changes in behavioral patterns such as risk-seeking or willingness to get tested. Making the distribution of the real vaccines random means that different personality types won’t be overrepresented or underrepresented among either the vaccinated or unvaccinated, which would have resulted in different tendencies to get tested or seek medical attention. And double blinding removes the biases that might lead doctors and hospitals to make different reporting, coding and treatment decisions for vaccinated and unvaccinated people.

Moreover, if a trial is well designed and properly conducted it will for example require each participant to get tested regularly and on a fixed schedule, not just if a patient experiences symptoms.

Alas, as mentioned before, we simply don’t have vaccine efficacy data from well designed, properly conducted and double blind RCTs other than data from the (possibly not quite so well conducted and possibly not quite reliably double blinded) RCTs conducted by the pharmaceutical companies, or rather, by the companies they hired to do that for them. And these only give us data about short-term efficacy against symptomatic infection.

So we don’t have pristine data devoid of distortions that are caused by biases in the data collection. But we can in principle still avoid many of the above-mentioned pitfalls when it comes to the handling, representation and interpretation of the data. In practice, however, very few of the reports published in the media and/or by public health institutions take care to avoid these pitfalls.

This CBC report, for example, reaches spectacular conclusions about vaccine effectiveness, but while its methodology avoids some of the pitfalls mentioned above it fails to avoid several others, which I will leave as an exercise for the readers to identify in the article.

Even academic papers will typically fail in one or more respects. And even the ones that are careful to avoid as many pitfalls as possible will typically still have to rely on imprecise estimates for some of the numbers they need to do the calculations used for their comparisons.

For example, in populations with very high vaccination rates it is very difficult to accurately estimate the population of unvaccinated people. Typically that population is estimated by taking the total number of people in that population and subtracting the number of vaccinated people from it. Because public health institutions keep track of the number of vaccinations in that area the number of vaccinated people can typically be reliably estimated, although people moving in and out of the area complicates things somewhat. But to estimate the number of unvaccinated people you need reliable estimates of the total number of people in an area. And if vaccination rates are very high, then relatively small differences in these estimates of the total population will cause relatively large differences in estimates of the unvaccinated population, and hence in the rates of infections, hospitalizations and deaths in the unvaccinated population.

To give an extreme example, in Ontario the vaccination rate of people 80 years and older is listed as 99.99%. The total population of that age group is 655,835. That implies there are only 66 unvaccinated people 80+ in the province. But suppose the total population in that age group is underestimated by a mere 1,000, or 0.15%. That means the unvaccinated population would be 1,066 instead of 66, or 16 times as high. So a small difference in the estimate of the total population means a huge difference in the unvaccinated population, and hence in the rate of infections, hospitalizations and deaths in that age group.
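The arithmetic of this example can be checked with a short sketch. The population figure and vaccination rate are the ones quoted above; the number of events (10 hospitalizations) and the function names are hypothetical and only there to show how strongly the resulting rate reacts to the denominator.

```python
def unvaccinated_estimate(total_population, vaccinated):
    """Unvaccinated population estimated as total minus vaccinated."""
    return total_population - vaccinated


def rate_per_100k(events, population):
    return 100_000 * events / population


listed_total = 655_835                      # listed 80+ population in Ontario
vaccinated = round(listed_total * 0.9999)   # listed vaccination rate of 99.99%
events = 10                                 # hypothetical number of hospitalizations

for label, total in [("listed total", listed_total),
                     ("listed total + 1,000", listed_total + 1_000)]:
    unvax = unvaccinated_estimate(total, vaccinated)
    print(f"{label}: {unvax} unvaccinated, "
          f"{rate_per_100k(events, unvax):,.0f} events per 100,000")
```

A 0.15% change in the total population estimate changes the unvaccinated denominator, and therefore the computed rate, by a factor of about 16.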

No Excuse

In conclusion, with all the data purporting to tell us something about how effective the vaccines are, we should keep in mind that those data are very much imperfect and should be treated with considerable caution. Even more so when the people involved in the generation, collection, processing and presentation of the data believe strongly in the effectiveness of the vaccines, since that belief can unintentionally introduce biases that distort the data. Let alone when they may have incentives to intentionally distort and misrepresent.

Even absent deliberate misrepresentation, in the current atmosphere reports that are relatively careless in adjusting for biases that result in a more favorable view of the vaccines will tend to face considerably less scrutiny than reports with a less favorable view.

The solution for all this lies in well designed, properly conducted and double blind randomized controlled trials. These are expensive. But governments have spent enormous amounts of money on their pandemic response in general and the vaccination program in particular. There is no possible justification for not spending a tiny fraction of that money on the only method that generates reliable and accurate data to inform this response.