Wednesday, June 30, 2021

Why do differences in clinical trial design make it hard to compare COVID-19 vaccines?

By Lisa Larrimore Ouellette, Nicholson Price, Rachel Sachs, and Jacob S. Sherkow

The number of COVID-19 vaccines is growing, with 18 vaccines in use around the world and many others in development. The global vaccination campaign is slowly progressing, with over 3 billion doses administered, although low-income countries have received only 0.3% of those doses. But because the vaccines were tested in clinical trials with different designs, making apples-to-apples comparisons is difficult—even just for the 3 vaccines authorized by the FDA for use in the United States. In this post, we explore the open questions that remain because of these differences in clinical trial design, the FDA’s authority to help standardize clinical trials, and the lessons that can be learned for vaccine clinical trials going forward.

What were the key differences in how clinical trials were run for COVID-19 vaccines?

Clinical trials for COVID-19 vaccines differed along a surprising number of dimensions based on manufacturer choices: the number of doses, the spacing between multiple doses, the amount of vaccine per dose, the patients studied, and the endpoints tested. Because the trials were conducted at different places and at different times, the prevalence of COVID-19 variants also differed. These differences matter because clinical trials are the best source of rigorous information about vaccine efficacy, and differences in the way those trials were conducted limit the ability to compare the vaccines meaningfully.

Doses administered. Vaccines are given in one or two doses. The J&J vaccine was studied and is administered in one dose (indeed, many people sought out J&J for this reason), but J&J is now studying the effect of a booster shot. When the Pfizer-BioNTech and Moderna vaccines were initially tested, a first dose was followed by a relatively weak immune reaction, and a second dose triggered a strong reaction. The Oxford-AstraZeneca and Novavax vaccines showed the same pattern. Nevertheless, rigorous evidence about the performance of two-dose vaccines after only a single dose is lacking, because that scenario was not tested in the pivotal clinical trials. 

Dose spacing. Two-dose vaccines were administered at significantly different intervals. Moderna was tested at a four-week interval, Pfizer-BioNTech and Novavax at three weeks. Oxford-AstraZeneca was the only prominent vaccine whose trials included different spacing between doses (for a small number of patients), testing four- to twelve-week intervals. Those trials found that longer inter-dose gaps provoked a greater immune response, which was used as support for the UK’s controversial Dec. 30 decision to expand the gap between doses—including for the Pfizer-BioNTech vaccine, where changing the dose spacing hadn’t been tested. A later study found that an increased gap for Pfizer-BioNTech boosted immune response in the elderly. The United States saw similar proposals to expand the gap between doses to get more people vaccinated (with at least one dose) faster, but the real-world effects remain unknown because, as the CDC has emphasized, a longer gap was not tested.

Vaccine per dose. Vaccines differed substantially in the amount of vaccine per dose in clinical trials—and consequently in the amount of vaccine given to patients. For instance, Moderna’s vaccine was tested with 100 micrograms of mRNA per dose, while Pfizer-BioNTech’s was tested with 30 micrograms per dose. Would Moderna be effective with lower doses? In January the FDA said changes to dosing or schedule were “premature and not rooted solidly in the available evidence.” In February, Moderna published results from a Phase 2 trial showing that half doses of 50 micrograms were as good as full doses at generating a strong immune response, but experts cautioned against extrapolating immunogenicity data to make conclusions about real-world performance.

Study populations. Vaccines were studied in different populations, based in part on recruitment efforts and in part on trials being conducted in different countries. Globally, for instance, the J&J patient population was 45% Hispanic and/or Latinx, Pfizer-BioNTech’s was 26%, and Moderna’s was 20%. Given disparities in COVID-19’s impact on different communities, disparities in vaccine access, and a history of biased clinical trial populations, balanced vaccine demographics are particularly important. Population age also differed, though less starkly (even setting aside pediatric trials); 25% of Moderna’s patients were 65 or older, but only 21% of Pfizer-BioNTech’s. (Notably, even knowing whether the comparisons are apples to apples is nontrivial: Moderna reported ages as 18-65 versus older, Pfizer-BioNTech used brackets of 16-18, 16-55, 55+, 65+, and 75+, and J&J reported patients as under or over 60.)

Endpoints. Manufacturers also chose different endpoints for their clinical trials: what counted as a case, and when counting began. Pfizer-BioNTech measured efficacy against any symptomatic infection beginning seven days after the second vaccine dose. Moderna also measured any symptomatic infection, but not until two weeks after the second dose. And J&J measured cases both at two and four weeks after its single dose—but counted only cases of moderate-to-severe COVID-19 (confirmed by a positive test). These differences make it difficult to compare even topline results.
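To see why these definitional choices matter, recall how a topline efficacy number is calculated: efficacy is one minus the ratio of the attack rate among vaccinated participants to the attack rate among placebo recipients, where the attack rate counts only cases meeting the trial’s endpoint definition. The sketch below (in Python, using entirely hypothetical case counts that do not come from any actual trial) illustrates how the same formula yields topline numbers that are not directly comparable when the endpoint definitions differ.

```python
# A minimal sketch of how the endpoint definition shapes a topline efficacy
# number. All counts are hypothetical, chosen only for illustration; they are
# not data from any actual COVID-19 vaccine trial.

def vaccine_efficacy(cases_vax: int, n_vax: int,
                     cases_placebo: int, n_placebo: int) -> float:
    """Efficacy = 1 - (attack rate among vaccinated / attack rate among placebo)."""
    return 1 - (cases_vax / n_vax) / (cases_placebo / n_placebo)

N = 20_000  # hypothetical participants per arm

# Endpoint A: any symptomatic, lab-confirmed case counted from 7 days after
# the final dose (hypothetical counts: 10 vaccinated cases, 100 placebo cases).
print(f"Any symptomatic case:    {vaccine_efficacy(10, N, 100, N):.0%}")

# Endpoint B: only moderate-to-severe cases, counted from 14 days after the
# final dose. Fewer events qualify in both arms (hypothetical counts), and the
# resulting point estimate answers a different question than the one above.
print(f"Moderate-to-severe only: {vaccine_efficacy(3, N, 40, N):.0%}")
```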

Variants. Finally, clinical trials occurred at different times and in different countries—which means that the prevalence of viral variants also differed. Pfizer-BioNTech’s and Moderna’s vaccines were tested before variants of concern were widely circulating, making it harder to know how effective they are against variants. J&J’s vaccine, on the other hand, was tested in South Africa when the Beta variant was spreading there, and it showed lower efficacy in preventing infection in South Africa (57%) than in the United States (72%), though still strong protection (85%) against severe illness. Three months ago we described the existing data on vaccines and variants, including the lack of clinical trial evidence for most vaccine/variant combinations. Studies from England and Scotland suggest Pfizer and AstraZeneca vaccines offer somewhat reduced protection against infection by the highly transmissible Delta variant, but similar protection against severe illness. But these studies are not randomized controlled clinical trials.

What legal authority does the FDA have to help standardize clinical trials?

As we discussed in our last post, the FDA’s decision whether to grant either emergency use authorization (EUA) or full approval under a Biologics License Application (BLA) is based on standards specified by statute. But the FDA has also published a range of regulations and guidance documents governing the clinical trial process. These requirements do compel some standardization throughout the drug development timeline, though some are more procedural than substantive (such as the submission of clinical data in standard formats).

More substantively, even though the FDA has been pushing clinical trials “in the direction of standardization” since before World War II, many important scientific aspects of the clinical trials process remain up to sponsors’ discretion. The pharmaceutical companies themselves choose, in their best scientific judgment, the right dosage and route of administration for their new products. In the case of the COVID-19 vaccines, the FDA was unlikely to second-guess the manufacturers’ choice of one-dose versus two-dose vaccines, the spacing between doses, or the amount of vaccine in each dose.

From the perspective of generating comparative clinical effectiveness information, these decisions (about dosage and spacing) may be less important than decisions about clinical trial design, including enrollment and the selection of appropriate endpoints. The FDA did release helpful guidance in June 2020 to explain what it would be looking for in the development of COVID-19 vaccines. The FDA “strongly encourage[d]” sponsors to enroll “populations most affected by COVID-19, specifically racial and ethnic minorities.” Although the agency did not require sponsors to meet certain benchmarks for the diversity of their trials, Moderna did briefly slow enrollment in its trial, after initially enrolling fewer people of color than anticipated, to ensure that its trial population was more representative of the U.S. public.

The FDA also specified important features of the clinical trial design, including appropriate endpoints. Noting that “[s]tandardization of efficacy endpoints across clinical trials may facilitate comparative evaluation of vaccines,” the FDA recommended that “either the primary endpoint or a secondary endpoint… be defined as virologically confirmed SARS-CoV-2 infection” with one or more specified symptoms, and that sponsors should also evaluate severe COVID-19, defined as confirmed infection plus particular indicators of severity. Despite the FDA’s efforts to standardize these endpoints, though, the manufacturers chose—and were approved to use—endpoints which cannot be compared in a straightforward way.

Another feature of the COVID-19 pandemic may have given the government as a whole more visibility into and oversight of the clinical trials process here: Operation Warp Speed. Although the FDA does work with sponsors throughout the development process, the development of COVID-19 vaccines featured a particularly high level of government funding, involvement, and coordination of clinical trials for companies opting into that process (as Moderna and other companies did). In theory, Warp Speed could have used that funding to attempt to exert a greater degree of standardization over these clinical trials.

What lessons can be learned for vaccine clinical trial design going forward?

Greater standardization of clinical trial design could have made the existing COVID-19 vaccines more comparable. But standardization also has costs. Compelling the standardization of trials on less than ideal or ambiguous endpoints, for example, would allow easier comparisons at the cost of less informative studies. Standardizing treatment protocols—say, the number of days between shots for two-dose vaccines—would yield some comparative insights but would diminish opportunities for optimization, that is, for the variations that yield information about how to improve the vaccine candidates under trial. For the COVID-19 vaccines, we still don’t know the optimal number of weeks between doses, and standardizing a given time period from the outset would have done little to elucidate that. The same is true for the optimal amount of vaccine per dose, whether 100 or 30 micrograms of mRNA, or something greater, smaller, or in between.

At the same time, compelling each manufacturer to test these variations would have come at the cost of losing statistical power and increasing the time before vaccines could be authorized—a tragedy in a pandemic where more than 10,000 lives are lost each day. Vaccine developers were able to create multiple safe and effective vaccines on an unprecedented timeline in no small part because of the simple trial designs. So, while the lack of standardization is easy to critique, the triumph of studying these vaccines under immense time pressure should be lauded.
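The statistical-power point can be made concrete with a back-of-the-envelope simulation. The sketch below (in Python, using entirely hypothetical enrollment figures, attack rates, and true efficacy, plus a simplified success criterion loosely inspired by the FDA’s guidance that the lower bound of the confidence interval around an efficacy estimate exceed 30%) illustrates how splitting a fixed enrollment across additional dosing arms lowers the chance that any single comparison reaches a clear answer.

```python
# A rough, purely illustrative simulation of the power cost of splitting a
# fixed enrollment across extra dosing arms. Every number here is hypothetical
# and chosen only to make the tradeoff visible; it does not model any actual trial.
import math
import random

def trial_succeeds(n_vax: int, n_placebo: int,
                   true_ve: float, attack_rate: float) -> bool:
    """Simulate one vaccine-vs-placebo comparison and apply a crude success
    criterion: the approximate lower 95% confidence bound on efficacy > 30%."""
    cases_placebo = sum(random.random() < attack_rate for _ in range(n_placebo))
    cases_vax = sum(random.random() < attack_rate * (1 - true_ve) for _ in range(n_vax))
    if cases_placebo == 0 or cases_vax == 0:
        return cases_placebo > 0  # degenerate draws; good enough for a sketch
    rr = (cases_vax / n_vax) / (cases_placebo / n_placebo)
    se = math.sqrt(1 / cases_vax + 1 / cases_placebo)  # rough SE of log relative risk
    ve_lower_bound = 1 - math.exp(math.log(rr) + 1.96 * se)
    return ve_lower_bound > 0.30

def power(n_vax: int, n_placebo: int, sims: int = 2_000) -> float:
    """Estimated probability that a single comparison meets the criterion."""
    return sum(trial_succeeds(n_vax, n_placebo, true_ve=0.6, attack_rate=0.004)
               for _ in range(sims)) / sims

random.seed(0)
# One dosing arm vs. placebo: 30,000 total participants, 15,000 per arm.
print("Single dosing arm: ", power(15_000, 15_000))
# The same 30,000 split across three dosing arms plus placebo: 7,500 per arm.
print("Three dosing arms: ", power(7_500, 7_500))
```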

That doesn’t mean there are no future opportunities to improve. Additional trials will be needed to resolve a variety of open scientific questions about the vaccines and the virus, such as whether boosters are needed to combat variants, as well as the safety and effectiveness of the vaccines in pediatric populations. A recent Perspective in Nature calls for a similar strategy of designing post-licensure clinical trials “addressing vaccine effectiveness, including the level of protection of both vaccinated and non-vaccinated individuals in entire targeted populations.” Making the most of these studies—and making the results comparable across vaccines—will require some effort at standardization, balanced with the need to deliver more vaccine to low-income countries without being exploitative.

While the FDA and other regulators shouldn’t necessarily micromanage trials, especially for novel technology, agencies could learn from the successes of a large-scale, umbrella clinical trial: RECOVERY. The initial vaccine trials did not have an adaptive, multi-arm design, but now that several vaccines are available, future trials might benefit from such a design. The WHO’s 2018 plan for designing vaccine trials during a public health emergency noted the advantages of trials with “multiple vaccine candidates and a control comparator arm” and “adaptive strategies to drop poorly performing candidates.” Going forward, especially with testing in pediatric populations, this could include different dosing options for a single vaccine.
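To make the “drop poorly performing candidates” idea concrete, here is a minimal sketch of an interim-analysis loop, assuming hypothetical candidate arms, attack rates, enrollment, and a crude futility rule; a real adaptive protocol would pre-specify statistically valid stopping boundaries rather than the simple threshold used here.

```python
# A minimal sketch of an adaptive multi-arm design that drops poorly
# performing arms at interim analyses. Arm names, true efficacies, attack
# rates, and the futility rule are all hypothetical.
import random

ARMS = {"candidate_A": 0.70, "candidate_B": 0.20, "candidate_C": 0.55}  # hypothetical true efficacies
CONTROL_ATTACK_RATE = 0.01    # hypothetical attack rate in the shared control arm
PATIENTS_PER_INTERIM = 2_000  # hypothetical new enrollees per arm between looks
FUTILITY_THRESHOLD = 0.30     # drop an arm whose observed efficacy falls below this

random.seed(0)
active = dict(ARMS)
for interim in range(1, 4):   # three interim analyses
    control_cases = sum(random.random() < CONTROL_ATTACK_RATE
                        for _ in range(PATIENTS_PER_INTERIM))
    for arm, true_ve in list(active.items()):
        arm_cases = sum(random.random() < CONTROL_ATTACK_RATE * (1 - true_ve)
                        for _ in range(PATIENTS_PER_INTERIM))
        observed_ve = 1 - arm_cases / max(control_cases, 1)
        if observed_ve < FUTILITY_THRESHOLD:
            del active[arm]  # stop randomizing new participants to this arm
            print(f"Interim {interim}: dropped {arm} (observed efficacy {observed_ve:.0%})")

print("Arms continuing to the final analysis:", sorted(active))
```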

While these lessons are directed to the COVID-19 vaccines, they should be broadly generalizable, including to non-vaccine interventions. The lack of comparative effectiveness data for the COVID-19 vaccines exists precisely because these problems arise more generally. Thinking through solutions now should help prepare us for the next pandemic. As the saying goes, researchers investigating clinical trial design shouldn’t let a serious crisis go to waste.

This post is part of a series on COVID-19 innovation law and policy. Author order is rotated with each post.