Research News

COVID radar: Genetic sequencing can help predict severity of next variant

Computer model helps researchers better prepare for the next virus variant

As public health officials around the world contend with the latest variant in the COVID-19 pandemic, U.S. National Science Foundation-supported researchers at Drexel University have created a computer model that could help better prepare for the next one.

Using machine-learning algorithms trained to identify correlations between changes in the genetic sequence of the COVID-19 virus and upticks in transmission, hospitalizations and deaths, the model can provide an early warning about the severity of new variants.

“The ability to predict not just mutations but their impact is critically important for the appropriate public health response,” said Sylvia Spengler, a program director in NSF’s Directorate for Computer and Information Science and Engineering.

More than two years into the pandemic, scientists and public health officials are doing their best to predict which mutations of the SARS-CoV-2 virus are likely to make it more transmissible, better able to evade the immune system and more likely to cause severe infections. But collecting and analyzing the genetic data needed to identify new variants, and linking those variants to the patients they have sickened, is still an arduous process.

Because of this, most public health projections about new "variants of concern" — as the World Health Organization categorizes them — are based on surveillance testing and observation of the regions where they are already spreading.

"The speed with which new variants like omicron have made their way around the globe means that by the time public health officials have a good handle on how vulnerable their populations might be, the virus has already arrived," said Bahrad Sokhansanj, who led development of the computer model. "We're trying to give them an early warning system — like advanced weather modeling for meteorologists — so they can quickly predict how dangerous a new variant is likely to be — and prepare accordingly."

The model, published in the journal Computers in Biology and Medicine, is driven by an analysis of the genetic sequence of the virus's spike protein, combined with a mixed-effects machine-learning analysis of factors such as the age, sex and geographic location of COVID patients. The spike protein is the part of the virus that allows it to evade the immune system and infect healthy cells, and it is also the part known to have mutated most frequently throughout the pandemic.
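
To make the idea concrete, here is a minimal sketch of how spike-protein mutations and patient factors might be combined into a single severity estimate. It is not the Drexel team's model. It swaps in a plain logistic regression, uses one indicator per region as a crude stand-in for the group-level terms a true mixed-effects model would estimate, and runs on a handful of invented records. Every name in it (`REFERENCE`, `RECORDS`, `spike_substitutions`, `encode`, `fit_logistic`, `predict`) is hypothetical.

```python
import math

# Hypothetical toy data, invented purely for illustration. None of it is real.
REFERENCE = "MFVFLVLL"  # short made-up stand-in for an aligned spike sequence
RECORDS = [
    {"spike": "MFVFLVLL", "age": 30, "sex": "F", "region": "north", "severe": 0},
    {"spike": "MFVFLVLL", "age": 70, "sex": "M", "region": "south", "severe": 0},
    {"spike": "MFVFLVLL", "age": 50, "sex": "M", "region": "north", "severe": 0},
    {"spike": "MFAFLVLL", "age": 35, "sex": "M", "region": "north", "severe": 1},
    {"spike": "MFAFLVLL", "age": 65, "sex": "F", "region": "south", "severe": 1},
    {"spike": "MFAFLVLL", "age": 50, "sex": "F", "region": "south", "severe": 1},
]
REGIONS = ["north", "south"]
VOCABULARY = ["V3A"]  # substitutions the sketch tracks as features


def spike_substitutions(reference, variant):
    """Return substitutions such as 'V3A': old residue, 1-based position, new residue."""
    if len(reference) != len(variant):
        raise ValueError("this sketch assumes pre-aligned, equal-length sequences")
    return {
        f"{old}{pos}{new}"
        for pos, (old, new) in enumerate(zip(reference, variant), start=1)
        if old != new
    }


def encode(record, vocabulary, regions):
    """Turn one patient record into a numeric feature vector."""
    subs = spike_substitutions(REFERENCE, record["spike"])
    features = [1.0 if s in subs else 0.0 for s in vocabulary]  # mutation flags
    features.append(record["age"] / 100.0)                      # scaled age
    features.append(1.0 if record["sex"] == "F" else 0.0)       # sex indicator
    # One indicator per region acts as a per-region intercept.
    features += [1.0 if record["region"] == g else 0.0 for g in regions]
    return features


def predict(weights, features):
    """Logistic estimate of the probability of a severe outcome."""
    z = sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(X, y, learning_rate=0.5, epochs=500):
    """Fit logistic-regression weights with batch gradient descent."""
    weights = [0.0] * len(X[0])
    for _ in range(epochs):
        gradient = [0.0] * len(weights)
        for features, target in zip(X, y):
            error = predict(weights, features) - target
            for j, value in enumerate(features):
                gradient[j] += error * value
        weights = [w - learning_rate * g / len(X) for w, g in zip(weights, gradient)]
    return weights


X = [encode(r, VOCABULARY, REGIONS) for r in RECORDS]
y = [r["severe"] for r in RECORDS]
weights = fit_logistic(X, y)

# Score a hypothetical new case that carries the tracked substitution.
new_case = {"spike": "MFAFLVLL", "age": 50, "sex": "M", "region": "north"}
print("estimated risk:", round(predict(weights, encode(new_case, VOCABULARY, REGIONS)), 3))
```

In this toy data the substitution perfectly separates severe from mild cases, so the fitted weight on it is positive by construction. A real analysis would need far more sequences and patients, proper sequence alignment, and a model that accounts for prior infection, as the researchers describe.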

"We show that future omicron subvariants are likelier to cause more severe disease," Sokhansanj said. "Of course, in the real world, that increased disease severity will be mitigated by prior infection by the previous omicron variants; that factor is also reflected in the modeling."