Analysis of flu full genomes with linked patient data

A collaboration I’m involved in, between the NHS West of Scotland Specialist Virology Centre (WoSSVC) and the MRC-University of Glasgow Centre for Virus Research (CVR), has been working on an analysis of seasonal influenza A(H3N2) full genomes. We’ve put a manuscript, “Integrating patient and whole genome sequencing data to provide insights into the epidemiology of seasonal influenza A(H3N2) viruses”, up on bioRxiv.

This work, led by Emily Goldstein and Rory Gunson (WoSSVC), examines the potential benefits of whole genome sequencing (WGS) for surveillance of human influenza. Genetic surveillance of seasonal flu viruses currently remains focused on the haemagglutinin (HA) gene, which is understandable given the importance of HA mutations that reduce vaccine effectiveness.

Here, samples with linked patient data, a portion of whose HA gene had previously been Sanger sequenced at WoSSVC, were re-sequenced at the CVR using next-generation sequencing technologies. WGS plays a relatively small role in the surveillance of seasonal flu viruses, so the evolutionary dynamics of the seven non-HA gene segments are less well understood. The role of genetic reassortment within influenza A subtypes (i.e. intra-subtype reassortment) is also under-explored, although it has been shown, for example, to temporarily increase the rate of adaptive amino acid replacements in A(H3N2).

Full genome analysis revealed a number of viruses derived from reassortment events between A(H3N2) genetic groups (as defined by HA sequence) circulating during the 2014/15 flu season. These are shown as triangles in the full genome tree below, alongside a schematic representation of their mixed evolutionary history.


Interestingly, we found a significant association between the most serious flu cases sampled (classified as “Severe acute respiratory illness” according to HPS guidelines) and infection by these reassortant viruses (odds ratio = 4.4; 95% CI: 1.3-15.5). It is possible that transient fluctuations in replication rates or virulence levels could occur in viruses derived from reassortment events due to disruptions to inter-gene coadaptations.
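For readers less familiar with how an odds ratio and its confidence interval come out of a 2x2 table, here is a minimal sketch in Python. It uses the standard Wald (log odds ratio) interval, which is an assumption on my part and not necessarily the method used in the manuscript. The function name and the counts are purely illustrative and are not the study data.

```python
import math


def odds_ratio_wald_ci(a, b, c, d, z=1.96):
    """Odds ratio and Wald confidence interval for a 2x2 table.

    a: severe cases infected by a reassortant virus
    b: non-severe cases infected by a reassortant virus
    c: severe cases infected by a non-reassortant virus
    d: non-severe cases infected by a non-reassortant virus
    z: normal quantile (1.96 gives an approximate 95% interval)
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive for the Wald interval")
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) is the square root of the summed reciprocal counts.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts, made up for illustration only (NOT the study data).
odds_ratio, lower, upper = odds_ratio_wald_ci(a=8, b=10, c=12, d=60)
print(f"OR = {odds_ratio:.1f}, 95% CI: {lower:.1f}-{upper:.1f}")
```

An interval that excludes 1, as the one reported above does, is what makes the association statistically significant at the 5% level. With small cell counts, an exact method such as Fisher's exact test may be a better choice than the Wald approximation.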

An alternative explanation is that severe cases are more likely to be sampled later in the flu season (see graph below), because milder respiratory illness may be less likely to be tested further away from winter, and that we detected an association because reassortant viruses also tended to be sampled later in the season. These alternative hypotheses could be resolved with increased WGS, particularly later in the season. This would also allow us to determine whether the observed tendency of reassortant viruses to be sampled later in the season is reproducible in different winters.
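To make the sampling-time explanation concrete, here is a small sketch of how an association can shrink once cases are stratified by time of season. It uses the Mantel-Haenszel pooled odds ratio, a standard textbook approach to stratified 2x2 tables; this is my illustration and not an analysis from the manuscript. The early/late split, the function names and all counts are hypothetical.

```python
def crude_odds_ratio(strata):
    """Odds ratio after collapsing all strata into a single 2x2 table."""
    a = sum(s[0] for s in strata)
    b = sum(s[1] for s in strata)
    c = sum(s[2] for s in strata)
    d = sum(s[3] for s in strata)
    return (a * d) / (b * c)


def mantel_haenszel_or(strata):
    """Mantel-Haenszel pooled odds ratio across strata.

    Each stratum is a tuple (a, b, c, d):
    a: severe & reassortant,     b: non-severe & reassortant,
    c: severe & non-reassortant, d: non-severe & non-reassortant.
    """
    numerator = 0.0
    denominator = 0.0
    for a, b, c, d in strata:
        n = a + b + c + d
        if n == 0:
            continue  # an empty stratum carries no information
        numerator += a * d / n
        denominator += b * c / n
    if denominator == 0:
        raise ValueError("pooled odds ratio is undefined for these strata")
    return numerator / denominator


# Hypothetical counts, made up for illustration only (NOT the study data).
early_season = (1, 4, 5, 50)  # few severe cases, few reassortants
late_season = (6, 6, 6, 6)    # more of both, but no association within stratum
strata = [early_season, late_season]

crude = crude_odds_ratio(strata)
adjusted = mantel_haenszel_or(strata)
print(f"crude OR = {crude:.2f}, time-adjusted OR = {adjusted:.2f}")
```

In this made-up example the crude odds ratio is well above 1 while the time-adjusted one is close to 1, which is the pattern we would expect if sampling time alone were driving the association. With real data, more genomes from late in the season would be needed for the late stratum to be informative.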


Genomes sampled per week during the 2014/15 flu season. Prevalence greatest in Dec/Jan. Severe cases (in red) tend to occur later in season.

In general, I hope for an increased role for WGS in seasonal flu surveillance that could help us to further investigate the patterns we detect. This would help us develop a better understanding of the epidemiology of seasonal influenza viruses and the contributions of non-HA gene segments and intra-subtype reassortment to both disease severity and viral fitness. Adding information on vaccination history to the data we had here could be particularly powerful.