Flu virus: egg-adaptation, dN/dS ratios and phylogeny

In recent years, increasing attention has been paid to the consequences of egg-adaptation of influenza viruses. The flu vaccine is regularly updated so that it remains a good antigenic match to circulating viruses. The H3N2 component of the vaccine has recently been less effective than expected, despite the vaccine strain being well-matched to circulating strains. This is partly due to changes that occur when the vaccine virus is grown in chicken eggs.

Circulating H3N2 viruses have a recently acquired glycosylation site in antigenic site B that binds a bulky sugar molecule which shields this antigenic site. However, when the H3N2 vaccine component is grown in eggs, the virus acquires a T160K mutation in the haemagglutinin (HA) surface glycoprotein that eliminates this glycosylation site. Without the bulky sugar attached, the immune response to the egg-adapted vaccine includes antibodies targeting antigenic site B. The problem is that these vaccine-induced antibodies struggle to bind site B of circulating viruses, where the glycosylation site is present and the bulky sugar gets in the way, limiting protection.

In addition to their use in vaccine production, eggs are sometimes used for the passage of clinical isolates prior to sequencing or antigenic characterisation in haemagglutination inhibition (HI) or similar assays. In previous work on H1N1, we detected a small but significant effect of the mutation D187N in HI assays – substitutions at 187 including D187N are well characterised as egg-adaptations in H1N1.

With substitutions at 187 assumed to be an artefact of egg-adaptation with the potential to distort tree inference, we dropped this codon from phylogenetic analysis. I also noticed a strong signature of positive selection at this codon. Others have found that substitutions associated with adaptation to passage in cell culture produce spurious signals of positive selection in analyses of H3N2 HA sequences.

I decided to have another look at the impact of egg-adaptation on signatures of selection in the HA gene and the impact of substitutions at position 187 on tree inference, using a sample of 500 A(H1N1) viruses sampled between 1995 and 2009 and characterised antigenically at the Worldwide Influenza Centre.

To quantify selection pressures across HA1, I used BEAST to reconstruct the phylogenetic tree and simultaneously estimate site-specific nonsynonymous/synonymous substitution rate (dN/dS) ratios using ‘renaissance counting’.

[Figure: h1n1_dNdS]

The above plot shows that the majority of sites are evolving under negative, or purifying, selection while a handful of sites show evidence of positive selection. The strongest signal of positive selection across HA1 is, by far, found at position 187. The next highest dN/dS ratio is found at 141, the position where we identified substitutions from K to E were responsible for the largest changes in the antigenic phenotype of H1N1 viruses between the late ’90s and 2009.
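As a rough illustration of how per-site dN/dS estimates are usually read, here is a minimal Python sketch. It assumes each site comes with a lower and upper bound of a credible interval for its dN/dS ratio; the post does not say which criterion was actually used, and the function name, site labels and numbers below are all made up for the example.

```python
from __future__ import annotations


def classify_site(lower: float, upper: float) -> str:
    """Label a site from the credible interval of its dN/dS ratio.

    Assumed rule of thumb (not taken from the post):
      - whole interval above 1 -> "positive" selection
      - whole interval below 1 -> "negative" (purifying) selection
      - interval spanning 1    -> "uncertain"
    """
    if lower > upper:
        raise ValueError("lower bound must not exceed upper bound")
    if lower > 1.0:
        return "positive"
    if upper < 1.0:
        return "negative"
    return "uncertain"


# Hypothetical intervals, invented purely for illustration.
toy_intervals = {
    "site_a": (0.05, 0.40),
    "site_b": (0.60, 1.40),
    "site_c": (2.00, 9.00),
}

labels = {site: classify_site(lo, hi) for site, (lo, hi) in toy_intervals.items()}
```

With real output, the intervals would come from the posterior summaries of the BEAST run rather than from a hand-written dictionary.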

Among the other positions showing signatures of positive selection, 94 is an important receptor-binding residue while 186, 189 and 222 are all residues near the receptor-binding site and belong to known antigenic sites. Interestingly, the substitutions E186K and D222N/G are also known egg-adaptations in H1. It’s less obvious why positions such as 35 or 57, which are far from either the receptor-binding site or known antigenic sites, might have experienced positive selection.

The pattern of amino acid substitution at position 187 is shown on the HA1 phylogeny below, along with relative rates of substitution and the posterior probabilities of amino acid identity at 187 along the evolutionarily important trunk lineage from which all future viruses eventually descend.

[Figure: h1n1_187_tree]

The aspartic acid to asparagine substitution (D187N) seems to be the dominant substitution responsible for the high dN/dS ratio. Given that we expect egg-adaptations to occur only on the external branches of the tree, it is surprising to see sizeable clades with the ancestral state at deep internal nodes estimated to be N, even including a portion of the trunk lineage. This illustrates why estimating dN/dS using only internal branches of the phylogeny does not necessarily remove the influence of egg-adaptations, an observation also made by a study looking at cell-culture adaptations in H3N2.

This tends to support our suspicion that egg-adaptations at 187 could distort the phylogenetic tree. The tree below was constructed without nucleotide data for codon 187, with ancestral amino acid state for position 187 estimated using a simple substitution model.

[Figure: h1n1_no187_tree]

Viruses with 187N are less clustered and fewer internal nodes deep in the tree are estimated to possess N as an ancestral state.
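Removing a codon from the nucleotide data before tree inference can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code used for the analysis: the function names are hypothetical, the alignment is assumed to be in frame and gap-free around the codon, and `offset` (the number of nucleotides before codon 1 of the numbering scheme) is something you would need to set for your own alignment.

```python
from __future__ import annotations


def drop_codon(nucleotides: str, codon_number: int, offset: int = 0) -> str:
    """Return the sequence with one codon (three nucleotides) removed.

    codon_number is 1-based. offset is the number of nucleotides that
    precede codon 1 in the sequence (an assumption the caller must supply).
    """
    start = offset + 3 * (codon_number - 1)
    end = start + 3
    if codon_number < 1 or offset < 0 or end > len(nucleotides):
        raise ValueError("codon lies outside the sequence")
    return nucleotides[:start] + nucleotides[end:]


def drop_codon_from_alignment(
    alignment: dict[str, str], codon_number: int, offset: int = 0
) -> dict[str, str]:
    """Apply drop_codon to every sequence in a name -> sequence mapping."""
    return {
        name: drop_codon(seq, codon_number, offset)
        for name, seq in alignment.items()
    }


# Toy alignment: the two sequences differ only at codon 2 (GAT = D, AAT = N).
toy_alignment = {"virus_1": "ATGGATAAC", "virus_2": "ATGAATAAC"}
trimmed = drop_codon_from_alignment(toy_alignment, codon_number=2)
```

In the toy alignment the two sequences become identical once codon 2 is dropped, which is the point of the exercise: the site can no longer pull sequences together or apart in the tree.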

Posterior probabilities on the trunk lineage suggest that egg-adaptation has the potential to influence phylogenetic inference not only near the tips of the tree but in the trunk lineage too. This is important as we’re often particularly interested in properties of the trunk lineage (e.g. geographic location) as this is the evolutionarily successful lineage that seeds epidemics in future years.

In recent flu seasons, we’ve seen that growth in eggs can result in adaptations that reduce vaccine effectiveness. In addition, egg-adaptations have the potential to impact:

  1. Antigenic characterisation of viruses
  2. Estimates of selection pressures
  3. Phylogenetic tree construction

and this influence should be considered when studying viruses propagated in eggs. Viruses can alternatively be grown in cell culture, though cell culture-adaptations are also possible and we should remain aware of their potential impact too.

Some useful links:

Paper: Identification of mutations explaining antigenic drift, including egg-adaptation at position 187. Codon 187 excluded from phylogenetic inference.

Dataset: H1N1 haemagglutination inhibition data generated at the WHO Collaborating Centre for Reference and Research on Influenza, London, UK. Includes GISAID accession numbers for sequences used here.

Tutorial: Simultaneous reconstruction of phylogenetic tree and site-specific dN/dS ratios using BEAST.

Figures created using ggplot2 and ggtree.