Protein structure and antigen attractiveness

Great efforts are made to understand what makes particular areas of a pathogen protein attractive to the immune system. A better understanding of the biophysical and structural features that underpin antibody recognition of antigens may allow us to infer the relative importance of the different areas (epitopes) recognised by antibodies, and to predict which mutations are most likely to result in new pathogen strains able to evade pre-existing immunity.

In much of my previous research I’ve worked with data on the antigenic similarity of viruses, together with genetic sequence data, in order to identify the amino acid substitutions contributing to antigenic evolution. I’ve then mapped the identified substitutions back to the 3D structure in order to visualise the locations of important antigenic regions on the protein surface.

An alternative approach is to predict the areas of a protein most likely to be attractive to the immune system by evaluating properties of its biophysical structure. The haemagglutinin (HA) of the H3N2 strain A/Aichi/2/68 is shown below, coloured by predicted epitope score estimated using BEpro, a program that computes the likely antigenicity of each residue according to 1) an antibody attractiveness score assigned to each amino acid, 2) the orientation of the residue’s side chain, and 3) the extent to which the residue is exposed on the protein surface.
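To make the three ingredients concrete, here is a toy sketch of how a per-residue score could be built from them. This is not BEpro's actual formula: the propensity values, the simple product used to combine the terms, and the name `toy_epitope_score` are all assumptions made purely for illustration.

```python
# Toy illustration only: this is NOT the BEpro algorithm.
# The propensity values are made-up placeholders, not a published scale.
TOY_PROPENSITY = {"K": 0.8, "D": 0.6, "G": 0.3, "L": -0.5, "V": -0.6}


def toy_epitope_score(aa, side_chain_outward, exposure):
    """Combine three per-residue features into a single score.

    aa: one-letter amino acid code
    side_chain_outward: 0 to 1, where 1 means the side chain points
        away from the protein core (hypothetical measure)
    exposure: 0 to 1, where 1 means fully exposed on the surface
    """
    if not 0.0 <= side_chain_outward <= 1.0:
        raise ValueError("side_chain_outward must be between 0 and 1")
    if not 0.0 <= exposure <= 1.0:
        raise ValueError("exposure must be between 0 and 1")
    # Amino acids missing from the toy table are treated as neutral.
    propensity = TOY_PROPENSITY.get(aa, 0.0)
    # A buried residue (exposure 0) scores 0 whatever its amino acid.
    return propensity * side_chain_outward * exposure
```

Under these assumptions an exposed, outward-facing lysine scores higher than an exposed leucine, and any fully buried residue scores zero.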

[Figure: HA of A/Aichi/2/68 coloured by BEpro predicted epitope score]

The surface membrane of flu virus is studded with HA glycoproteins, which recognise host cell receptors, initiating attachment and cell entry. The HA is also the principal target for antibodies raised in response to infection or vaccination. Understanding how mutations affect antibody recognition of HA is vital, as these mutations allow the virus to escape existing immunity and necessitate regular reformulation of the flu vaccine.

The results of the BEpro analysis are intuitively satisfying: exposed ridges, which we might expect to appear attractive to antibodies, have higher scores. However, the program is not aware of wider virus structure or HA function. Consequently, residues at the bottom of the stalk domain are predicted to be antigenic despite being embedded in the lipid membrane and unavailable for antibody binding. We also know that while antibodies may recognise epitopes on the stalk domain, epitopes on the head domain close to the receptor-binding site are particularly important, as antibodies binding there can block recognition of host cell receptors.

The level of correspondence between the structure-based epitope prediction and recognised antigenic sites derived from monoclonal escape mutant studies, genetic analysis, and targeted mutagenesis is shown below.

[Figure: structure-based epitope prediction compared with recognised antigenic sites]

As expected, residues belonging to antigenic sites tend to have reasonably high predicted scores; however, residues belonging to antigenic site E have noticeably lower predicted antigenicity than those belonging to sites A-D. This is perhaps consistent with a reduced role for site E in antigenic evolution; for example, Łuksza and Lässig observed that substitutions in site E are not informative predictors of A(H3N2) evolution, in contrast to sites A-D.
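A comparison like this comes down to grouping per-residue scores by antigenic site and summarising each group. A minimal sketch follows, assuming the scores and site assignments are held in plain dictionaries; the residue numbers, scores, and the helper name `mean_score_by_site` are placeholders, not the values from the analysis above.

```python
from collections import defaultdict
from statistics import mean


def mean_score_by_site(scores, site_of):
    """Return the mean predicted score for each antigenic site.

    scores: {residue_number: predicted epitope score}
    site_of: {residue_number: site label, e.g. "A"}
    Residues with no site assignment are grouped under "none".
    """
    grouped = defaultdict(list)
    for residue, score in scores.items():
        grouped[site_of.get(residue, "none")].append(score)
    return {site: mean(values) for site, values in grouped.items()}


# Placeholder data for illustration only (not real HA positions or scores).
example_scores = {1: 0.9, 2: 0.7, 3: 0.2, 4: 0.4, 5: 0.1}
example_sites = {1: "A", 2: "A", 3: "E", 4: "E"}
summary = mean_score_by_site(example_scores, example_sites)
```

With real data, `summary` would hold one mean per site (A to E) plus one for residues outside any recognised site, which is enough to see whether site E sits below the others.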

To get an idea of the relationship between predicted attractiveness and selection acting upon the HA gene, I had a quick look at the correlation between the predicted epitope score and the normalised dN/dS ratio calculated across internal branches of the phylogeny, estimated using HyPhy.


Generally, residues considered more likely to belong to epitopes, based on structure, are more likely to have experienced positive selection (normalised dN/dS > 0). A linear model fitted to the relationship explains only around 3% of the variation; however, there is a notable lack of inferred positive selection at residues with lower predicted epitope scores. It is worth noting that while selection has been estimated across the evolution of A(H3N2), the predicted epitope scores are based on a single HA structure, and the attractiveness of an amino acid site may change as substitutions occur.
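For readers who want to reproduce this kind of check, the "variation explained" figure is the R² of a simple linear regression, so an R² of about 0.03 corresponds to roughly 3%. The sketch below fits an ordinary least-squares line in plain Python; the function name `linear_fit_r2` and the input layout (two equal-length lists, one value per residue) are assumptions, and this is not necessarily how the original analysis was run.

```python
def linear_fit_r2(x, y):
    """Fit y = intercept + slope * x by least squares.

    x: predicted epitope score per residue
    y: normalised dN/dS per residue
    Returns (slope, intercept, r_squared).
    """
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have equal length of at least 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    if sxx == 0 or syy == 0:
        raise ValueError("x and y must each vary")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # For a single predictor, R^2 is the squared Pearson correlation.
    r_squared = sxy ** 2 / (sxx * syy)
    return slope, intercept, r_squared
```

A low R² alongside a clear absence of positive selection at low-scoring residues is not contradictory: the line summarises the average trend, while the second observation concerns how the points are distributed around it.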