Abstract: Group A Streptococcus (GAS) M protein is an important virulence factor and potential vaccine antigen, and constitutes the basis for strain typing (emm-typing). Although more than 200 emm-types have been characterized, structural data have been obtained for only a limited number of them. We aim to evaluate the sequence diversity of near-full-length M proteins from worldwide sources and to analyse their structure, sequence conservation and classification.

Methods: GAS isolates recovered from throughout the world during the last two decades underwent emm-typing and complete emm gene sequencing. Predicted amino acid sequence analyses, secondary structure predictions and vaccine epitope mapping were performed using MUSCLE and Geneious software.

Results: A total of 1086 isolates from 31 countries were analysed, representing 175 emm-types. The emm-type is predictive of the whole protein structure, independent of geographic origin or clinical association. No emm-type was found paired with multiple, highly divergent central regions. M protein sequence length, the presence or absence of sequence repeats, and predicted secondary structure were assessed in the context of the latest vaccine developments.

Conclusions: Based on these global data, the M6 protein model is updated to a model built on three representative M proteins (M5, M80 and M77), to aid epidemiological analysis, vaccine development and M protein-related pathogenesis studies.