Genetic cluster analysis of SARS-CoV-2 and the identification of those responsible for the major outbreaks in various countries

Research output: Journal Publications and Reviews (RGC: 21, 22, 62) › 21_Publication in refereed journal › peer-review

28 Scopus Citations



Original language: English
Pages (from-to): 1287-1299
Number of pages: 13
Journal / Publication: Emerging Microbes and Infections
Issue number: 1
Online published: 11 Jun 2020
Publication status: Published - Dec 2020



A newly emerged coronavirus, SARS-CoV-2, caused severe pneumonia outbreaks in China in December 2019 and has since spread to various countries around the world. To trace the evolution route and probe the transmission dynamics of this virus, we performed phylodynamic analysis of 247 high-quality genomic sequences available in the GISAID platform as of 5 March 2020. Among them, four genetic clusters, defined as super-spreaders (SSs), could be identified and were found to be responsible for the major outbreaks that subsequently occurred in various countries. SS1 was widely disseminated in Asia and the US, and mainly responsible for outbreaks in the states of Washington and California as well as South Korea, whereas SS4 contributed to the pandemic in Europe. Using the signature mutations of each SS as markers, we further analysed 1539 genome sequences reported after 29 February 2020 and found that 90% of these genomes belonged to SSs, with SS4 being the most dominant. The relative degree of contribution of each SS to the pandemic in different continents was also depicted. Identification of these super-spreaders greatly facilitates development of new strategies to control the transmission of SARS-CoV-2.
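
The marker-based step described in the abstract (assigning each genome to a cluster by its signature mutations, then reporting the share per cluster) might be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' pipeline: the cluster names `SS_A` and `SS_B`, the marker positions and bases, and the helper names `assign_cluster` and `cluster_fractions` are all hypothetical, and the input is assumed to be sequences already aligned to a common reference.

```python
from collections import Counter

# Hypothetical marker table: cluster name -> {0-based alignment position: base}.
# These positions and bases are placeholders for illustration only; they are
# NOT the signature mutations reported in the paper.
SIGNATURES = {
    "SS_A": {2: "T", 5: "G"},
    "SS_B": {1: "C", 7: "A"},
}


def assign_cluster(aligned_seq, signatures=SIGNATURES):
    """Return the first cluster whose markers all match, else None."""
    for name, markers in signatures.items():
        if all(pos < len(aligned_seq) and aligned_seq[pos] == base
               for pos, base in markers.items()):
            return name
    return None


def cluster_fractions(aligned_seqs, signatures=SIGNATURES):
    """Fraction of sequences assigned to each cluster (None = unassigned)."""
    counts = Counter(assign_cluster(s, signatures) for s in aligned_seqs)
    total = len(aligned_seqs)
    return {name: n / total for name, n in counts.items()} if total else {}
```

With real data, the marker table would come from the signature mutations identified in the phylodynamic analysis, and the fractions could be computed separately per continent to compare the relative contribution of each cluster.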

Research Area(s)

  • Betacoronavirus/classification; China/epidemiology; Cluster Analysis; Databases, Genetic; Disease Outbreaks; Genome, Viral; Global Health; Humans; Mutation; Phylogeny; Risk Factors; Sequence Alignment; Sequence Analysis; Severe Acute Respiratory Syndrome/epidemiology; Virulence

Bibliographic Note

Month information for this publication is provided by the author(s) concerned.
