(Asian News Hub) – The pandemic of Coronavirus Disease 2019 (COVID-19) started in Wuhan, China in December 2019. The virus, ‘Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2)’, belongs to the family Coronaviridae and the genus Betacoronavirus, which includes among others ‘Middle East respiratory syndrome-related coronavirus (MERS-CoV)’ and ‘Severe acute respiratory syndrome-related coronavirus (SARS-CoV)’.
The progenitor of SARS-CoV-2 infects bats. After mutation it passed to humans, likely through an intermediate host (pangolins), at the Huanan Seafood Wholesale Market in Wuhan.
SARS-CoV-2, the virus responsible for COVID-19, is a single-stranded, positive-sense RNA virus whose genome is about 30 kb long, that is, around 30,000 nucleotides arranged in sequence.
The genome contains many regions that together code for 29 proteins (4 structural and the rest non-structural, mostly enzymes).
The structural proteins are the Spike protein (1273 aa), Envelope protein, Membrane protein, and Nucleoprotein, named the S, E, M and N proteins. Each has a well-defined function, and together they make the virus highly pathogenic.
The Spike protein attaches to ACE2 receptors on the cell membrane in the lungs and multiple other organs and allows viral entry into cells. The Nucleoprotein binds the viral RNA in the nucleocapsid and is involved in several functions after viral fusion. The Envelope protein forms viroporins and helps release the virus from infected cells. The Membrane protein provides scaffolding for viral assembly and mediates the inflammatory response to viral infection.
SARS-CoV-2, like other RNA viruses, mutates during replication (making copies of its genome), under the effect of mutagens, or under stress from therapeutic interventions such as plasma therapy and monoclonal antibodies.
A mutation changes the genome through alteration of the nucleotide sequence (substitution), insertion, deletion, or rearrangement of larger sections of the genes.
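The three simplest kinds of change named above can be pictured as string edits. The following is a minimal sketch on a made-up RNA fragment; the function names and the sequence are illustrative assumptions, not real SARS-CoV-2 data or any library's API.

```python
def substitute(seq: str, pos: int, new_base: str) -> str:
    """Replace the base at 1-based position `pos` with `new_base`."""
    i = pos - 1
    return seq[:i] + new_base + seq[i + 1:]


def insert(seq: str, pos: int, bases: str) -> str:
    """Insert `bases` immediately after 1-based position `pos`."""
    return seq[:pos] + bases + seq[pos:]


def delete(seq: str, pos: int, length: int) -> str:
    """Remove `length` bases starting at 1-based position `pos`."""
    i = pos - 1
    return seq[:i] + seq[i + length:]


# Toy fragment for illustration only (hypothetical, 9 bases).
wild = "AUGGCUUAC"

print(substitute(wild, 4, "A"))  # AUGACUUAC
print(insert(wild, 3, "GGG"))    # AUGGGGGCUUAC
print(delete(wild, 4, 3))        # AUGUAC
```

Deleting or inserting a multiple of three bases, as in the last two calls, removes or adds whole codons without shifting the reading frame of the rest of the gene.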
The dictum of mutation is that the more a virus circulates, the more it will change and mutate. Some RNA viruses, including SARS-CoV-2, have repair (proofreading) enzymes that correct errors in replication.
Even so, mutations occur in SARS-CoV-2 as a reflection of the massive infection load. The corrective enzyme cannot abolish mutations and can only prevent mutation catastrophe.
Most viral mutations have little to no impact on the
virus’s ability to cause infections and disease.
However, some mutations help the virus adapt to its environment better than the wild strain. Their effects include enhanced transmissibility, increased severity of disease, the occurrence of disease in the younger population, immune escape (from vaccines or previous infection), and drug or antibody resistance.
This depends on the site of the mutation in the genome and on how it changes the protein that region encodes. For SARS-CoV-2, mutations in the Spike protein, especially in the Receptor Binding Domain (RBD) and N-Terminal Domain (NTD), have a major influence on the capacity of the virus to attach to and infect cells.
However, mutations can also occur in coding regions of the genome outside the Spike gene, including those that encode non-structural proteins.
What is a variant?
A variant is a virus whose genome has changed through these mutations, altering how it adapts to its environment compared with the wild strain.
A new variant carries several mutations at multiple sites in the genome, which are reflected in the structural or non-structural proteins of the virus. In virology, a clade is a group of similar viruses that, based on their genetic sequences, come from a common ancestor; changes in those viruses can be tracked using phylogeny. A lineage is a single line of descent in a phylogenetic tree.
The story of SARS-CoV-2 variants has been studied extensively from the time the virus was first sequenced from infections in Wuhan, China. The original clade causing the first reported infections was designated clade 19A. Soon the virus diversified into two clades, 19A (L) and 19B (S). Here 19 represents the year 2019, A and B are the two clades, and L and S stand for the amino acids leucine and serine found at a critical site in an encoded viral protein (position 84 of the ORF8 protein).
In February 2020, a major mutation (D614G) occurred in the part of the viral genome encoding the Spike protein (in the S1 subunit, just outside the receptor-binding domain), and because of this the virus became more efficient at attaching to the ACE2 receptor and had enhanced transmissibility.
At present, virtually all virus strains circulating globally belong to this clade.
The letter D, the single-letter code for aspartic acid, was replaced by G, the single-letter code for glycine, at aa 614 of the 1273-aa-long Spike protein.
This single glycine substitution gave the virus an advantage in attaching to and penetrating cells.
Following this, over time SARS-CoV-2 has developed many variants. These are commonly designated by the name of the country or region where the first cases of the variant were detected.
Of these, four variants are of great global public health importance: the UK Variant, the South Africa Variant, the Brazil Variant, and lastly the Indian Variant.
As the variants evolved, their naming became more and more complicated, and today several naming systems are in use side by side, which can be confusing. They include:
i. Pango lineage
ii. Mutations (Substitution/deletion etc)
iii. Nextstrain clade
iv. Significance on virus attributes
The Pango (Phylogenetic Assignment of Named Global Outbreak Lineages) naming system is the most popular. It names virus lineages, a lineage being a single line of descent in a phylogenetic tree.
A name consists of an alphabetical prefix and a numerical suffix. Each dot in the numerical suffix means “descendant of” and is applied when one ancestor can be clearly identified. The letters I, O, and X are not used in the prefix of the names of standard lineages.
So, lineage B.1.1.7 is the seventh named descendant of lineage B.1.1; B.1.351 is the 351st named descendant of B.1 (the virus which caused the Italian epidemic); and C.1 is the first named descendant of lineage C.
The suffix can contain a maximum of 3 hierarchical levels, referred to as the primary, secondary and tertiary suffixes. To avoid four or more suffix levels, a new lineage prefix is introduced, which acts as an alias.
For example, C is an alias of B.1.1.1, hence the first descendant of B.1.1.1 is called C.1 (rather than B.1.1.1.1). Consequently, the name C, by itself, is never directly applied to a sequence.
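The naming rules above can be sketched in a few lines of Python. This is a minimal illustration, not the official Pango tooling: the alias table holds only the single alias given in the text (C for B.1.1.1), and the function names are hypothetical.

```python
from typing import List

# Only the alias mentioned in the text; the real alias list is much longer.
ALIASES = {"C": "B.1.1.1"}

# Letters the text says are not used in standard lineage prefixes.
RESERVED_LETTERS = set("IOX")


def expand(lineage: str) -> str:
    """Replace an alias prefix with the full name it stands for."""
    prefix, _, suffix = lineage.partition(".")
    base = ALIASES.get(prefix, prefix)
    return f"{base}.{suffix}" if suffix else base


def ancestors(lineage: str) -> List[str]:
    """List ancestors from nearest to most distant; each dot is one step."""
    parts = expand(lineage).split(".")
    return [".".join(parts[:i]) for i in range(len(parts) - 1, 0, -1)]


def is_well_formed(lineage: str) -> bool:
    """Check prefix letters and the limit of 3 numeric suffix levels."""
    prefix, _, suffix = lineage.partition(".")
    levels = suffix.split(".") if suffix else []
    return (
        prefix.isalpha()
        and prefix.isupper()
        and not (set(prefix) & RESERVED_LETTERS)
        and len(levels) <= 3
        and all(level.isdigit() for level in levels)
    )


print(expand("C.1"))                # B.1.1.1.1
print(ancestors("B.1.351"))         # ['B.1', 'B']
print(is_well_formed("B.1.1.1.1"))  # False: four suffix levels
```

The last call shows why the alias exists: B.1.1.1.1 would need a fourth suffix level, so the same lineage is written C.1 instead.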
Mutations (substitutions, deletions, etc.) are written with the single-letter code of the amino acid originally present in the wild strain on the left, the aa position number in the middle, and the single-letter code of the new amino acid in the variant on the right. Deletions are written as del followed by the affected positions, as in del69/70.
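As an illustration of this notation, the sketch below splits a substitution label such as D614G into its three parts. The function names are hypothetical, the amino acid table covers only a handful of codes, and the position check assumes the mutation is in the 1273-aa Spike protein.

```python
import re
from typing import NamedTuple

SPIKE_LENGTH = 1273  # length of the Spike protein in amino acids, per the text

# Partial table of single-letter amino acid codes (illustrative subset).
AA_NAMES = {
    "D": "aspartic acid",
    "G": "glycine",
    "N": "asparagine",
    "Y": "tyrosine",
    "E": "glutamic acid",
    "K": "lysine",
}


class Substitution(NamedTuple):
    wild_type: str  # amino acid in the wild strain (left of the number)
    position: int   # amino acid position (the number in the middle)
    variant: str    # amino acid in the variant (right of the number)


_PATTERN = re.compile(r"([A-Z])(\d+)([A-Z])")


def parse_substitution(label: str) -> Substitution:
    """Parse a label like 'D614G'; deletions such as 'del69/70' are rejected."""
    match = _PATTERN.fullmatch(label)
    if match is None:
        raise ValueError(f"not a substitution label: {label!r}")
    wild_type, position, variant = match.groups()
    pos = int(position)
    if not 1 <= pos <= SPIKE_LENGTH:
        raise ValueError(f"position {pos} is outside the Spike protein")
    return Substitution(wild_type, pos, variant)


def describe(label: str) -> str:
    """Spell a substitution out in words, falling back to the letter code."""
    sub = parse_substitution(label)
    old = AA_NAMES.get(sub.wild_type, sub.wild_type)
    new = AA_NAMES.get(sub.variant, sub.variant)
    return f"{old} at position {sub.position} replaced by {new}"


print(describe("D614G"))  # aspartic acid at position 614 replaced by glycine
```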
The Nextstrain clade system defines genetic groups (clades); for SARS-CoV-2, 19A is the root clade. New clades are named by the year in which they emerged and a letter for the next clade in that year, followed by signature mutations.
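A clade name of this kind can be taken apart mechanically. The sketch below is an assumption-laden illustration, not Nextstrain's own code: it expects a two-digit year, one capital letter, and an optional signature after a slash. The label 20I/501Y.V1 is the name Nextstrain used for B.1.1.7 and is supplied here as an example; it does not appear in the text above.

```python
import re
from typing import Optional, Tuple

# Two-digit year, one letter, then an optional "/signature" part.
_CLADE = re.compile(r"(\d{2})([A-Z])(?:/(.+))?")


def parse_clade(name: str) -> Tuple[int, str, Optional[str]]:
    """Return (year, letter, signature) for a clade name such as '19A'."""
    match = _CLADE.fullmatch(name)
    if match is None:
        raise ValueError(f"not a recognised clade name: {name!r}")
    year, letter, signature = match.groups()
    return 2000 + int(year), letter, signature


print(parse_clade("19A"))          # (2019, 'A', None)
print(parse_clade("19B"))          # (2019, 'B', None)
print(parse_clade("20I/501Y.V1"))  # (2020, 'I', '501Y.V1')
```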
Lastly, variants are graded by their significance into three classes: Variant of Interest (VOI), Variant of Concern (VOC), and Variant of High Consequence (VOHC). The grades reflect the variant's ability to transmit, the severity of the disease it causes, its immune evasion, and its resistance to standard therapies, together with the proposed action to face the consequences.
With all this information in the background, the four variants are described below:
B.1.1.7 lineage (UK Variant) has 23 mutations relative to the wild strain, of which 8 are in the Spike protein. Of the 8, three are critical: N501Y, del69/70, and P681H. This variant shows around 50% increased transmission and a potential for increased severity, based on hospitalization and case fatality rates. It has minimal impact on monoclonal antibody treatment and on immune evasion/escape.
B.1.351 lineage (South African Variant), detected in Nelson Mandela Bay, has 21 mutations, of which 9 are in the Spike protein; 3 are of particular interest: K417N, E484K, and N501Y. The variant has around a 50% increase in transmission, a significant decrease in susceptibility to monoclonal antibodies, and immune evasion/escape properties.
P.1 lineage (Brazilian variant), involved in outbreaks in and around Manaus, the capital of the Brazilian state of Amazonas, has 17 mutations, ten of which are in its Spike protein, including three designated to be of particular concern: N501Y, E484K, and K417T. The variant causes a significant decrease in susceptibility to monoclonal antibodies and carries the possibility of immune/vaccine escape.
The P.2 lineage occurs throughout Brazil and has 3 mutations: E484K, D614G, and V1176F. It shares only one mutation of concern (E484K) with the P.1 variant. The variant has the potential to reduce susceptibility to monoclonal antibodies and the possibility of immune evasion.
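The overlap between these variants is easy to see when the mutations named above are held as sets. The sketch below uses only the mutations listed in this article for B.1.1.7, B.1.351, P.1 and P.2; these are not the full mutation profiles of the lineages, and the helper names are hypothetical.

```python
from typing import List, Set

# Mutations named in the text for each lineage (not complete profiles).
KEY_MUTATIONS = {
    "B.1.1.7": {"N501Y", "del69/70", "P681H"},
    "B.1.351": {"K417N", "E484K", "N501Y"},
    "P.1": {"N501Y", "E484K", "K417T"},
    "P.2": {"E484K", "D614G", "V1176F"},
}


def carriers(mutation: str) -> List[str]:
    """Lineages whose listed mutations include `mutation`, sorted by name."""
    return sorted(name for name, muts in KEY_MUTATIONS.items() if mutation in muts)


def shared(first: str, second: str) -> Set[str]:
    """Listed mutations common to two lineages (set intersection)."""
    return KEY_MUTATIONS[first] & KEY_MUTATIONS[second]


print(carriers("N501Y"))     # ['B.1.1.7', 'B.1.351', 'P.1']
print(carriers("E484K"))     # ['B.1.351', 'P.1', 'P.2']
print(shared("P.1", "P.2"))  # {'E484K'}
```

On this listing, N501Y is common to the UK, South African and Brazilian (P.1) variants, while E484K links B.1.351, P.1 and P.2.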
B.1.617 lineage (the Indian Variant), discovered in India, has 13 mutations, three of which are in the Spike protein: E484Q, L452R, and P681R. It has been called the ‘double mutation’ variant on the basis of two of these, E484Q and L452R, which is a misnomer. Counting all 3 mutations in the Spike protein, it could equally be called a variant with triple mutation, a label that should also be discouraged.
This variant has 3 sublineages: B.1.617.1, B.1.617.2, and B.1.617.3. B.1.617.3 shares the L452R and E484Q mutations found in B.1.617.1, whereas B.1.617.2 does not have E484Q. B.1.617.2 has the T478K mutation, which is not found in B.1.617.1 or B.1.617.3. Despite its name, B.1.617.3 was the first sub-lineage of this variant to be detected, in October 2020 in India.
This sub-lineage has remained relatively uncommon compared to the two other sublineages, B.1.617.1 and B.1.617.2, both of which were first detected in December 2020. There were few known cases of B.1.617 (of all sublineages) until early February 2021, when there was a significant increase.
Dr MS Khuroo is Former Director, Professor & Head Gastroenterology, Chairman, Dept. of Medicine, Sher-I-Kashmir Institute of Medical Sciences, Srinagar; Former Consultant & Head Gastroenterology and Liver Transplantation, King Faisal Specialist Hospital & Research Centre, Riyadh, Saudi Arabia