Short Communications
The worldwide search for the new mutations in the RNA-directed RNA Polymerase domain of SARS-CoV-2
Siarhei A. Dabravolski * ,
Yury K. Kavalionak

Mac Vet Rev 2021; 44 (1): 87 - 94

10.2478/macvetrev-2020-0036

Received: 24 June 2020

Received in revised form: 01 October 2020

Accepted: 12 November 2020

Available Online First: 31 December 2020

Published on: 15 March 2021

Correspondence: Siarhei A. Dabravolski, sergedobrowolski@gmail.com

Abstract

Severe acute respiratory syndrome coronavirus 2 (SARS‐CoV‐2) is an RNA virus responsible for the current pandemic outbreak. In total, 200 genomes of SARS‐CoV‐2 strains from four host organisms were analyzed. To investigate the presence of new mutations in the RNA-directed RNA Polymerase (RdRp) of SARS-CoV-2, we analyzed sequences isolated from different hosts, with particular emphasis on human isolates. We searched for new mutations in the RdRp proteins and studied how those newly identified mutations could influence RdRp protein stability. Our results revealed 25 mutations in Rhinolophus sinicus, 1 in Mustela lutreola, 6 in Homo sapiens, and none in Mus musculus RdRp proteins of the SARS-CoV-2 isolates. We found that P323L is the most common stabilising radical mutation in human isolates. We also described several unique mutations specific to the studied hosts. Therefore, our data suggest that new and emerging variants of the SARS-CoV-2 RdRp have to be considered for the development of effective therapeutic agents and treatments.

Keywords: SARS-CoV-2, mutation, RNA-dependent, RNA polymerases, RdRp, Nsp12


INTRODUCTION

The current pandemic outbreak is caused by the novel coronavirus isolate called severe acute respiratory syndrome coronavirus 2 (SARS‐CoV‐2). This virus is a global threat to mankind, the world economy, and ecology. Research suggests that a high mutation rate and the ability to adapt quickly to new conditions allow SARS‐CoV‐2 to cross interspecies barriers and spread from its natural bat reservoirs to other hosts (1). The SARS-CoV-2 RdRp (also called nonstructural protein 12, nsp12) is a key player in the multicomponent viral replication/transcription and proofreading complex. Many modern antiviral drugs are designed to specifically inactivate RdRp or to prevent its interaction with other parts of the replication machinery: the co-factors nsp7 and nsp8, and nsp14, an exonuclease with proofreading function (2).
In this study, we focused on the identification of mutations in the RdRp domain. In total, we have examined 200 genomes of the SARS-CoV-2 and CoV-like viruses from the “natural” host Rhinolophus sinicus (3), secondary hosts Homo sapiens and Mustela lutreola (4), and artificial host – a model organism Mus musculus (5). We also studied how those mutations would influence the stability of the RdRp domain of nsp12 protein.
Further research of the SARS-CoV-2 RdRp variants could lead to the development of more effective antiviral drugs and vaccines. Also, our data suggest that new and emerging variants of the SARS-CoV-2 RdRp have to be considered for the development of effective therapeutic agents and treatments.

MATERIAL AND METHODS

Sequences retrieval and analysis
In total, 200 complete genome sequences of SARS-CoV-2 and CoV-like viruses from different hosts were downloaded from the NCBI database: Rhinolophus sinicus – 18, Mus musculus – 42, Mustela lutreola – 13, Homo sapiens – 127 (Supplementary Tables 1-4) (6). The protein sequences of the ORFs containing the coronavirus RNA-directed RNA Polymerase domain (RdRp domain, cd21591) were then retrieved with the NCBI ORFfinder (https://www.ncbi.nlm.nih.gov/orffinder/), and conserved domains were checked with CD-search (NCBI). Complete translated ORFs were aligned with MUSCLE (7), as implemented in the Ugene 34 software (8), and the multiple sequence alignments were checked for mutations.
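The mutation check on aligned ORFs can be illustrated with a minimal sketch. It is not the procedure used in this study (mutations here were checked in Ugene); the function name `call_mutations` and the toy sequences are hypothetical. It assumes two rows taken from the same alignment, reports substitutions in the usual wild type/position/mutant notation, and numbers positions along the ungapped reference, so nsp12 numbering such as P323L would only be obtained if the reference row is the complete nsp12 sequence.

```python
from typing import List

GAP = "-"


def call_mutations(ref_aligned: str, query_aligned: str) -> List[str]:
    """Return substitutions (e.g. 'P323L') between two aligned protein rows.

    Positions are 1-based and counted along the ungapped reference.
    Alignment columns with a gap in either row, or with an ambiguous
    residue ('X') in the query, are skipped rather than reported.
    """
    if len(ref_aligned) != len(query_aligned):
        raise ValueError("aligned sequences must have equal length")

    mutations = []
    ref_pos = 0
    for ref_aa, query_aa in zip(ref_aligned.upper(), query_aligned.upper()):
        if ref_aa == GAP:
            # Insertion relative to the reference: no reference coordinate.
            continue
        ref_pos += 1
        if query_aa in (GAP, "X"):
            continue
        if query_aa != ref_aa:
            mutations.append(f"{ref_aa}{ref_pos}{query_aa}")
    return mutations


# Toy rows (hypothetical, not real nsp12 sequence).
print(call_mutations("MKP-LA", "MKLGLS"))  # ['P3L', 'A5S']
```

Skipping gapped columns is a design choice of this sketch: it reports point substitutions only, which is the type of mutation described in this study.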
Secondary structures (helix, sheet, and coil) were predicted with the PSIPRED server (PSI-blast based secondary structure PREDiction) (http://bioinf.cs.ucl.ac.uk/psipred/) (9).

Effect of mutations on the protein stability
The effect of the identified mutations on protein stability, flexibility and motion was studied with the MAESTRO on-line tool (10), the DynaMut server (11), and DUET (12). MAESTRO predictions are based on artificial neural networks (ANN), support vector machines (SVM) and multiple linear regression (MLR), with ΔΔG values as an output. In addition to the Normal Mode Analysis (NMA) of the structures, DynaMut implements an algorithm to analyze the effect of point mutation(s), with a wide set of parameters describing the influence of vibrational entropy changes on protein dynamics and stability. The DUET server combines two methods (SDM and mCSM) by means of Support Vector Machines (SVMs).

RESULTS

Identification of mutations
The SARS-CoV-like genome sequences from the natural host Rhinolophus sinicus were analyzed. In total, we identified 25 mutations in the RdRp domain (Table 1). Mutations were found in 16 of the 18 analyzed sequences: 6 RdRp domains had a single mutation, 5 had double mutations, 2 had triple mutations, and 3 had multiple mutations. In general, mutations were located throughout the entire domain. Only two positions, T118 and D125, stood out: each was mutated in several genomes and substituted by more than one amino acid (T118 to N and A; D125 to G, E and N).



Forty-two genomes of SARS-CoV-like viruses isolated from mice (Mus musculus) were analyzed. Surprisingly, we found no mutations in the RdRp domain (data not shown).
Thirteen genomes of SARS-CoV-2 isolated from mink (Mustela lutreola) were analyzed. In 7 RdRp domains, we found only one mutation, P323L (Table 2).



In total, 127 SARS-CoV-2 genomes from the human host were analyzed. In this analysis, the Wuhan RdRp domain sequence was treated as the original (wild type). Identified mutations are listed in Table 3 and shown in Supplementary Fig. 1 and 2 (6). Mutations were identified at 6 positions: G179S, E278D, P323L, L329I, A449V, and A660S. Interestingly, the P323L single mutation was the most common and was detected in more than half of the analyzed countries. The G179S single mutation was identified in only one of the two analyzed Malaysian isolates (MT372481), and A660S only in a Japanese isolate. The P323L and A449V double mutation was found only in an isolate from Greece. Two double mutations (E278D and P323L; P323L and L329I) were found only in isolates from India (Table 3).



According to the predicted secondary structures, G179S, E278D, A449V and A660S were located in different helices, whereas P323L and L329I were located on the same coil (Supplementary Fig. 1 and 2) (6).

Effect of mutations on the free energy and protein stability
To determine how the identified mutations could influence the tertiary dynamics and stability of the RNA-directed RNA Polymerase, we used 3 different tools based on distinct algorithms. Mutations were examined individually and in combination (Supplementary Table 6 and 7) (6). In total, the numbers of conservative and radical mutations were almost equal, although radical mutations had a larger effect on the free energy change. Single mutations with the highest change in free energy are highlighted in bold in Supplementary Table 6 (6). The most common mutation in human and mink isolates (P323L) is the only stabilising mutation among them, a prediction on which at least two of the tools used agree. The human mutation with the second highest change in free energy is A660S, a radical mutation that was predicted to have a destabilizing effect. Single mutations in the Rhinolophus sinicus isolates are almost evenly split by type (14 radical / 10 conservative).
Three of them are destabilizing (A128S, G711S and L278F) and two are stabilizing mutations (H772Q and conservative R138K). The L278F mutation is the only mutation confirmed by several tools. The dual, triple, and multiple mutations with the highest change in free energy, identified in the human and Rhinolophus sinicus RdRp domain, are stabilizing types (Supplementary Table 7) (6).
The only significant positive change of ΔΔS VibENCoM was detected for the R754C radical mutation of the Rhinolophus sinicus, suggesting a gain in flexibility.
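The conservative/radical distinction used above can be sketched as a comparison of physicochemical classes. The grouping below is an assumption made for illustration: the text does not state which classification scheme was applied, and other groupings would label some substitutions differently. Under this assumed grouping, the labels agree with those given in the text for P323L, G179S, A660S and R754C (radical) and R138K (conservative), which does not by itself show that the same scheme was used.

```python
import re

# ASSUMPTION: illustrative physicochemical grouping; the scheme actually
# used in the study is not stated. Glycine and proline are kept in classes
# of their own because of their distinct backbone behaviour.
_GROUPS = {
    "aliphatic": "AVLIM",
    "aromatic": "FWY",
    "polar": "STNQC",
    "positive": "KRH",
    "negative": "DE",
    "glycine": "G",
    "proline": "P",
}
RESIDUE_CLASS = {aa: label for label, residues in _GROUPS.items() for aa in residues}

MUTATION_RE = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def classify(mutation: str) -> str:
    """Label a substitution such as 'P323L' as 'conservative' or 'radical'."""
    match = MUTATION_RE.match(mutation)
    if match is None:
        raise ValueError(f"not a point substitution: {mutation!r}")
    wild_type, _position, mutant = match.groups()
    for residue in (wild_type, mutant):
        if residue not in RESIDUE_CLASS:
            raise ValueError(f"unknown residue: {residue!r}")
    same_class = RESIDUE_CLASS[wild_type] == RESIDUE_CLASS[mutant]
    return "conservative" if same_class else "radical"


for name in ("G179S", "E278D", "P323L", "L329I", "A449V", "A660S"):
    print(name, classify(name))
```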

DISCUSSION

A high mutation rate is one of the main adaptation mechanisms exploited by RNA viruses (13). RNA viruses can also regulate their replication fidelity (14). The combination of those two features gives coronaviruses the ability to spread quickly and adapt to new hosts, to overcome natural and vaccine-induced immune responses, and to develop resistance to antiviral drugs (15). Severe Acute Respiratory Syndrome coronavirus (SARS-CoV) and Middle East Respiratory Syndrome coronavirus are two animal coronaviruses that adapted to a human host and caused several local epidemics within the last 20 years. The current pandemic outbreak of the novel coronavirus, severe acute respiratory syndrome coronavirus 2 (SARS‐CoV‐2), is a global threat, affecting people worldwide. RNA-directed RNA Polymerase is one of the key elements of the coronavirus replication machinery and a target for several modern antiviral drugs (2). In this paper, we reported mutations in the RdRp domain in a “natural” host of the coronavirus, the bat Rhinolophus sinicus, in a model organism (Mus musculus), and in the “secondary” hosts Mustela lutreola and Homo sapiens (human isolate data were collected from several countries). Our results identified the stabilizing P323L mutation as the most common in SARS‐CoV‐2 human isolates around the world and as the only mutation detected in Mustela.
Although many computational strategies for predicting the effect of a mutation on protein dynamics and thermostability have been proposed, the problem is complex and still requires further research. To obtain the most accurate data possible, we used several tools based on different algorithms, which sometimes resulted in contradictory interpretations. MAESTRO applies artificial neural networks (ANN), support vector machines (SVM), and multiple linear regression (MLR), based on statistical scoring functions for distance-dependent residue pairs and solvent exposure of protein residues (10). DUET combines two methods: Site-Directed Mutator (SDM), a statistical potential energy function, and mCSM signatures, the graph-based concept of the Cutoff Scanning Matrix (CSM) (12). The ENCoM method employs an Elastic Network Contact Model that is based on coarse-grained normal mode analysis (NMA) (16). DynaMut implements a machine-learning algorithm to analyze the effect of point mutation(s), validated on a non-redundant blind test set (11). Mutations confirmed by at least 2 tools were counted as consistent.
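The rule that a prediction counts as consistent when at least 2 tools agree can be written down as a short sketch. Two points are assumptions rather than statements from this study: the sign convention of each tool's ΔΔG (here taken as negative = stabilizing for MAESTRO and positive = stabilizing for DynaMut and DUET, which should be checked against each tool's documentation), and the ΔΔG values in the example, which are hypothetical and not results of this work.

```python
from collections import Counter
from typing import Dict, Optional

# ASSUMPTION: sign of a stabilizing ddG per tool; verify against each
# tool's documentation before using on real output.
STABILIZING_SIGN = {"MAESTRO": -1, "DynaMut": +1, "DUET": +1}


def tool_call(tool: str, ddg: float) -> str:
    """Translate one tool's ddG value into a qualitative label."""
    if ddg == 0:
        return "neutral"
    return "stabilizing" if ddg * STABILIZING_SIGN[tool] > 0 else "destabilizing"


def consensus(ddg_by_tool: Dict[str, float], min_tools: int = 2) -> Optional[str]:
    """Return the label shared by at least `min_tools` tools, else None.

    A tie between the two most frequent labels also yields None.
    """
    if not ddg_by_tool:
        return None
    votes = Counter(tool_call(tool, ddg) for tool, ddg in ddg_by_tool.items())
    ranked = votes.most_common()
    top_label, top_count = ranked[0]
    tied = len(ranked) > 1 and ranked[1][1] == top_count
    if top_count < min_tools or tied:
        return None
    return top_label


# Hypothetical values for illustration only.
example = {"MAESTRO": -0.5, "DynaMut": 0.8, "DUET": -0.2}
print(consensus(example))  # stabilizing
```

Returning `None` instead of a majority guess keeps contradictory predictions visible, in line with the observation that different algorithms sometimes disagree.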
In our study, we found that bat isolates have 25 mutations, the majority of which are unique. They could represent a pool of potentially useful mutations that would help the virus adapt to a new environment or host, or counter the immune system or drugs. Minks, on the contrary, have been described as a target species for SARS-CoV-2 only recently and most probably acquired the virus from farmworkers (in some cases with the P323L mutation) (4). In our study, mice represented an unusual host for SARS-CoV and SARS-CoV-2, because it was shown that viral replication could be achieved only in inbred, knockout, or transgenic lines (5, 17), whereas the wild-type line is resistant (18). Based on the worldwide presence of the P323L mutation in human isolates, it is tempting to speculate that this particular mutation has evolved as a result of adaptation to the human host, improving the interaction of the RdRp (nsp12) with nsp7, nsp8 and nsp14 and achieving proper replication and proofreading (19). The second consistent mutation with a high change in ΔΔG (A660S) was confirmed to be destabilizing but was identified in a single isolate from Japan.
It is known that the RdRp protein is a target for many antiviral drugs that bind to it and prevent its normal functioning (20). It was shown that a point mutation in the RdRp could lead to drug resistance (21). Numerous point mutations in the RdRp protein that cause resistance to effective antiviral drugs (primarily the nucleoside analogs ribavirin, 5-fluorouracil, and remdesivir) have been described (G64, V173, F483, V560, M618, D868, L420, double K159/A239) (22, 23). Thus, our newly identified mutations could have evolved as acquired resistance to antiviral drugs or as a host-specific antibody-escape mechanism. Further research is required to define how those mutations alter the replication/proofreading process and the efficiency of RdRp-targeted antiviral drugs. The described point mutations in the RNA-directed RNA Polymerase are associated with the efficiency of a drug against the SARS-CoV-2 isolates in a given country. This means that an antiviral drug has to be tested on several isolates specific to a particular region/population.
Recently, the structure of the RdRp (nsp12) protein has been determined (24). Based on this structure, the P323L mutation is located in the interface domain (residues A250 to R365). The interface domain is known to connect a nidovirus-specific N-terminal extension (NiRAN) domain (residues D60 to R249) and a right-hand RdRp domain (residues S367 to F920). It was predicted that two effective nucleotide analog antiviral drugs, remdesivir and sofosbuvir, bind to nsp12 and disrupt the interaction between the right-hand RdRp and NiRAN domains, thus inhibiting elongation (25). Further research is required to understand the effect of the P323L mutation of the interface domain on the efficiency of RdRp RNA synthesis and the performance of these drugs.
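The assignment of a mutation to a structural domain follows directly from the residue ranges quoted above and can be sketched as a simple lookup. The function names are hypothetical; the ranges are those given in the text, and residues outside them (for example residue 366, which falls between the stated interface and right-hand RdRp ranges) are reported as unassigned.

```python
import re
from typing import Optional

# Residue ranges of nsp12 as quoted in the text (inclusive, 1-based).
DOMAINS = [
    ("NiRAN", 60, 249),
    ("interface", 250, 365),
    ("right-hand RdRp", 367, 920),
]

MUTATION_RE = re.compile(r"^[A-Z](\d+)[A-Z]$")


def position_of(mutation: str) -> int:
    """Extract the residue number from a substitution such as 'P323L'."""
    match = MUTATION_RE.match(mutation)
    if match is None:
        raise ValueError(f"not a point substitution: {mutation!r}")
    return int(match.group(1))


def domain_of(position: int) -> Optional[str]:
    """Return the name of the domain containing `position`, or None."""
    for name, start, end in DOMAINS:
        if start <= position <= end:
            return name
    return None


for name in ("G179S", "E278D", "P323L", "L329I", "A449V", "A660S"):
    print(name, domain_of(position_of(name)))
```

Applied to the six human mutations reported here, this lookup places G179S in the NiRAN domain, E278D, P323L and L329I in the interface domain, and A449V and A660S in the right-hand RdRp domain.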
Several recent studies have investigated mutations in the RNA-directed RNA Polymerase, with rather contradictory results. Pachetti et al. (26) described the P323L mutation (referred to as the “14408” mutation in their manuscript) as predominant in the European population. Another paper (27) also defines P323L as a cross-continent mutation, mostly specific to Europe, with only a minor presence in Asia. On the contrary, the same mutation was identified as stabilizing in Indian isolates (28), although Chand et al. analyzed only isolates from India, using the DynaMut software. Altogether, these papers (26, 27, 28) support our conclusion that P323L is a worldwide, human-host-specific mutation in the RdRp domain of the nsp12 protein.
Our data suggest that the maximal change in vibrational entropy energy (ΔΔS VibENCoM) between wild type and mutant variants had a negative value, implying rigidification of the protein (Supplementary Table 6) (6).
Our data suggest that Rhinolophus sinicus, as a natural host for SARS-CoV, has a wide range of mutations (both conservative and radical) that mostly do not influence protein dynamics and stability. Multiple mutations, on the contrary, reduce free energy and have a stabilizing effect on the protein. Mus musculus, an artificial animal model used to study SARS-CoV, has no mutations in the RdRp domain. The secondary hosts (Mustela lutreola and Homo sapiens) share one common and frequent mutation, P323L, which was predicted to have a stabilizing effect on the protein. In addition to several conservative mutations with minor effects on the free energy, the human RdRp domain also contains 2 radical mutations (G179S and A660S) that were predicted to cause protein destabilization (Supplementary Table 6) (6).

CONCLUSION

We identified 25 mutations in the RdRp domain of the bat-derived SARS-CoV isolates. Those mutations represent a pool of neutral mutations with mostly minor effects on the protein free energy. Among the screened human isolates we found 6 mutations; one of them, the worldwide-present P323L, was predicted to have a stabilizing effect on the protein tertiary structure. Further research is necessary to understand the effect of the described mutations on the interaction of the RdRp with other proteins and on the emergence of isolates resistant to antiviral drugs.

CONFLICT OF INTEREST

The authors declared that they have no potential conflict of interest with respect to the authorship and/or publication of this article.

ACKNOWLEDGEMENTS

This research was supported by the Vitebsk State Academy of Veterinary Medicine.

AUTHORS’ CONTRIBUTION

SAD conceived and designed this research and performed the experiments. SAD and YKK carried out data analysis. SAD wrote the manuscript. YKK supervised the project. SAD and YKK reviewed and edited the manuscript.

References

  1. Wu, Z., Yang, L., Ren, X., He, G., Zhang, J., Yang, J., Qian, Z., et al. (2016). Deciphering the bat virome catalog to better understand the ecological diversity of bat viruses and the bat origin of emerging infectious diseases. ISME J. 10(3): 609-620. https://doi.org/10.1038/ismej.2015.138 PMid:26262818 PMCid:PMC4817686
  2. Huang, J., Song, W., Huang, H., Sun, Q. (2020). Pharmacological therapeutics targeting RNA-dependent RNA polymerase, proteinase and spike protein: from mechanistic studies to clinical trials for COVID-19. J Clin Med. 9(4): 1131. https://doi.org/10.3390/jcm9041131 PMid:32326602 PMCid:PMC7231166
  3. Ge, X.Y., Li, J.L., Yang, X.L., Chmura, A.A., Zhu, G., Epstein, J.H., Mazet, J.K., et al. (2013). Isolation and characterization of a bat SARS-like coronavirus that uses the ACE2 receptor. Nature 503(7477): 535-538. https://doi.org/10.1038/nature12711 PMid:24172901 PMCid:PMC5389864
  4. Oreshkova, N., Molenaar, R.J., Vreman, S., Harders, F., Munnink, B.B.O., Hakze, R., Gerhards, N., et al. (2020). SARS-CoV2 infection in farmed mink, Netherlands, April 2020 [Internet]. Microbiology; 2020 May [cited 2020 May 23]. Available from: http://biorxiv.org/lookup/doi/10.1101/2020.05.18.101493 https://doi.org/10.1101/2020.05.18.101493
  5. Gretebeck, L.M., Subbarao, K. (2015). Animal models for SARS and MERS coronaviruses. Curr Opin Virol. 13, 123-129. https://doi.org/10.1016/j.coviro.2015.06.009 PMid:26184451 PMCid:PMC4550498
  6. Dabravolski, S. (2020). The worldwide search for the new mutations in the RNA-directed RNA polymerase domain of SARS-CoV-2 [Supplementary data and figures]. Available at: https://osf.io/xtz6a/. https://doi.org/10.17605/OSF.IO/XTZ6A
  7. Edgar, R.C. (2004). Muscle: a multiple sequence alignment method with reduced time and space complexity. BMC Bioinformatics 5, 113. https://doi.org/10.1186/1471-2105-5-113 PMid:15318951 PMCid:PMC517706
  8. Okonechnikov, K., Golosova, O., Fursov, M. (2012). Unipro UGENE: a unified bioinformatics toolkit. Bioinformatics 28(8): 1166-1167. https://doi.org/10.1093/bioinformatics/bts091 PMid:22368248
  9. Buchan, D.W.A., Jones, D.T. (2019). The PSIPRED protein analysis workbench: 20 years on. Nucleic Acids Res. 47(W1): W402-W407. https://doi.org/10.1093/nar/gkz297 PMid:31251384 PMCid:PMC6602445
  10. Laimer, J., Hiebl-Flach, J., Lengauer, D., Lackner, P. (2016). MAESTRO web: a web server for structure-based protein stability prediction. Bioinformatics 32(9): 1414-1416. https://doi.org/10.1093/bioinformatics/btv769 PMid:26743508
  11. Rodrigues, C.H.M., Pires, D.E.V., Ascher, D.B. (2018). DynaMut: predicting the impact of mutations on protein conformation, flexibility and stability. Nucleic Acids Res. 46(W1): W350-W355. https://doi.org/10.1093/nar/gky300 PMid:29718330 PMCid:PMC6031064
  12. Pires, D.E.V., Ascher, D.B., Blundell, T.L. (2014). DUET: a server for predicting effects of mutations on protein stability using an integrated computational approach. Nucleic Acids Res. 42(W1):W314-W319. https://doi.org/10.1093/nar/gku411 PMid:24829462 PMCid:PMC4086143
  13. Duffy, S. (2018). Why are RNA virus mutation rates so damn high? PLOS Biol. 16(8): e3000003. https://doi.org/10.1371/journal.pbio.3000003 PMid:30102691 PMCid:PMC6107253
  14. Smith, E.C., Denison, M.R. (2013). Coronaviruses as DNA wannabes: a new model for the regulation of RNA virus replication fidelity. PLoS Pathog. 9(12): e1003760. https://doi.org/10.1371/journal.ppat.1003760 PMid:24348241 PMCid:PMC3857799
  15. Irwin, K.K., Renzette, N., Kowalik, T.F., Jensen, J.D. (2015). Antiviral drug resistance as an adaptive process. Virus Evol. 2(1): vew014. https://doi.org/10.1093/ve/vew014 PMid:28694997 PMCid:PMC5499642
  16. Frappier, V., Chartier, M., Najmanovich, R.J. (2015). ENCoM server: exploring protein conformational space and the effect of mutations on protein function and stability. Nucleic Acids Res. 43(W1): W395-400. https://doi.org/10.1093/nar/gkv343 PMid:25883149 PMCid:PMC4489264
  17. Bao, L., Deng, W., Huang, B., Gao, H., Liu, J., Ren, L., Wei, Q., et al. (2020). The pathogenicity of SARS-CoV-2 in hACE2 transgenic mice. Nature 583(7818): 830-833. https://doi.org/10.1038/s41586-020-2312-y PMid:32380511
  18. Zhou, P., Yang, X.L., Wang, X.G., Hu, B., Zhang, L., Zhang, W., Si, H.R., et al. (2020). A pneumonia outbreak associated with a new coronavirus of probable bat origin. Nature 579(7798): 270-273.
  19. Sexton, N.R., Smith, E.C., Blanc, H., Vignuzzi, M., Peersen, O.B., Denison, M.R. (2016). Homology-based identification of a mutation in the coronavirus RNA-dependent RNA polymerase that confers resistance to multiple mutagens. J Virol. 90(16): 7415-7428. https://doi.org/10.1128/JVI.00080-16 PMid:27279608 PMCid:PMC4984655
  20. Ruan, Z., Liu, C., Guo, Y., He, Z., Huang, X., Jia, X. (2020). Potential inhibitors targeting RNA-dependent RNA polymerase activity (NSP12) of SARS-CoV-2 [Internet]. Preprints  2020030024 [cited 2020 May 23]. Available from: https://www.preprints.org/manuscript/202003.0024/v1 https://doi.org/10.20944/preprints202003.0024.v1
  21. Pfeiffer, J.K., Kirkegaard, K. (2003). A single mutation in poliovirus RNA-dependent RNA polymerase confers resistance to mutagenic nucleotide analogs via increased fidelity. Proc Natl Acad Sci U S A. 100(12): 7289-7294. https://doi.org/10.1073/pnas.1232294100 PMid:12754380 PMCid:PMC165868
  22. Neogi, U., Hill, K.J., Ambikan, A.T., Heng, X., Quinn, T.P., Byrareddy, S.N., Sönnerborg, A., et al. (2020). Feasibility of known RNA polymerase inhibitors as Anti-SARS-CoV-2 drugs. Pathogens 9(5): 320. https://doi.org/10.3390/pathogens9050320 PMid:32357471 PMCid:PMC7281371
  23. Shannon, A., Le, N.T.T., Selisko, B., Eydoux, C., Alvarez, K., Guillemot, J.C., Decroly, E., et al. (2020). Remdesivir and SARS-CoV-2: Structural requirements at both nsp12 RdRp and nsp14 Exonuclease active-sites. Antiviral Res. 178, 104793. https://doi.org/10.1016/j.antiviral.2020.104793 PMid:32283108 PMCid:PMC7151495
  24. Gao, Y., Yan, L., Huang, Y., Liu, F., Zhao, Y., Cao, L., Wang, T., et al. (2020). Structure of the RNA-dependent RNA polymerase from COVID-19 virus. Science 368(6492): 779-782. https://doi.org/10.1126/science.abb7498 PMid:32277040 PMCid:PMC7164392
  25. Wang, M., Cao, R., Zhang, L., Yang, X., Liu, J., Xu, M., Shi, Z., et al. (2020). Remdesivir and chloroquine effectively inhibit the recently emerged novel coronavirus (2019-nCoV) in vitro. Cell Res. 30(3): 269-271. https://doi.org/10.1038/s41422-020-0282-0 PMid:32020029 PMCid:PMC7054408
  26. Pachetti, M., Marini, B., Benedetti, F., Giudici, F., Mauro, E., Storici, P., Masciovecchio, C., et al. (2020). Emerging SARS-CoV-2 mutation hot spots include a novel RNA-dependent-RNA polymerase variant. J Transl Med. 18(1): 179. https://doi.org/10.1186/s12967-020-02344-6 PMid:32321524 PMCid:PMC7174922
  27. Coppée, F., Lechien, J.R., Declèves, A.E., Tafforeau, L., Saussez, S. (2020). Severe acute respiratory syndrome coronavirus 2: virus mutations in specific European populations. New Microbes New Infect. 36, 100696. https://doi.org/10.1016/j.nmni.2020.100696 PMid:32509310 PMCid:PMC7238997
  28. Chand, G.B., Banerjee, A., Azad, G.K. (2020). Identification of novel mutations in RNA-dependent RNA polymerases of SARS-CoV-2 and their implications on its protein structure. PeerJ. 8, e9492. https://doi.org/10.7717/peerj.9492 PMid:32685291 PMCid:PMC7337032


Copyright

© 2020 Dabravolski S.A. This is an open-access article published under the terms of the Creative Commons Attribution License which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Citation Information

Macedonian Veterinary Review. Volume 44, Issue 1, Pages 87-94, e-ISSN 1857-7415, p-ISSN 1409-7621, DOI: 10.2478/macvetrev-2020-0036, 2021