Gossypetin Derivatives are also Putative Inhibitors of SARS-CoV-2: Results of a Computational Study



INTRODUCTION
Severe Acute Respiratory Syndrome (SARS) was the first new infectious disease identified in the twenty-first century. This acute and often severe respiratory illness is caused by a new coronavirus, SARS-Coronavirus (SARS-CoV). Between 2002 and 2003, the first outbreak of SARS reached all five continents by air-travel routes.
SARS-CoV is a zoonotic virus that resides in hosts that form its natural reservoir, such as bats. It can also infect intermediate hosts, such as small animals (for example, palm civets), before being transmitted to humans. SARS-CoV can infect and replicate in several cell types in humans and cause serious organ injuries [1].
A novel coronavirus strain linked with fatal respiratory illness was reported in late 2019. At the beginning of 2020, the World Health Organization (WHO) permanently named the 2019-nCoV pathogen SARS-CoV-2 and the disease it causes Coronavirus Disease 2019 (COVID-19). SARS-CoV-2 is the third highly virulent human coronavirus of the 21st century, after SARS-CoV and MERS-CoV, and the seventh strain of human coronavirus. It is taxonomically placed in the genus Betacoronavirus, and exhibits 89.1% and 60% nucleotide sequence similarity with the SARS and MERS coronaviruses, respectively [2,3]. Coronaviruses are large, enveloped, positive-sense single-stranded RNA viruses. They evolve rapidly, with a high frequency of gene recombination and mutation [4,5]. Like that of other betacoronaviruses, the genome of SARS-CoV-2 consists of a 5'-Untranslated Region (UTR), a replicase complex ORF1ab encoding Non-Structural Proteins (NSPs), a Spike Protein (S) gene, an Envelope Protein (E) gene, a Membrane Protein (M) gene, a Nucleocapsid Protein (N) gene, and a 3'-UTR containing several unidentified non-structural open reading frames (Figure 1).
Usually, the beta-coronavirus genome encodes a polypeptide of approximately 800 kDa. This polypeptide is proteolytically cleaved to generate various proteins. The proteolytic process is mediated by a Papain-Like Protease (PLpro) and a 3-Chymotrypsin-Like Protease (3CLpro). The 3CLpro gene is located at the 3'-end and exhibits a high degree of variability [5]. 3CLpro cleaves the polyprotein at 11 distinct sites, generating various non-structural proteins involved in virus replication. The functional importance of 3CLpro makes it an attractive target for the development of effective antiviral drugs [6,7]. 3CLpro is considered a validated, potentially non-toxic drug target for screening anti-coronavirus inhibitors [2]. Screening and repurposing of Food and Drug Administration approved antiviral drugs to inhibit 3CLpro may provide an alternative fast-track approach for identifying and developing new treatments for SARS-CoV-2 infection [5].
Like that of other coronaviruses, the Three-Dimensional (3-D) structure of SARS-CoV-2 3CLpro consists of three domains. Domains I (amino acid residues 8-101) and II (residues 102-184) are essentially beta-barrels and resemble chymotrypsin. Domain III (residues 201-306) is mainly composed of alpha-helices. The substrate-binding region is located in the cleft between Domains I and II. In addition to the catalytic center, amino acid residues of the subsites designated S1 to S5 play a key role in natural substrate binding: Thr25, His163, Glu166, Cys145, Gly143, His172, Phe140, Met49, His41, Met165, Asn142 and Gln189. The 3CLpro residues His41 and Cys145 are considered critical for the activity of the virus. This catalytic dyad is crucial for catalytic activity and essential for viral replication [5].
Currently, there are still no effective treatments for COVID-19. Development of new treatments may require months to years. So far, clinical trials have been quite disappointing, and drug toxicity issues have been raised, notably for chloroquine [8,9]. Therefore, there is a rationale to evaluate whether treatments of natural origin from aromatic and medicinal plants can prevent and/or treat COVID-19 [4,10,11].
A joint research team of the Shanghai Institute of Materia Medica and ShanghaiTech University performed in silico drug screening and enzyme activity tests. They have thus far identified 30 agents with potential efficacy against COVID-19, including Western medicines, natural products, and traditional Chinese medicines. They also found that Chinese herbal medicines such as Rhizoma Polygoni Cuspidati and Radix Sophorae Tonkinensis may contain active ingredients against SARS-CoV-2 [12].
Computational prediction by molecular docking is a reliable method to study interactions between candidate molecules and proteins. It is widely used in in silico experiments to screen drugs for inhibitory efficacy. Molecular docking is a type of bioinformatics modelling that involves the interaction of two or more molecules to give a stable adduct. Docking involves the placement of a ligand within a binding site and the prediction of the free energy of binding for such poses. The most important aspect of molecular docking is the calculation of the binding energy used to fit a ligand into a binding site. It is currently a reference method for finding agents against 2019-nCoV [13].
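To make the link between a predicted free energy of binding and inhibitory potency concrete, the sketch below converts a binding free energy into an estimated inhibition constant through the standard thermodynamic relation Ki = exp(ΔG / RT). The temperature of 298.15 K, the function name and the example energy are assumptions chosen for illustration; they are not values taken from this study.

```python
import math

GAS_CONSTANT_KCAL = 1.987e-3  # gas constant R in kcal/(mol*K)


def inhibition_constant(delta_g_kcal_per_mol, temperature_k=298.15):
    """Estimated inhibition constant (mol/L) from a binding free energy (kcal/mol).

    The default temperature is an assumption, not a value from the study.
    """
    return math.exp(delta_g_kcal_per_mol / (GAS_CONSTANT_KCAL * temperature_k))


# Hypothetical pose scored at -8.0 kcal/mol (illustrative value only)
ki = inhibition_constant(-8.0)
print(f"estimated Ki: {ki * 1e6:.2f} micromolar")
```

Because the relation is exponential, each additional kcal/mol of favourable binding energy lowers the estimated inhibition constant by roughly a factor of five at room temperature.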
Very few data are available on the clinical use of medicinal plants to prevent or cure COVID-19. However, scientific teams worldwide (India, China, Iran, Morocco, sub-Saharan Africa) have published computer-based prediction studies of molecules from their own pharmacopoeias, mostly targeting the binding affinity for the main SARS-CoV-2 protease, 3CLpro [5,14-16]. Therefore, in this study we evaluated the inhibition of the COVID-19 protease by Gossypetin-3'-O-glucoside (G3'G), a compound that has no crystallographic modelling studies despite its reported antifungal and antioxidant properties [17].
In this study we sought to model G3'G and to study its binding affinity, and hence its potential inhibitory capability, against the SARS-CoV-2 main protease 3CLpro, compared with other previously tested natural or pharmacological molecules.

Molecules
We selected pharmaceutical molecules included in the Guidelines (version 6) for the treatment of COVID-19, and molecules from medicinal plants. We defined a list of molecules that allowed us to assess the effectiveness of our method and to classify new candidate molecules. This is why we also selected compounds that had already been computationally tested. Some drugs and medicinal plant molecules were tested by molecular docking for the first time (Table 1).
Other three-dimensional structures of molecules of interest were obtained from PubChem (https://pubchem.ncbi.nlm.nih.gov/). It is a repository of chemical substances and their biological activities consisting of three databases: substance, compound, and bioassay [21].

3-D structures of the molecules of interest were obtained in PDB format.
Modelling: G3'G is not present in the PDB or PubChem databases. Using OPSIN [22], we determined the international name of G3'G, a chromen-4-one, allowing us to convert it to SMILES.
We also used the SMILES of Crocin and Lopinavir, for which no 3-D model is available. We used the "-h" and "-gen3D" options of Open Babel software to obtain a 3-D model of these 3 molecules in PDB format [23].
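The SMILES-to-3-D conversion described above can be sketched as a small wrapper around the Open Babel command line. This is a minimal illustration, assuming the `obabel` executable is installed and that PDB output is intended; the function names are hypothetical, the `--gen3d` spelling is the one used by current Open Babel releases, and the ethanol SMILES is only a placeholder, not one of the molecules studied.

```python
import subprocess


def obabel_gen3d_command(smiles, out_path):
    """Build an obabel command that turns a SMILES string into a 3-D PDB file.

    -:SMILES  read the molecule from the given SMILES string
    -opdb     write PDB format (assumed output format)
    -O        output file path
    -h        add hydrogens
    --gen3d   generate 3-D coordinates
    """
    return ["obabel", f"-:{smiles}", "-opdb", "-O", out_path, "-h", "--gen3d"]


def run_obabel(command):
    """Execute the command; requires Open Babel to be installed locally."""
    subprocess.run(command, check=True)


# Placeholder molecule (ethanol); the real SMILES strings are not reproduced here.
command = obabel_gen3d_command("CCO", "ethanol.pdb")
print(" ".join(command))
```

Passing the arguments as a list avoids shell quoting problems with SMILES strings, which often contain parentheses and other special characters.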

Consensus scoring (C-score):
To evaluate docking efficacy, we measured the number of favorable intermolecular interactions such as hydrogen bonds and hydrophobic contacts. To classify which ligands are most likely to interact strongly with the 3CLpro protein, we ranked them according to a consensus scoring method [32]. First, for each method we determined a score $S = (\Delta G - \mu)/\sigma$, where $\Delta G$ is the free binding energy of a molecule, $\mu$ is the mean value for the set of molecules, and $\sigma$ is the standard deviation observed for the set of molecules. Then, we normalized the score $S$ to $S' = (S - S_{min})/(S_{max} - S_{min})$, where $S_{max}$ and $S_{min}$ are the maximum and minimum S-score values of the set of molecules for a given method. Finally, we attributed a Consensus-Score (C-score) to each molecule: the sum of the three S'-scores obtained for that molecule. Our ranking is based on sorting the molecules by their consensus score.
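The consensus scoring steps above can be sketched in a few lines of Python. This is an illustration under stated assumptions: the ligand names and energies are made up, the population standard deviation is used (the text does not say which estimator was applied), and the lowest C-score is treated as the best rank because more negative binding energies are more favourable. Since the second step is a min-max normalization, the choice of standard deviation does not change the resulting S' values.

```python
from statistics import mean, pstdev


def normalized_scores(energies):
    """Map {ligand: free binding energy} to min-max normalized z-scores S'."""
    values = list(energies.values())
    mu, sigma = mean(values), pstdev(values)  # population SD is an assumption
    s = {name: (e - mu) / sigma for name, e in energies.items()}
    s_min, s_max = min(s.values()), max(s.values())
    return {name: (v - s_min) / (s_max - s_min) for name, v in s.items()}


def consensus_scores(energies_by_method):
    """Sum the S' value of each ligand over all docking methods."""
    per_method = [normalized_scores(e) for e in energies_by_method.values()]
    ligands = per_method[0].keys()
    return {name: sum(scores[name] for scores in per_method) for name in ligands}


# Hypothetical energies in kcal/mol, for illustration only
example = {
    "autodock": {"ligand_a": -9.0, "ligand_b": -7.0, "ligand_c": -5.0},
    "vina": {"ligand_a": -8.0, "ligand_b": -7.5, "ligand_c": -6.0},
    "smina": {"ligand_a": -8.5, "ligand_b": -6.5, "ligand_c": -6.0},
}
c_scores = consensus_scores(example)
ranking = sorted(c_scores, key=c_scores.get)  # lowest C-score first (assumed best)
print(ranking)
```

With three methods, the C-score ranges from 0 (best energy in every method) to 3 (worst energy in every method).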

Binding site determination
The determination of the Amino Acids (AA) in the active site is used to analyze the grid box and the docking evaluation results. We identified residues of the active site by visualizing the 3CLpro/native ligand linkages. We found several kinds of atomic links with critical AA of 3CLpro:
• H-bonds: Gly143, Phe140, His163, His164, Glu166,

Molecular docking analyses
Three-dimensional models of G3'G, Crocin and Lopinavir, which were not previously available, were obtained using Open Babel software and subsequently used in the docking analysis. We evaluated our products with regard to compliance with Lipinski's rules and other AutoDock 4 docking data (Table 3). Finally, the products analysed were presented according to their origin (natural or chemical synthesis) and their rank in the consensus score of free binding energy.
The compounds with the 6 most favourable free binding energies were natural molecules from medicinal plants (Table 4). All of them are polyphenols: flavonoids, flavonols or flavones. The best synthetic products in this ranking were

G3'G and derivatives docking
G3'G and the other gossypetin derivatives bind at least one of the essential AA of binding-site domains I and II of the catalytic center: Cys145 or His41. All other AA of the active site are involved depending on the ligand, but not all together. G3G and G8G are the only ligands that bind the critical conserved His41 as well as Cys145. Glu166 is a recurrent binding residue across gossypetin and all its derivatives.
They established thirteen bonds of several types. All the residues involved in the links with G8G are known residues of the active site. G7G binds residues that are probably new and non-critical: Thr24, Thr26, Thr45, Cys44, Ser46 and His163. Quercitrin, G3'G and Gossypetin-3-Glucoside (G3G) establish 9, 8 and 7 bonds, respectively, with known residues of the active site. We found that Quercitrin binds Met165 three times. Figure 2 summarizes all of these data.

DISCUSSION
We developed a computational method for predicting the best candidate inhibitors of SARS-CoV-2 [...] Its anti-SARS-CoV-2 properties have never been explored.

Figure 1
Schematic representation of genome sequence organisation, encoded proteins and 3CLpro-ligand structure.
[...] study the inhibition of the COVID-19 protease by compounds from herbal plants found in Martinique. Martinique is a tropical Caribbean island with rich biodiversity. The population makes extensive use of local medicinal plants. Some compounds from Caribbean medicinal plants have already been identified in protein databases for molecular docking tests. Many others have not, such as Gossypetin-3'-O-glucoside (G3'G). This molecule, isolated from the petals of Talipariti elatum Sw., is found almost exclusively in Martinique. It has no crystallographic modelling studies, [...]

Lipinski's rule :
According to Lipinski's rule, efficient drugs should have good permeation and oral absorption, which depend on compliance with the following 5 physical and chemical properties: molecular weight > 500, C logP > 5, [...]
Giguet Valard AG, et al. (2020), J Biomed Res Environ Sci, DOI: https://dx.doi.org/10.37871/jbres1144
The affinity of drug compounds is evaluated through the number and type (hydrogen or hydrophobic) of bonds formed with the active site of the protein. The lower the inhibition constant, the higher the affinity, while desolvation energy and intermolecular energy are proportional to affinity.
Nicotiflorin and Crocin present 3 violations. Chloroquine / Hydroxychloroquine and Ribavirin are the only drugs that show no violation. Among the natural compounds, Aloenin, Beta-eudesmol, Curcumin, Digitoxigenin, Kaempferol and Quercetin show no violation either. These are compounds from various classes: anthraquinones, sesquiterpenoid, polyphenol, cardenolide, flavonol, and flavonoid. G3'G / Gossypetin-3-Glucoside (G3G) / Gossypetin-8-Glucoside (G8G), Digitoxigenin, Quercitrin, and Cannabidiol among natural products, and Remdesivir, Chloroquine and Ribavirin among synthetic products, had the best inhibition constants. All Gossypetin derivatives, Nicotiflorin and Crocin each presented more than 10 H-bonds with the protein. Remdesivir has 13 H-bond acceptors and only 4 H-bond donors. By matching the rankings of the ligands with the best intermolecular energy and desolvation energy, we find that G3'G/G8G, Cannabidiol and Chloroquine are always present in the top 10.
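The violation counts discussed above come down to a simple threshold check, sketched below. The list of criteria in the text is truncated after the first two, so the hydrogen-bond donor and acceptor limits used here are the standard rule-of-five values and are an assumption about what the missing part stated; any fifth criterion is not reproduced. The class name, function name and descriptor values are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Descriptors:
    mol_weight: float  # molecular weight in g/mol
    clogp: float       # calculated octanol/water partition coefficient
    h_donors: int      # number of hydrogen-bond donors
    h_acceptors: int   # number of hydrogen-bond acceptors


def lipinski_violations(d):
    """Count rule-of-five violations (standard thresholds, assumed here)."""
    checks = [
        d.mol_weight > 500,
        d.clogp > 5,
        d.h_donors > 5,
        d.h_acceptors > 10,
    ]
    return sum(checks)


# Hypothetical descriptor sets, for illustration only
small_molecule = Descriptors(mol_weight=300.0, clogp=2.0, h_donors=2, h_acceptors=5)
large_molecule = Descriptors(mol_weight=602.6, clogp=2.0, h_donors=4, h_acceptors=13)
print(lipinski_violations(small_molecule), lipinski_violations(large_molecule))
```

A compound with 13 acceptors and 4 donors, for example, fails only the acceptor criterion among the hydrogen-bond checks.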
Lopinavir and Nelfinavir (rank 7 and 10).
[...] confirms the interest of polyphenols in the treatment of SARS-CoV-2 by inhibition of 3CLpro. In particular, the Gossypetin derivatives present in tropical Hibiscus varieties are classified for the first time according to a predictive in silico method, at the same level as Quercitrin or luteolin-7-glucoside proposed by Indian research teams, and present a priori a better inhibitory potential against 3CLpro than the Lopinavir / Nelfinavir treatments.
The computational study that we have carried out confirms the interest of flavonoids in the treatment of SARS-CoV-2 by inhibition of 3CLpro. It encourages research on local medicinal plants for the treatment of COVID-19. In particular, the secondary metabolites of Gossypetin extracted from tropical varieties of Hibiscus appear promising. Like all computational studies, our study should be supplemented with in vivo experiments to refine the therapeutic proposals. It could support proposals for local therapeutic or preventive trials and epidemiologic studies. To go further in proposing new treatments, molecules derived from medicinal plants seem interesting. We could broaden our screening of candidate compounds from local medicinal plants. We could target other SARS-CoV-2 elements, such as other enzymes or specific structural proteins. A molecular dynamics study could also be the next step to confirm the results or to evaluate combinations of therapeutics. In vivo experiments remain essential to confirm therapeutic efficacy.

Table 1 :
Set of molecules. Shows the PubChem ID and chemical formula of the molecules of interest, the usual trade name of drugs and the plant species of origin. P means medicinal plant origin and D means drug.
[...] consensual method, in which binding sites are predicted from eight methods: LIGSITEcs (LCS), PASS (PAS), Q-SiteFinder (QSF), SURFNET (SFN), Fpocket (FPK), GHECOM (GHE), ConCavity (CON) and POCASA (PCS). The results are combined to maximize the rate of successful prediction.
[...] pH 7.4 with Open Babel software [23]. We generated several conformers and minimized their structures using the MMFF94 force field. We used only the lowest-energy conformers. The native ligand was not minimized. For both ligands and protein, allocation of Gasteiger atomic partial charges and polar hydrogens was done with scripts from AutoDockTools (MGLTools 1.5.6): prepare_receptor4.py and prepare_ligand4.py. The docking site on the protein target was defined by establishing a grid box with the default grid spacing, centered on the position of the native ligand. The docking analyses were performed with AutoDock 4.2.6, AutoDock Vina version 1.1.2, and Smina version Oct 15 2019. AutoDock uses a Lamarckian genetic algorithm and an empirical free energy force field scoring function [28]. AutoDock Vina employs an iterated local search global optimizer and a hybrid scoring function (empirical + knowledge-based) [29]. Smina is a fork of AutoDock Vina that uses a scoring function called Vinardo, with enhanced features based on AutoDock Vina [30]. To obtain the free binding energy, we made several runs per method. We performed fifty docking runs per ligand with AutoDock 4.2.6 and with Smina, but only twenty runs with AutoDock Vina version 1.1.2, which cannot do more.
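The grid-box centring step can be illustrated with a short sketch that computes the geometric centre of the native ligand's atoms from PDB text, which is one plausible way to obtain the box centre. The function name `ligand_centroid` and the residue name passed to it are hypothetical; the text does not state how the centre coordinates were actually derived.

```python
def ligand_centroid(pdb_text, resname):
    """Mean x, y, z of all HETATM atoms whose residue name matches `resname`."""
    coords = []
    for line in pdb_text.splitlines():
        # PDB is a fixed-column format: residue name in columns 18-20,
        # coordinates in columns 31-38, 39-46 and 47-54 (1-based).
        if line.startswith("HETATM") and line[17:20].strip() == resname:
            coords.append(
                (float(line[30:38]), float(line[38:46]), float(line[46:54]))
            )
    if not coords:
        raise ValueError(f"no HETATM records found for residue {resname!r}")
    n = len(coords)
    # zip(*coords) regroups the tuples into x, y and z columns
    return tuple(sum(axis) / n for axis in zip(*coords))
```

The three returned values could then be supplied as the box centre, for instance through the `center_x`, `center_y` and `center_z` options of AutoDock Vina.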

Table 2 :
Free binding energy (∆G) by docking methods and consensus scores.

Table 3 :
Other docking data with regard to Lipinski rule violations.

Table 4 :
Rank of compounds with regard to Lipinski's rule violations.
[...] sabdariffa, T. elatum and T. tiliaceum); G3G or gossytrin (H. sabdariffa and T. tiliaceum) and G3'G (A. manihot and T. elatum). G3'G was extracted and characterized for the first time from the flower petals of Talipariti elatum Sw. from Martinique [17]. The high solubility of G3'G in the aqueous phase makes it, like many other molecules derived from medicinal plants, a good candidate for treatments. Its antifungal and antioxidant properties are already known.