Current Topics in Medicinal Chemistry, 2014, 14, 1473-1485
A Rational Workflow for Sequential Virtual Screening of Chemical Libraries on Searching for New Tyrosinase Inhibitors
Huong Le-Thi-Thu1,*, Gerardo M. Casañola-Martín2,3,4, Yovani Marrero-Ponce5, Antonio Rescigno6, Concepción Abad2 and Mahmud Tareq Hassan Khan7
1 School of Medicine and Pharmacy, Vietnam National University, Hanoi (VNU), 144 Xuan Thuy, Cau Giay, Hanoi, Vietnam; 2 Departament de Bioquímica i Biologia Molecular, Universitat de València, E-46100 Burjassot, Spain; 3 Unidad de Investigación de Diseño de Fármacos y Conectividad Molecular, Departamento de Química Física, Facultad de Farmacia, Universitat de València, Spain; 4 Centro de Información y Gestión Tecnológica, Ministerio de Ciencia Tecnología y Medio Ambiente (CITMA), 65100, Ciego de Avila, Cuba; 5 Enviromental and Computational Chemistry Group, Facultad de Química Farmacéutica, Universidad de Cartagena, Cartagena de Indias, Bolivar, Colombia; 6 Sezione di Chimica Biologica, Dip. Scienze e Tecnologie Biomediche, Università di Cagliari, Cittadella Universitaria, 09042 Monserrato (CA), Italy; 7 Present address: Holmboevegen 3B, 9010 Tromso, Norway
Abstract: Tyrosinase is a bifunctional, copper-containing enzyme widely distributed across the phylogenetic tree. This enzyme is involved in the production of melanin and other pigments in humans, animals and plants, including skin pigmentation in mammals and the browning process in plants and vegetables. Therefore, inhibitors of this enzyme have attracted the attention of the scientific community because of their broad applications in the food, cosmetic, agricultural and medicinal fields, where they are used to avoid the undesirable effects of abnormal melanin overproduction. However, the search for novel chemicals with anti-tyrosinase activity demands more efficient tools to speed up the tyrosinase inhibitor discovery process. This chapter focuses on the different components of a predictive modeling workflow for the identification and prioritization of potential new compounds with activity against the tyrosinase enzyme. In this case, two structural chemical libraries, Spectrum Collection and Drugbank, are used in an attempt to combine different virtual screening and data mining techniques in a sequential manner, helping to avoid the usually expensive and time-consuming traditional methods. The sequential steps summarized here comprise the use of drug-likeness filters, similarity searching, classification and potency QSAR multiclassifier systems, modeling of molecular interaction systems, and similarity/diversity analysis. Finally, the methodologies shown here provide a rational workflow for virtual screening hit analysis and selection as a promising drug discovery strategy for use in the target identification phase.
Keywords: Drug-likeness filtering, molecular docking, QSAR modeling, similarity searching, tyrosinase inhibitor, virtual
screening
1 INTRODUCTION
Tyrosinase (monophenol monooxygenase; EC 1.14.18.1) is a metalloenzyme oxidase widely distributed across the phylogenetic tree. This enzyme catalyzes the first two steps of the melanin synthesis pathway: the hydroxylation of L-tyrosine to 3,4-dihydroxyphenylalanine (L-DOPA; monophenolase activity) and the subsequent oxidation of L-DOPA to dopaquinone (diphenolase activity) [1]. Because of its main role in melanogenesis, abnormal tyrosinase regulation is related to skin disorders such as hyperpigmentation, melasma (acquired hyperpigmentation), post-inflammatory melanoderma, solar lentigo, etc. [2-4]. Hence, tyrosinase inhibitors have been widely used as depigmenting agents for the treatment of these pigmentation disorders [5-7].
*Address correspondence to this author at the School of Medicine and
Pharmacy, Vietnam National University, Hanoi (VNU) 144 Xuan Thuy,
Cau Giay, Hanoi, Vietnam;
Tels: 53-33-223066 (Cuba) and 963543156 (València);
Faxes: 53-33-223066 (Cuba) and 963543156 (València);
E-mails: gmaikelc@gmail.com or gmaikelc@yahoo.es
1873-5294/14 $58.00+.00 © 2014 Bentham Science Publishers
In recent times, besides QSAR methodologies, other data mining techniques with high accuracy levels have been introduced in drug discovery [8]. Successfully integrating these data is a complex task for today's researchers. In this sense, current drug discovery scenarios are introducing standard workflows for screening structural chemical libraries [9]. This approach is advantageous because compounds can be purchased directly from the suppliers, avoiding time-consuming synthesis or isolation processes [10]. Moreover, campaigns associated with massive virtual HTS are continuously increasing and gaining accuracy in the prioritization of chemicals, owing to the introduction of several ligand- and structure-based methodologies [11-12].
Therefore, to manage larger compound libraries and meet the expectations of the drug discovery process, in silico virtual screening and computer-aided drug design have become increasingly important [13]. In our research group, this approach has been applied in the discovery of novel tyrosinase inhibitors (TIs) such as cycloartanes [14-15], diterpenoidal alkaloids [16], tetraketones [17], coumarins [18], and so on. In these studies, classification QSAR-based virtual screening (VS) was employed on in-house congeneric chemical libraries from different laboratories to distinguish TIs from inactive compounds. Later, a new class of QSAR models, named potency models [19-20], was developed. These models can be used as a cascade system together with the classification models for a more complete description of tyrosinase inhibitory activity. Moreover, this type of model helps to identify true positives and to prioritize compounds adequately. In recent work, all these models were assembled in different multi-classifier systems (MCSs) that improved the performance of the QSAR methods [21].
Accordingly, in this chapter we present a procedure that combines these and other VS strategies in the computational search for novel tyrosinase inhibitors. This framework was employed effectively to discover new chemical entities with anti-tyrosinase activity. Finally, it is important to point out that the virtual screening approaches mentioned comprise drug-likeness filters, similarity searching, classification and potency QSAR multi-classifier systems, molecular docking studies, and post-processing procedures, which were applied in a sequential manner to the Spectrum Collection and Drugbank databases.
2 TWO STRUCTURE CHEMICAL LIBRARIES:
SPECTRUM COLLECTION AND DRUGBANK
The growth of computational resources has brought a rapid increase in structural chemical databases, available either online or as company repositories. The most interesting examples are the huge ZINC and ChemSpider databases, comprising 13 million and 26 million compounds, respectively. These two chemical data sources are among the main sixty-four free databases available nowadays [22]. Therefore, the screening of huge structural chemical databases is becoming one of the hottest topics in compound retrieval using QSAR, ligand-based or structure-based data mining procedures. In this sense, some authors have published interesting updated reviews on this topic [23-24]. Here we present the results obtained on the Spectrum Collection (http://www.msdiscovery.com/spectrum.html) and Drugbank (www.drugbank.ca), which consist of 2 000 and 6 827 compounds, respectively. Both were screened using a sequential virtual screening strategy, looking for potential therapeutic chemicals for the treatment of hyperpigmentation disorders. A flowchart depicting the various steps of virtual screening, including database filtration, similarity searching, QSAR modeling, docking and clustering studies to prioritize the virtual hits, is shown in Fig. (1).
Fig. (1) displays the stepwise virtual screening workflow, which resulted in the discovery of novel scaffolds against the tyrosinase enzyme. The protocol was ordered according to the computational hierarchy of each filter, its CPU time consumption and the complexity of the input information required for each step. This hierarchical procedure gradually reduces the number of selected compounds (retrieved as novel TIs) after each filter. This strategy was employed to virtually screen two databases (Spectrum Collection and Drugbank).
On the other hand, many chemical libraries, such as the PubChem [25] online database, are web-based systems with well-recognized facilities for performing some pre-processing tasks easily. This is also the case for the Spectrum Collection and Drugbank, where some drug-likeness filters or similarity searching methods are implemented as tools for search and retrieval. Therefore, some of these services were used in these studies.
3 DRUG-LIKENESS FILTERS
The term “druglike” [26-29] is used in pharmaceutical research to describe molecules with properties that fall within the boundaries delineated by the vast majority of pharmaceutical agents. The question is which of the many possible molecular properties most directly influence the drug-like character of a molecule in a specific type of research. Lipinski et al. [30] defined the so-called “rule of five” (sometimes abbreviated as RoF) in an effort to answer this question. The core of this concept is the examination of different parameters, such as the number of rotatable bonds (nRotB), polar surface area (PSA), log D, and counts of nitrogen and oxygen atoms, in an effort to define easily calculated properties that are predictive of a favorable outcome, and to establish major cutoffs for these and other physical-chemical properties [31-35]. However, the Lipinski thresholds can be too rigid on occasion. Hence, some scientists in this field have proposed other boundaries and criteria for drug-likeness filters [36].
Fig. (1). Sequential virtual screening workflow used in the identification of promising TIs and the filtering of compounds involved in each of the different steps from the Spectrum Collection and DrugBank databases.
Taking this into consideration, in our work we applied the upper limits of all these filters. In our case, a compound was excluded from the next steps if it had a molecular weight (MW) above 700 g/mol; a computed octanol–water partition coefficient (CLogP) higher than 7; a number of hydrogen bond donors (nHBDon) above 5 or acceptors (nHBAc) above 10; a number of rotatable bonds (nRotB) higher than 10; or a polar surface area (PSA) above 140 Å². All these descriptors were calculated with our in-house TOMOCOMD-CARDD software (acronym for TOpological MOlecular COMputational Design - Computed-Aided ‘Rational’ Drug Design). These molecular descriptors are implemented in a new module (DESPOOLs, acronym of DEScriptor POOLs) of our program [37], which offers calculation of several 0-3D indices, computed mainly using The Chemistry Development Kit [38]. The first filtering step consisted of applying the criteria described above to the Spectrum Collection database (http://www.msdiscovery.com/spectrum.html). This first step reduces the number of chemicals (negative design) by means of the drug-likeness filters. These are simple and fast, and to some extent allow potency and pharmacokinetics to be “optimized” simultaneously [39]. Thus, we sorted these 2 000 compounds using the upper boundaries of all the filters reported in the literature, and finally 1 394 compounds were considered for the next step.
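The upper-limit filter described above can be sketched in a few lines of Python. This is a minimal illustration, not the TOMOCOMD-CARDD implementation: it assumes the six descriptors have already been computed and are supplied as plain dictionaries, and the compound identifiers and descriptor values are made up for the example. Only the threshold values come from the text.

```python
# Upper limits quoted in the text above; everything else here is illustrative.
UPPER_LIMITS = {
    "MW": 700.0,    # molecular weight, g/mol
    "CLogP": 7.0,   # computed octanol-water partition coefficient
    "nHBDon": 5,    # hydrogen bond donors
    "nHBAc": 10,    # hydrogen bond acceptors
    "nRotB": 10,    # rotatable bonds
    "PSA": 140.0,   # polar surface area, in square angstroms
}


def passes_filters(descriptors):
    """Return True if no descriptor exceeds its upper limit."""
    return all(descriptors[name] <= limit for name, limit in UPPER_LIMITS.items())


def filter_library(library):
    """Keep only the compounds that pass every drug-likeness filter."""
    return {cid: d for cid, d in library.items() if passes_filters(d)}


# Hypothetical descriptor values for two made-up compounds.
library = {
    "cpd_small": {"MW": 251.1, "CLogP": 4.9, "nHBDon": 1, "nHBAc": 4, "nRotB": 2, "PSA": 46.2},
    "cpd_large": {"MW": 812.4, "CLogP": 3.1, "nHBDon": 6, "nHBAc": 12, "nRotB": 14, "PSA": 210.0},
}
kept = filter_library(library)
print(sorted(kept))
```

Because the filter is a negative design step, a compound is rejected as soon as any single descriptor exceeds its limit; values exactly at a limit are kept, following the "above"/"higher than" wording of the criteria.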
4 SIMILARITY SEARCHING
Similarity searching identifies those database molecules that are most similar to reference structures, using some quantitative definition of intermolecular structural similarity. The reference structures and the molecules in the database are characterized by one or more molecular descriptors. Comparing them yields a similarity measurement between the reference structure and each of the database structures, and the latter are then sorted in order of decreasing similarity to the reference (target) structure. The output of the search is a ranked list in which the structures calculated to be most similar to the reference structure are located at the top. These chemicals are those with the greatest probability of being of interest to the user, given an appropriate measure of intermolecular structural similarity. Similarity methods are extremely useful at the beginning of a drug discovery project, because they need little information about the target and only a few known active compounds. Moreover, implementations of similarity methods are generally computationally inexpensive, so searching large databases can be performed routinely. The result of this step is a focused library, since all included compounds share common features with the reference compounds. In our case the data fusion method [40] was applied. The structures of the reference compounds are given in Table 1. A hierarchical cluster analysis, k-NNCA, was executed to visualize the distribution of the reference compounds in different groups. A dendrogram for these compounds is shown in Fig. (2). As can be seen, there is great structural diversity among these chemicals, which represent different molecular subsystems important for the tyrosinase activity.
A set of 15 strong tyrosinase inhibitors of diverse structures was selected as reference compounds. The molecular structures of these chemicals are given in Table 1.
MACCS fingerprints [41] were calculated to characterize the reference structures and those of the database using the TOMOCOMD-CARDD software. These fingerprints are implemented in a new module (MOLFIP, acronym of MOLecular FIngerPrints) of our program [37], which offers calculation of several fingerprints, computed mainly using The Chemistry Development Kit [38]. The Tanimoto coefficient [42], commonly used for binary data, was computed as the metric of intermolecular comparison (each compound with every other in its activity class). A specific database molecule $i$ appears at rank position $r_{ij}$ ($1 \le i \le n$, $1 \le j \le 15$) with a similarity score $s_j$ against each reference structure $j$. We used the MAX fusion rule to combine the similarity scores, so the final fused score was established as shown in Equation 1:

$$s_f = \max\{s_1, s_2, \ldots, s_{15}\} \qquad (1)$$
Later, each molecule of the database was sorted by its fused score, $s_f$. The compounds in the top 30% of the ranked data set were retrieved for the next step. Applying this procedure to the Spectrum Collection database structures eliminated 1285 molecules. The remaining 109 compounds are similar in some way to at least one of the 15 reference compounds (positive design).
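The MAX fusion rule of Equation 1 can be illustrated with a short, self-contained Python sketch. It assumes each binary fingerprint is represented as a set of "on" bit indices; the toy fingerprints, the compound labels and the helper names are hypothetical, and the way the cutoff is rounded to a whole number of compounds is an assumption of this sketch rather than something stated in the text.

```python
def tanimoto(a, b):
    """Tanimoto coefficient of two binary fingerprints given as sets of on-bit indices."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def max_fused_score(fingerprint, references):
    """MAX fusion rule: the highest similarity to any reference structure."""
    return max(tanimoto(fingerprint, ref) for ref in references)


def rank_by_fused_score(database, references, top_fraction=0.30):
    """Sort the database by fused score and keep the top fraction of compounds."""
    scored = sorted(
        ((max_fused_score(fp, references), cid) for cid, fp in database.items()),
        reverse=True,
    )
    n_keep = max(1, int(top_fraction * len(scored)))  # rounding choice is an assumption
    return [(cid, score) for score, cid in scored[:n_keep]]


# Toy fingerprints (hypothetical bit indices), two references and four database molecules.
references = [frozenset({1, 2, 3}), frozenset({7, 8})]
database = {
    "A": frozenset({1, 2, 3}),
    "B": frozenset({1, 2}),
    "C": frozenset({9}),
    "D": frozenset({7, 8, 9}),
}
hits = rank_by_fused_score(database, references)
print(hits)
```

With MAX fusion a database molecule only needs to resemble one of the reference structures to rank highly, which suits a structurally diverse reference set such as the 15 inhibitors used here.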
The Drugbank database (www.drugbank.ca) of 6 827 drugs was also screened using a procedure similar to the one above. In this case, the similarity searching (data fusion by maximum score using 15 strong TIs as reference structures) was applied first, because DrugBank offers this option in the management of its search database. This procedure eliminated 6659 compounds, representing 97.54% of the chemicals in the database. Duplicates and compounds already reported against tyrosinase were removed, and the remaining ones were filtered using the drug-likeness criteria mentioned in the section above. From this, 131 compounds were selected and considered for the next step.
5 MULTICLASSIFIERS GUIDED BY QSAR MODELS
In recent times, Quantitative Structure-Activity Relationships (QSARs) have been the most widely used approach in drug design and have been applied successfully in the discovery of novel tyrosinase inhibitors [18, 43-49]. Hence, this method can constitute the principal “switch” of the sequential workflow, aiding the identification of new lead compounds. The binary QSARs for tyrosinase inhibitors are described in previous reports [18, 43-49]; therefore only a brief outline is given here. A first training set of 1072 compounds was collected, with 526 chemicals classified as “active” (TIs) and 546 compounds as “inactive” (non-TIs). The molecular structures and properties were correlated with biological activity using TOMOCOMD-CARDD descriptors, and different classification models were generated. These models enable TIs to be distinguished from inactive compounds. Second, the potency models were obtained using a learning set of 257 strong TIs and 141 moderate-to-weak compounds [19-20]. The latter models are used hierarchically with the models fitted on the first database, for a more complete description of tyrosinase inhibitory activity.
Table 1. Structures of reference compounds in the similarity study (structure drawings omitted; compound numbers and names only)
1 Kojic acid
2 L-mimosine
3 L-Tropolone
4 N-cyclopentyl-N-nitrosohydroxylamine
5 Methyl ester of gentisic acid
6 Kurarinone
7 Phenyl-thiourea
8 BP4
9 8´-epi-cleomiscosin A
10 (no name given)
11 4-Prenyloxy-resveratrol
12 Alkyl-thiocarbamate E
13 Benzylbenzamide 15
14 3-Hydroxy-4-methoxy benzaldehyde thiosemicarbazone
15 TK21
Afterward, we introduced other statistical techniques [quadratic discriminant analysis (QDA), binary logistic regression (BLR) and classification tree (CT) [20]] and several machine learning approaches [support vector machine (SVM), artificial neural network (ANN), Bayesian networks (BNs), k-nearest neighbors (KNN) [19]], which enhanced the performance of the previous LDA-QSAR models on both databases. These single classifiers can be used to predict tyrosinase inhibitory activity for new chemicals. However, many factors can affect the performance of those classifiers. Selecting the best available classifier is an option, but because the distribution of new chemicals that the classifier may meet during operation may vary (slightly or significantly, depending on the application), this approach does not provide the best solution in all cases. Furthermore, because many classifiers are generally tried before a single classifier is selected, this approach also discards valuable information by ignoring the performance of all the other classifiers [50]. For this reason, the combination of multiple classifiers has been proposed in the field of machine learning to improve on the performance of single-classifier approaches [51-53]. These multiple classifier systems (MCSs) are based on the combination of several classifiers such that their union achieves higher performance than the stand-alone classifiers. Hence, an ensemble of classifiers is a set of classifiers whose individual classification decisions are combined in some way [54]. Many studies have demonstrated that ensembles often outperform their base models (the component models of the ensemble) if the base models perform well on novel examples and tend to make errors on different examples [55].
Fig. (2). Dendrogram illustrating the results of the hierarchical k-NNCA of strong TIs used as reference compounds.
In the case of the tyrosinase inhibitor QSAR equations, to meet the increasing performance demands in modeling tyrosinase activity, the individual models obtained were assembled in different multi-classifier systems (MCSs), which improved their performance as classifiers for tyrosinase inhibitory activity prediction [21].
For the Spectrum Collection, the compounds found by similarity searching were screened by QSAR (another positive design approach) using the classification MCS based on average probability (AP) [21] to identify new TIs. Thus, 65/109 compounds were identified as active against the tyrosinase enzyme. It is important to highlight that most of the inactive compounds identified by QSAR were the last ones in the list of 109 compounds. The same occurred in the case of the Drugbank database, where 119/131 were identified as actives. This justifies the selection of the cutoff value for similarity searching. The compounds identified by classification QSAR were then screened by potency QSAR using a boosting ensemble based on support vector machines [21]. This potency MCS identified 25 and 107 compounds from the Spectrum Collection and Drugbank, respectively, as strong TIs.
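The two-stage cascade described above (a classification ensemble followed by a potency stage applied only to predicted actives) can be sketched as follows. This is an illustration under stated assumptions, not the published models: the base classifiers are stand-in functions returning a probability of activity, the 0.5 decision threshold is assumed, and the potency stage is passed in as an arbitrary yes/no callable rather than the SVM-based boosting ensemble used in the study.

```python
from statistics import fmean


def average_probability(models, x):
    """Average-probability rule: mean of P(active) over all base classifiers."""
    return fmean([model(x) for model in models])


def cascade_screen(library, class_models, potency_stage, threshold=0.5):
    """Stage 1 keeps compounds whose averaged P(active) reaches the threshold;
    stage 2 applies the potency stage to those predicted actives only."""
    actives = [
        cid for cid, x in library.items()
        if average_probability(class_models, x) >= threshold
    ]
    strong = [cid for cid in actives if potency_stage(library[cid])]
    return actives, strong


# Stand-in base classifiers and potency rule (hypothetical, for illustration only).
class_models = [lambda x: x[0], lambda x: x[1]]
potency_stage = lambda x: x[0] > 0.8

# Hypothetical two-value "descriptor vectors" for three made-up compounds.
library = {"a": (0.9, 0.7), "b": (0.2, 0.6), "c": (0.6, 0.6)}
actives, strong = cascade_screen(library, class_models, potency_stage)
print(actives, strong)
```

Running the potency stage only on predicted actives mirrors the sequential design of the workflow, in which each step receives only the compounds retained by the previous one.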
6 MODELING MOLECULAR INTERACTIONS SYSTEMS
The next step was molecular docking, which consists of posing each ligand in the binding site of the target. This gives a predicted binding mode for each database compound, together with a measure of the quality of fit of the compound in the target binding site. This information is used to rank the compounds with a view to selecting and experimentally testing a small subset for biological activity [56]. The docking calculations for the strong TIs identified by QSAR in the studies mentioned above were performed using the ICM™ docking module with the default setup, as described earlier [57-59].
6.1 Preparations of the Inhibitors and Target Molecules
The 2D structure of each compound (in mol file format) was converted to 3D and energy minimized in the 3D space of the ICM environment. Atom types based on the local chemical environment, Merck Molecular Force Field (MMFF) [60-66] formal charges and 3D topology were assigned. The lowest-energy conformers of the compounds were then docked into the active site of the three-dimensional structure of tyrosinase (PDB ID: 3NQ1).
All the docking calculations were performed using the “interactive docking” menu of the ICM environment. After docking, the stack of docking poses was checked visually. Multiple stack conformations were selected based on their docking energies, rmsd values (compared between the docked model and the x-ray conformation) and similarities to closely related x-ray crystal structures from the PDB. The best conformations for each compound were then chosen. For each of the individual docked complexes, the free energy of binding ($\Delta G_{cal}$) between the protein and ligand was calculated using an ICM script according to Equations 2 and 3 [67]:

$$\Delta G_{cal} = \Delta G_{H} + \Delta G_{el} + \Delta G_{s} + C \qquad (2)$$

$$\Delta G_{cal} = \Delta G_{H} + \Delta G_{col} + \Delta G_{\text{des-sol}} + \Delta G_{s} + C \qquad (3)$$

Here, $\Delta G_{H}$ is the hydrophobic or cavity term, which accounts for the variation of the water/non-water interface area. $\Delta G_{el}$ is the electrostatic term, composed of coulombic ($\Delta G_{col}$) interactions and desolvation ($\Delta G_{\text{des-sol}}$) of partial charges transferred from an aqueous medium to a protein core environment. $\Delta G_{s}$ is the entropic term, which results from the decrease in the conformational freedom of functional groups buried upon complexation. Finally, $C$ is a constant that accounts for the change in entropy of the system due to the decrease in the concentration of free molecules (cratic factor) and the loss of rotational/translational degrees of freedom [67].
After the docking preparation, the strong TIs identified by QSAR were docked in the active site of tyrosinase (PDB ID: 3NQ1) using the ICM program [57]. In this study, only the chemicals selected from the Spectrum Collection were used in the structure-based study. Docking is an effective method for prioritizing ligands with favorable interactions with the receptor and can also be seen as a positive design. For each case, the binding energy (BE) was obtained and used as the score for binding mode prediction of the compounds. First, we calculated the BE for a set of strong, moderate-to-weak and inactive compounds with known activity against tyrosinase. The docking process revealed that only one of the 25 compounds selected by QSAR did not fit the protein binding site favorably. This result showed good correspondence between the QSAR and docking approaches.
7 POST-PROCESSING ANALYSIS
Some methodologies for post-processing after sequential virtual screening were assessed. In our case, we selected the k-NNCA (k-nearest neighbors cluster analysis) and k-MCA (k-means cluster analysis) algorithms [68-69] to study the similarity/diversity among the retrieved active compounds, and between these and known active compounds. These two types of cluster analysis (CA) were chosen because they are methods capable of recognizing similarities among cases (objects) or among variables and of singling out categories as sets of similar cases (or variables). This enables the selection of novel scaffolds for tyrosinase inhibitors. Before carrying out the clustering, all the variables were standardized. In standardization, all values of the selected variables (molecular descriptors) were replaced by standardized values, computed as follows: Std score = (raw score - mean)/Std deviation
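The standardization formula above corresponds to a column-wise z-score. A minimal Python sketch is given below; it assumes the sample standard deviation (n - 1 denominator), which the text does not specify, and maps a constant descriptor to zeros to avoid division by zero. The descriptor names and values are illustrative only.

```python
from statistics import mean, stdev


def standardize(columns):
    """Replace each descriptor value by (raw score - mean) / standard deviation.

    `columns` maps a descriptor name to its list of raw values across compounds.
    The sample standard deviation is used here (an assumption of this sketch).
    """
    standardized = {}
    for name, values in columns.items():
        mu = mean(values)
        sd = stdev(values)
        if sd == 0:
            # A constant descriptor carries no information for clustering.
            standardized[name] = [0.0 for _ in values]
        else:
            standardized[name] = [(v - mu) / sd for v in values]
    return standardized


# Illustrative raw descriptor values for three made-up compounds.
raw = {"MW": [200.0, 300.0, 400.0], "nRotB": [2.0, 2.0, 2.0]}
z = standardize(raw)
print(z)
```

Standardizing first keeps descriptors with large numeric ranges, such as molecular weight, from dominating the distance calculations used by the clustering algorithms.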
Finally, by a CA of the database active compounds and the retrieved ones, plus a detailed visual inspection, for the Spectrum Collection 19 out of 24 compounds were selected to be evaluated experimentally. It is important to note that, among the six compounds removed, some have been reported in the literature as active against tyrosinase, such as hinokitiol [70] and angolensine [71], although the Spectrum Collection itself does not report this activity. This fact confirmed the applicability of our protocol in the discovery of novel anti-tyrosinase lead compounds from large databases. Table 2 shows the traditional uses and the values from the different in silico studies; the molecular structures of these compounds are given in Fig. (3).
In the case of the Drugbank database, after cluster analysis and visual inspection we selected 32 compounds for enzyme assays. The structures of these compounds are shown in Table 3, and Table 4 shows the traditional uses and the values from the different in silico studies for these drugs.
The flowchart in Fig. (1) is a schematic representation of the rational sequential VS workflow, with the number of hits remaining after each screening step in both databases. Using the sequential workflow, a total of fifty-one putative novel TIs were successfully identified, which can be purchased and further evaluated in enzymatic experimental corroboration.
As can be seen in both cases, many compounds identified as new TIs are already known drugs. This avoids the time-consuming process of bringing new drugs to market, because these rediscovered drugs are already in use and their pharmacokinetic and toxicological properties are well known [72]. These newly identified drugs could be introduced into the market in the shortest time possible, thus accelerating the discovery of new drugs for treating hyperpigmentation disorders.
8 FUTURE TRENDS ON WORKFLOWS FOR BIOACTIVITY PREDICTION
Prediction of bioactivity, or of any type of property, has always been one of the challenges in data mining. In the selection of adequate activity against a target, the most arduous task is the correct identification of lead compounds or promising high-activity chemicals that could lead to drug-like compounds after examination of their ADMET properties. Many predictive workflows have been reported in the literature, most of them focused on the use of 3D-QSAR (CoMFA, CoMSIA) and pharmacophore approaches together with docking studies [73-74]. Because drug discovery is a highly complex and costly process, the integration of innovation, knowledge, information, technologies, expertise, investments and management skills is required. In this way, multistep VS can help identify bioactive substances from a large screening compound pool with limited experimental effort, making it possible to focus rapidly on the most promising candidate structures.
Regarding specific workflow scenarios, some questions that remain unsolved arose while writing this chapter, such as the use of sub-workflows integrated into several multi-sequential workflows, each responsible for one step toward adequate drug-like (that is, ADMET) properties. Moreover, other aspects of the workflow should be examined, such as its accuracy, the sequence of combination, the optimal number of sub-workflows to be used, and the thresholds established for each workflow.
Finally, these results offer a suitable alternative for the new era of open online chemical databases and encourage their use. Together with emerging approaches based on new technologies, such as massive computing algorithms and cloud computing, they could have an overwhelming impact on virtual screening procedures based on ensemble workflows, helping to solve several questions that remain open in the drug discovery pipeline.
Table 2. Results of different in silico filters of the VS protocol on the Spectrum Collection database

| ID* | Compound | Bioactivity | MW(a) | LogP(b) | nHBDon(c) | nHBAc(d) | nRotB(e) | PSA(f) | S_f(g) | Similarity ranking | Predicted class(h) | Estimated potency(i) | Docking BE(k) (kcal/mol) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1500485 | Phenytoin sodium | Anticonvulsant, antiepileptic | 251.08 | 4.89 | 1.00 | 4.00 | 2.00 | 46.17 | 0.825 | 7 | Act | P | -3.7 |
| 1503801 | Naproxol | Anti-inflammatory, analgesic, antipyretic | 468.21 | 2.35 | 0.00 | 6.00 | 5.00 | 95.34 | 0.816 | 10 | Act | P | -2.2 |
| 1505130 | 3,4-Dimethoxycinnamic acid | - | 208.07 | 1.75 | 1.00 | 2.00 | 4.00 | 55.76 | 0.786 | 19 | Act | P | -3.1 |
| 1505311 | Dibenzoylmethane | Antineoplastic | 224.08 | 5.72 | 0.00 | 2.00 | 4.00 | 34.14 | 0.756 | 38 | Act | P | -4.1 |
| 1505140 | 2',4-Dihydroxy-3,4',6'-trimethoxychalcone | - | 330.11 | 2.18 | 2.00 | 1.00 | 6.00 | 85.22 | 0.742 | 49 | Act | P | -6.6 |
| 1504152 | Nilutamide | Antiandrogen | 317.06 | 3.22 | 1.00 | 4.00 | 3.00 | 92.55 | 0.735 | 59 | Act | P | -3.3 |
| 1503032 | Dipyrocetyl | Antirheumatic, analgesic | 238.05 | 1.59 | 1.00 | 4.00 | 5.00 | 89.90 | 0.724 | 67 | Act | P | -0.7 |
| 300610 | Acetosyringone | Insect attractant, plant hormone | 196.07 | 0.45 | 1.00 | 1.00 | 3.00 | 55.76 | 0.724 | 69 | Act | P | -1.3 |
| 1504209 | Diplosalsalate | Analgesic, antipyretic | 300.06 | 4.12 | 1.00 | 4.00 | 6.00 | 89.90 | 0.724 | 71 | Act | P | -1.5 |
| 1504118 | Difractaic acid | - | 374.14 | 3.37 | 2.00 | 3.00 | 6.00 | 102.29 | 0.719 | 79 | Act | P | -6.5 |
| 201448 | 4,4'-Dimethoxydalbergione | - | 284.10 | 2.88 | 0.00 | 3.00 | 5.00 | 52.60 | 0.710 | 96 | Act | P | -5.5 |
| 2300228 | Kainic acid | Glutamate receptor agonist, anthelmintic | 213.10 | -0.21 | 3.00 | 5.00 | 4.00 | 86.63 | 0.708 | 98 | Act | P | -1.9 |
| 1505673 | Troglitazone | Antidiabetic | 441.16 | 3.52 | 2.00 | 3.00 | 5.00 | 110.16 | 0.700 | 111 | Act | P | -7.2 |
| 330032 | Dicamba | Herbicide | 219.97 | 1.78 | 1.00 | 2.00 | 2.00 | 46.53 | 0.700 | 115 | Act | P | -0.9 |

* ID = Code of Spectrum Collection; (a) MW = Molecular weight; (b) LogP = Computed octanol/water partition coefficient; (c) nHBDon = Number of hydrogen bond donors; (d) nHBAc = Number of hydrogen bond acceptors; (e) nRotB = Number of rotatable bonds; (f) PSA = Polar surface area; (g) S_f = Fused score for the maximum of the similarity; (h) Act = Active against tyrosinase, identified by classification MCS QSAR; (i) P = Potent inhibitor of tyrosinase, identified by potency MCS QSAR; (k) BE = Binding energy (PDB ID: 3NQ1)
Fig. (3). Molecular structures of the identified virtual hits from Spectrum Collection by VS protocol.
Table 3. Molecular structures of virtual hits identified on Drugbank by the current VS protocol (structure drawings omitted; DrugBank IDs only)
DB00227, DB00486, DB00641, DB00769, DB00929, DB02205, DB02329, DB02699, DB02880, DB03007, DB03451, DB03785, DB04324, DB04376, DB04392, DB04599, DB04641, DB07036, DB07123, DB07177, DB07500, DB07567, DB07703, DB07734, DB07735, DB07883, DB07933, DB08020, DB08224, DB08442, DB08517, DB08737
Table 4. Results of different in silico filters of the virtual screening protocol on the DrugBank database

| ID* | Compound | Status | MW | LogP(b) | nHBDon(c) | nHBAc(d) | nRotB(e) | PSA(f) | Similarity fused score | Predicted class(h) | Estimated potency(i) |
|---|---|---|---|---|---|---|---|---|---|---|---|
| DB00227 | Lovastatin | approved, | | | | | | | | | |
| DB00486 | Nabilone | approved, investigational | 372.27 | 6.155 | 1 | 1 | 6 | 46.53 | 0.573 | Act | P |
| DB02205 | 6-(1.1-Dimethylallyl)-2-(1-Hydroxy-1-Methylethyl)-2.3-Dihydro-7h-Furo[3.2-G]Chromen-7-One | | | | | | | | | | |
| DB02880 | N-[1-(4-Bromophenyl)-Ethyl]-5-Fluoro Salicylamide | experimental | 337.01 | 2.77 | 2 | 2 | 4 | 49.33 | 0.551 | Act | P |
| DB03451 | 1alpha.25-Dihydroxyl-20-Epi-22-Oxa-24.26.27-Trihomovitamin D3 | experimental | 460.355 | | | | | | | | |
| DB03785 | (3r.5r)-7-((1r.2r.6s.8r.8as)-2.6-Dimethyl-8-{[(2r)-2-Methylbutanoyl]Oxy}-1.2.6.7.8.8a-Hexahydronaphthalen-1-Yl)-3.5-Dihydroxyheptanoic Acid | | | | | | | | | | |
| DB04392 | Bishydroxy[2h-1-Benzopyran-2-One.1.2-Benzopyrone] | experimental | 336.06 | 0.156 | 0 | 4 | 2 | 86.74 | 0.573 | Act | P |
| DB04641 | 3.7-Dihydroxynaphthalene-2-carboxylic acid | experimental | 204.04 | 0.914 | 3 | 2 | 1 | 77.76 | 0.659 | Act | P |
| DB07036 | (3aS.4R.9bR)-2.2-difluoro-4-(4-hydroxyphenyl)-6-(methoxymethyl)-1.2.3.3a.4.9b-hexahydrocyclopenta[c]chromen-8-ol | | | | | | | | | | |
| DB07123 | n-(4-methylbenzoyl)-4-benzylpiperidine | experimental | 293.18 | 4.138 | 0 | 2 | 4 | 20.31 | 0.556 | Act | P |
| DB07177 | (5e.13e)-11-hydroxy-9.15-dioxoprosta-5.13-dien-1-oic acid | | | | | | | | | | |
| DB07500 | (2E)-1-[2-hydroxy-4-methoxy-5-(3-methylbut-2-en-1-yl)phenyl]-3-(4-hydroxyphenyl)prop-2-en-1-one | | | | | | | | | | |
| DB07567 | (2r.3r.4s)-3-(4-hydroxyphenyl)-4-methyl-2-[4-(2-pyrrolidin-1-ylethoxy)phenyl]chroman-6-ol | | | | | | | | | | |
| DB07703 | (3r.4s.5s.7r.9e.11r.12r)-12-ethyl-4-hydroxy-3.5.7.11-tetramethyloxacyclododec-9-ene-2.8-dione | | | | | | | | | | |
| DB07734 | N-(1-benzylpiperidin-4-yl)-4-sulfanylbutanamide | experimental | 292.16 | 2.058 | 1 | 3 | 7 | 71.14 | 0.708 | Act | P |
| DB07735 | N-[1-(2.6-dimethoxybenzyl)piperidin-4-yl]-4-sulfanylbutanamide | | | | | | | | | | |