Applications of Substructure-Based SAR in Toxicology
HERBERT S ROSENKRANZ
Department of Biomedical Sciences,
Florida Atlantic University,
Boca Raton, Florida, U.S.A.
BHAVANI P THAMPATTY
Department of Environmental and Occupational Health,
Graduate School of Public Health,
University of Pittsburgh,
Pittsburgh, Pennsylvania, U.S.A.
1 INTRODUCTION
The increased acceptance of SAR techniques in the regulatory arena to predict health and ecological hazards (1–6) has resulted in the development and marketing of a number of SAR programs (7). The approaches are of optimal usefulness when they are employed as adjuncts to the appropriate
The authors have no commercial interest in any of the technologies described in this review.
12)] as opposed to satisfying predetermined rules [e.g., DEREK (13–15), ONCOLOGIC (16,17), “Structural Alerts” (18)]. It must, however, be made clear that human expertise is very much involved in most aspects of these knowledge-based substructural methods (8,9). Thus, the inclusion of experimental data into the “learning set” that forms the basis of any SAR model must adhere to previously agreed upon protocols and data handling procedures (Fig. 1). Moreover, prior to SAR modeling, the context in which the resulting model will be used has to be defined, as it will affect the manner in which the biological/toxicological activities are encoded and the derived SAR model interpreted.
Thus, it is commonly recognized (7,19) that the induction of cancers in rodents is one of the most challenging phenomena to model by SAR techniques. Yet, bearing in mind the complexity of the phenomenon and the regulatory context in which SAR predictions were to be used, Matthews and Contrera (20) of the U.S. Food and Drug Administration, by encoding the spectrum of activities (i.e., carcinogenicity in male and/or female rats and/or mice) and devising rules on how the predictions were to be used, were able to develop a highly predictive MULTICASE SAR model of rodent carcinogenicity. It needs to be stressed that the success in developing the model was primarily the result of the human insight brought by the investigators (20).

Figure 1 Outline of the SAR approach indicating the interactions with the human expert.
2 THE ROLE OF HUMAN EXPERTISE
Substructure-based SAR approaches can handle databases in which activities are expressed categorically (i.e., active, marginally active, inactive) or on a continuous scale. However, it is not always a matter of simply inserting data into the model. Thus, the database for the induction of unscheduled DNA synthesis is indeed categorical (21) and allows the derivation of a coherent SAR model (22). On the other hand, the Salmonella mutagenicity database generated under the aegis of the U.S. National Toxicology Program (23) requires insight into how to express activities with respect to SAR modeling. Essentially, in that data set, each chemical is reported with respect to its ability to induce mutations in five Salmonella typhimurium tester strains in the presence or in the absence of several postmitochondrial activation mixtures (S9) derived from rats, mice, and hamsters induced or uninduced with the polychlorinated biphenyl mixture Aroclor 1254 (24). Each of the tester strains has a different specificity with respect to its response to mutagens. Moreover, the exogenous S9 mixtures may contain different levels of cytochrome P450 activating and deactivating enzymes which may act on the test chemical and/or its metabolites. If the purpose for deriving a SAR model is to understand the basis of the mutagenicity of a class of chemicals, then the Salmonella strain that is the most responsive to that chemical class should be used [e.g., the mutagenicity of nitrated polycyclic aromatic
such instances, for SAR modeling, the human expert would select the specific mutagenic potency (e.g., revertants/nmole/plate) reported for each chemical for the specific strain with or without S9. Moreover, based upon personal knowledge of the system and the specific class of chemicals, the expert would then have to select a cut-off value between mutagens and marginal mutagens, and between marginal mutagens and non-mutagens. The expert would then be able to derive an equation relating mutagenic potency to an SAR unit scale compatible with the SAR program being used (see below).
If, on the other hand, the purpose of deriving a SAR model is to identify potential “genotoxic” (i.e., mutagenic) carcinogens, which is the class of agents associated with risk to humans (29–33), then one might consider deriving a dozen
TA 98, TA 1537, etc.) and then devise an algorithm to combine the results of the different models into a single prediction [see Refs. (34) and (35)]. This, however, is a tedious and time-consuming process. Moreover, “genotoxic” carcinogenicity has not been associated with either a response in a specific tester strain or with the mutagenic potency in that strain. Rather, the association is a qualitative one between mutagenicity in any of the strains and carcinogenicity in rodents (36). Accordingly, consideration can then be given to the paradigm that a response in any one of the tester strains in the absence or the presence of a single S9 preparation will be sufficient to identify a carcinogenic hazard. Moreover, since different tester strains may respond differently, qualitatively as well as quantitatively, to individual chemicals, the indications of potencies that are used cannot be continuous. In fact, they must be categorical and the expert may designate specific criteria for defining a mutagen, e.g., twice the spontaneous frequency of mutations and a linear dose–response (37,38).
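The categorical "any strain" paradigm described above can be illustrated with a minimal sketch. The data layout (`StrainResult`), the function names, and the use of a strictly rising dose series as a crude stand-in for a linear dose–response are all assumptions made for illustration; they are not taken from the cited protocols.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class StrainResult:
    """Revertant counts for one tester strain / S9 condition (hypothetical layout)."""
    spontaneous: float                 # revertants per plate in the solvent control
    doses: List[Tuple[float, float]]   # (dose, revertants per plate)


def is_positive(result: StrainResult, fold: float = 2.0) -> bool:
    """Positive when counts rise at every dose step (a crude proxy for a
    dose-response) and the highest count reaches `fold` times the control."""
    counts = [count for _, count in sorted(result.doses)]
    if len(counts) < 2:
        return False
    rising = all(later > earlier for earlier, later in zip(counts, counts[1:]))
    return rising and max(counts) >= fold * result.spontaneous


def call_mutagen(results: Dict[str, StrainResult]) -> str:
    """Categorical call: a response in any one strain/S9 condition suffices."""
    return "mutagen" if any(is_positive(r) for r in results.values()) else "non-mutagen"
```

With this encoding, a chemical that doubles the spontaneous count in a single strain is scored the same as one that responds in all five, which is the qualitative association the paradigm relies on.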
Depending upon an understanding of the mechanistic/biological basis of activity, there have been variations on the potency metrics. Thus, the Carcinogen Potency Data Base
that in a lifetime study will permit 50% of the treated animals
mg/kg/day (39–41). However, given the wide range in molecular weights of the chemicals in a data set (e.g., dimethylnitrosamine and benzo(a)pyrene, molecular weights 74 and 252 Da, respectively), for SAR studies that measure needs to be transformed into mmol/kg/day in order to yield a meaningful SAR model and the associated generation of “modulators” (see below) that affect the potency of the SAR projection.
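The mass-to-molar transformation is a simple division by molecular weight, since mg divided by g/mol gives mmol. The sketch below shows why it matters for the two chemicals named above; the 10 mg/kg/day dose is a hypothetical value chosen only for illustration.

```python
def mg_to_mmol_per_kg_day(dose_mg_per_kg_day: float, mol_weight_da: float) -> float:
    """Convert a dose from mg/kg/day to mmol/kg/day (mg / (g/mol) = mmol)."""
    return dose_mg_per_kg_day / mol_weight_da


# Molecular weights quoted in the text; the dose itself is hypothetical.
HYPOTHETICAL_DOSE_MG = 10.0
dmn = mg_to_mmol_per_kg_day(HYPOTHETICAL_DOSE_MG, 74.0)    # dimethylnitrosamine
bap = mg_to_mmol_per_kg_day(HYPOTHETICAL_DOSE_MG, 252.0)   # benzo(a)pyrene
```

At the same mass dose, the animal receives about 0.135 mmol/kg/day of dimethylnitrosamine but only about 0.040 mmol/kg/day of benzo(a)pyrene, a difference of roughly 3.4-fold in the number of molecules that a mass-based scale would hide.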
The human expert has to make a further decision: the definition of a “marginal carcinogen” and a “non-carcinogen.” Should only chemicals inducing no cancers even at the maximum tolerated dose (42–44) be considered non-carcinogens, or should there be a cut-off dose, above which even if tumors are induced, they would not be considered biologically or toxicologically significant given the high dose needed? This would reflect Paracelsus’ dictum “that it is the dose that makes the toxin” (45).
For the purpose of SAR modeling of CPDB, we chose cut-off values of 8 and 28 mmol/kg/day between carcinogens and marginal carcinogens, and between marginal carcinogens and non-carcinogens, respectively. Based upon the characteristics of the MULTICASE SAR methodology wherein SAR units
19 indicate non-carcinogenicity; 20–29 marginal
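The equation relating TD50 to SAR units [Eq. (1)] is referred to but not reproduced in this text. As a hedged sketch only, one can assume a log-linear scale anchored at the two stated cut-offs, with 8 mmol/kg/day mapped to 30 SAR units (assumed here to be the lower edge of the carcinogen band) and 28 mmol/kg/day mapped to 20 SAR units. The constants and function names below follow from that assumption and are not the authors' published coefficients.

```python
import math

# Cut-offs stated in the text (mmol/kg/day).
CUTOFF_CARCINOGEN = 8.0
CUTOFF_MARGINAL = 28.0
# Assumed anchor points on the SAR unit scale.
UNITS_AT_CARCINOGEN = 30.0   # assumption: carcinogens score 30 or more
UNITS_AT_MARGINAL = 20.0     # marginal band is 20-29 in the text

SLOPE = (UNITS_AT_CARCINOGEN - UNITS_AT_MARGINAL) / math.log10(
    CUTOFF_MARGINAL / CUTOFF_CARCINOGEN
)
INTERCEPT = UNITS_AT_CARCINOGEN + SLOPE * math.log10(CUTOFF_CARCINOGEN)


def sar_units(td50_mmol: float) -> float:
    """Assumed log-linear map: lower TD50 (more potent) gives more SAR units."""
    return INTERCEPT - SLOPE * math.log10(td50_mmol)


def classify(units: float) -> str:
    """Categorical call from SAR units using the bands described in the text."""
    if units >= UNITS_AT_CARCINOGEN:
        return "carcinogen"
    if units >= UNITS_AT_MARGINAL:
        return "marginal carcinogen"
    return "non-carcinogen"
```

Under these assumptions the sketch comes within a few tenths of a unit of the pairs quoted in the figure legends (0.62 mmol/kg/day with 50.3 SAR units, and 11.5 mmol/kg/day with 27.1 SAR units), which suggests, but does not prove, that the original equation has this form.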
On the other hand, the rodent carcinogenicity database generated under the auspices of the NTP has been classified
Because the spectrum of activities as well as the potencies reflect different aspects of the carcinogenic phenomenon, algorithms were developed to combine the results of the different SAR models of rodent carcinogenicity into a single prediction model (34,35). Although the approach used heretofore is a Bayesian one (47), there is no reason to suppose that other approaches (neural networks, genetic algorithms, rule learners) are not equally effective (e.g., see Refs. 48,49).
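To make the idea of a Bayesian combination concrete, the following is a minimal naive-Bayes sketch. It assumes that each model's sensitivity and specificity are known and that the models err independently of one another; both the independence assumption and the function name are illustrative and are not claimed to match the published algorithm.

```python
from typing import Iterable, Tuple


def combine_predictions(
    prior: float, evidence: Iterable[Tuple[bool, float, float]]
) -> float:
    """Posterior probability of activity from several model calls.

    `evidence` holds (predicted_active, sensitivity, specificity) per model.
    Each call multiplies the odds by its likelihood ratio (naive Bayes).
    """
    odds = prior / (1.0 - prior)
    for predicted_active, sensitivity, specificity in evidence:
        if predicted_active:
            odds *= sensitivity / (1.0 - specificity)
        else:
            odds *= (1.0 - sensitivity) / specificity
    return odds / (1.0 + odds)
```

Two concordant positive calls from models with 80% sensitivity and specificity raise a 50% prior to about 94%, whereas one positive and one negative call from such models cancel and leave the prior unchanged.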
Obviously, this integrative approach is not restricted to SAR models of rodent carcinogenicity. It could include projections obtained with other SAR models related to mechanisms of carcinogenicity, e.g., SAR projections of carcinogenicity combined with the prediction of the in vivo induction of micronuclei (50) and of inhibition of gap junctional intercellular communication (51). Finally, the same approach can be explored to combine SAR projections with the experimental results of surrogate tests for carcinogenicity (e.g., induction of chromosomal aberration and of mutations at the
results from different SAR approaches, e.g., knowledge-based (e.g., MULTICASE) with rule-based [e.g., DEREK (13–15) or ONCOLOGIC (16,17)] is a promising avenue that is worthy of further investigation.
The point of the above examples is that human familiarity with, and expertise in, the biological phenomenon under investigation is essential for the maximal utilization of SAR techniques.

Another instance in which human expertise was essential for the development of a coherent SAR model involves allergic contact dermatitis (ACD) in humans. In that endeavor, initial human insight was needed at several crucial steps:
assumption, human and guinea pig ACD data were not equivalent and could not be pooled to develop a coherent SAR model (52).
experimentally determined human ACD data degraded the performance of the SAR model unless the number of independent “case reports” was greater than 7 (53).
challenge dose, the extent of the response, and the proportion of responders among challenged humans had to be developed to provide a potency scale (54).

When these pre-SAR experimental data handling procedures were resolved, a coherent and highly predictive SAR model of human ACD was developed (54). But again, it required the participation and collaboration of experimental immunologists and SAR experts.
The same considerations entered in developing other models, e.g., human developmental toxicity, which depended upon: (1) the acceptance of the results of an expert consensus panel, and (2) the rejection of results of borderline significance (55). Of course, it was also the reason for the success of the development of the aforementioned highly predictive SAR model of rodent carcinogenicity by Matthews and Contrera (20).
most widely accepted measure of a model’s performance is the concordance between experimentally determined results and SAR-derived predictions of chemicals external to the model. This parameter, in turn, is a function of a model’s sensitivity (correctly predicted actives/total actives) and specificity (correctly predicted inactives/total inactives). The most direct and preferable approach to determine these parameters is to randomly remove from the learning set a number of chemicals to be used as the “tester set.” The remaining chemicals can be used to develop the SAR model. The resulting model’s predictivity parameters and their statistical significance can then be determined by challenge with this external “tester set.”
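The three predictivity parameters defined above can be computed directly from paired experimental and predicted calls for the tester set. The function name and the boolean encoding (True for active) are illustrative choices; the sketch assumes the tester set contains at least one active and one inactive chemical.

```python
from typing import Dict, Sequence


def predictivity(experimental: Sequence[bool], predicted: Sequence[bool]) -> Dict[str, float]:
    """Sensitivity, specificity and concordance for a tester set (True = active)."""
    pairs = list(zip(experimental, predicted))
    true_pos = sum(1 for exp, pred in pairs if exp and pred)
    false_neg = sum(1 for exp, pred in pairs if exp and not pred)
    true_neg = sum(1 for exp, pred in pairs if not exp and not pred)
    false_pos = sum(1 for exp, pred in pairs if not exp and pred)
    return {
        # correctly predicted actives / total actives
        "sensitivity": true_pos / (true_pos + false_neg),
        # correctly predicted inactives / total inactives
        "specificity": true_neg / (true_neg + false_pos),
        # all correct predictions / all chemicals
        "concordance": (true_pos + true_neg) / len(pairs),
    }
```

Because concordance pools actives and inactives, it depends on the ratio of the two classes in the tester set as well as on sensitivity and specificity, which is why all three values are normally reported together.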
However, most frequently that approach cannot be taken with respect to SAR models describing toxicological phenomena. This derives from the fact that the performance of a SAR model depends upon its size (i.e., the number of chemicals in the database) (10,56–58). For most databases of toxicological phenomena, there is a paucity of experimental results for chemicals. Accordingly, the predictive performance of the model will be negatively affected by removal of chemicals to be used as the external “tester set.” Because of this consideration, cross-validation and “leave-one-out” approaches have been used (59). Thus, it has been demonstrated that iteratively removing a random subset of chemicals (e.g., 5% of the total), using the remaining ones (i.e., 95%) as the learning set, repeating the process (e.g., 20 times for a 5% removal), and determining the cumulative predictivity parameters is an acceptable approach (59).

In most substructure-based SAR approaches, the significant structural determinant (e.g., biophores and toxicophores) identified will be a substructure enriched among active chemicals. Accordingly, the presence of the toxicophore is associated
Fig. 2)
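The iterative removal scheme for estimating predictivity (repeatedly withholding a random 5% of the chemicals, rebuilding the model on the remaining 95%, and accumulating the results) can be sketched as follows. The `fit` callable stands in for whatever SAR program builds the model; its interface, like the other names here, is a hypothetical one chosen for illustration.

```python
import random
from typing import Callable, Dict, List, Optional, Tuple

Chemical = Tuple[str, bool]          # (structure identifier, experimentally active?)
Predictor = Callable[[str], bool]    # returns True when the model predicts "active"


def iterative_holdout(
    data: List[Chemical],
    fit: Callable[[List[Chemical]], Predictor],
    fraction: float = 0.05,
    rounds: int = 20,
    seed: int = 0,
) -> Dict[str, Optional[float]]:
    """Accumulate predictions over repeated random removals of a tester set."""
    rng = random.Random(seed)
    n_test = max(1, round(len(data) * fraction))
    tp = fp = tn = fn = 0
    for _ in range(rounds):
        shuffled = list(data)
        rng.shuffle(shuffled)
        tester, learning = shuffled[:n_test], shuffled[n_test:]
        predict = fit(learning)              # model never sees the tester chemicals
        for structure, active in tester:
            guess = predict(structure)
            if active and guess:
                tp += 1
            elif active:
                fn += 1
            elif guess:
                fp += 1
            else:
                tn += 1
    total = tp + fp + tn + fn
    return {
        "sensitivity": tp / (tp + fn) if tp + fn else None,
        "specificity": tn / (tn + fp) if tn + fp else None,
        "concordance": (tp + tn) / total,
        "n_predictions": float(total),
    }
```

Because each round draws its tester set independently, a chemical may be withheld more than once; a strict partition into 20 disjoint folds would be an equally reasonable reading of the procedure.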
While biophores/toxicophores are the significant as well as the principal determinants of biological and toxicological activity, toxicologists as well as health risk assessors are well aware that not all chemicals in a certain chemical class are toxicants even though the majority may be. Thus, only 83.3% of nitroarenes tested are Salmonella mutagens and only 74.4% of chloroarenes tested are reported to be rodent carcinogens (60). This situation is reflected in the fact that
c–cH= (Fig. 2) are rodent carcinogens. The question then arises whether SAR approaches can be used to explain this dichotomy as well as to provide a basis for the difference in projected potencies. In MULTICASE SAR, this discrimination is provided by modulators (10–12). Thus each biophore/toxicophore is associated with a probability of activity and a basal potency. For the illustration in Fig. 2, the presence of the toxicophore is associated with a 75% probability of carcinogenicity and a potency of 50.3 SAR units. Based
of 0.62 mmol/kg/day. In MULTICASE, each biophore/toxicophore may be associated with a group of modulators (Table 2) which determine whether the potential for activity is realized and, if so, to what extent. Modulators are primarily
(Fig. 4), or abolish (Fig. 5) the potential potency associated with a toxicophore. Additionally, the potential of a toxicophore can be negated by the presence in the molecule of deactivating moieties that are derived from inactive molecules in the data set. The latter are not associated with chemicals that
In addition to being substructural elements, modulators may also be physical chemical or quantum chemical in nature. Thus, the rat-specific carcinogenic toxicophore associated with the activity of the chloroaniline derivative shown in Fig. 7, which defines a non-genotoxic rat carcinogenic species,
Toxicophore no. 1 is shown in Figs. 1 – , 18, and 19; no. 17 in Fig. 18.
“c” and “C” refer to aromatic and acyclic atoms, respectively; c indicates a carbon atom shared by two rings; O^ indicates an epoxide; c″ indicates a carbon atom connected by a double bond to another atom; ⟨3–Cl⟩ indicates a chlorine atom substituted on the third non-hydro-
In toxicophore no. 18, the second carbon from the left is shown as unsubstituted. This means that it can be substituted with any atom except hydrogen. On the other hand, for this toxicophore, the last carbon on the right is shown with an attached hydrogen. This means it cannot be substituted by any other atom but hydrogen. Finally, in toxicophore no. 10, the third non-hydrogen atom from the left is shown as unsubstituted. It can only be substituted by a chlorine atom.
is modulated by 9 (water solubility of the chemical). In
greater the lipophilicity (i.e., the lower the water solubility) of a chemical containing that toxicophore, the greater its carcinogenic potency. Mechanistically, this may reflect that lipophilicity increases residence time in body tissues (e.g., storage in adipose tissues) and thus augments the effective dose.
An understanding of the nature of the toxicophores and associated modulators can provide insight regarding the mechanistic basis of the toxicity (see below). This knowledge can also be used to modify the chemical’s structure in order to decrease or abolish the unwanted toxic effects inherent in a beneficial molecule without affecting the latter (also see below).

Figure 2 Prediction of the carcinogenicity in rodents of m-cresidine. The presence of toxicophore A is associated with a 75% probability of carcinogenicity and a basal potency of 50.3 SAR units, which corresponds to a TD50 value of 0.62 mmol/kg/day [see Eq. (1)].
In addition to identifying toxicophores, MULTICASE also has the capability of identifying substructures that, although not statistically significant, may be indicative of bio-
be scrutinized by the human expert to determine whether they are relevant to a carcinogenic potential. Such an examination should include a search of databases to determine whether other chemicals containing that substructure are endowed with that or related potentials. An in-depth study of these “unique” structures is especially appropriate if they are derived from chemicals possessing great potency, e.g.,
derives from the fact that the former deal primarily with congeneric chemicals while the latter are concerned with non-congeneric ones. This is reflected by the fact that in medicinal chemistry one is most frequently dealing with a specific receptor or the active site of an enzyme (9). On the other hand, with respect to toxicological phenomena, the same endpoint can arise as a result of a multitude of pathways and can be caused by many different classes of chemicals (e.g., carcinogenesis, developmental toxicity). Given that SAR methods used in toxicology must be able to handle many different chemical classes within a single database, it is essential that the method also be able to identify chemical structures that do not fall within the domain shared by chemicals that give rise to a common toxicophore. MULTICASE accomplishes this in two ways: (a) by identifying differences in the molecular environment, and (b) by recognizing (“unknown”) structures that are not present in the learning set under investigation.

Figure 4 The projected marginal potency of 2,6-dichloro-p-phenylenediamine. The carcinogenic potency inherent in toxicophore A is greatly decreased by modulator B. A carcinogenic potency of 27.1 SAR units corresponds to a TD50 value of 11.5 mmol/kg/day. That potency is defined as “marginal” (see text).

Figure 5 The prediction of the lack of carcinogenicity of 2,2′,5,5′-tetrachlorobenzidine. Although the presence of toxicophore A endows the molecule with carcinogenic potential, the presence of the inactivating modulators B and C abolishes it.

The presence of “unknown” moieties may be recognized in molecules that contain recognized toxicophores. In that situation, they have the potential of being modulators which either augment or decrease the potential toxicity. Hence, the presence of such a moiety might introduce an element of uncertainty in the prediction. However, overall, that type of uncertainty is taken into consideration when determining the predictive performance of the model, especially when a cross-validation approach is used.

Figure 6 The projected lack of carcinogenicity of anthranilic acid. The carcinogenic potential associated with toxicophore A is negated by a deactivating moiety D derived from non-carcinogens external to the molecules associated with the toxicophore.
Figure 7 Predicted carcinogenicity in rats of 3-(1,1,1-trichloro)propyl-p-chloroaniline. The prediction is based on the toxicophore shown in bold. The potency is modulated by (9 [water solubility]). The potency of 63.1 units corresponds to a TD50 value of 0.12 mmol/kg/day. The analogous 3-propyl-p-chloroaniline has a water solubility of 4.18 (i.e., it is less lipophilic) and this results in a contribution of 37.4 for a projected potency of 49.5 SAR units or a TD50 value of 0.54 mmol/kg/day, i.e., the decreased lipophilicity results in decreased carcinogenic potency.
On the other hand, chemicals may be devoid of identifiable toxicophores and still possess an “unknown” moiety (Fig. 9). In that situation, the unknown could possibly be a toxicophore that might endow the molecule with toxicological potential. When faced with such a situation, it is advisable to conduct a search for molecules external to the data set that contain such a moiety and are also devoid of toxicophores to determine whether they have been tested in the same or a related assay system. Thus, for example, the chemical may not have been tested for mutagenicity in Salmonella, but it might have been tested for its ability to induce mutation in E. coli WP2 uvrA or error-prone DNA repair (37,38,61). Methods for determining the relatedness of such assays have been described (47,62). With respect to the molecule shown in Fig. 9, it has been reported that carcinogenic arylamine derivatives, when substituted with sulfonates, show decreased intestinal absorption and hence abolished carcinogenicity (63–66), thus decreasing the level of concern that the substance in Fig. 9 is a carcinogen.

Figure 8 The identification of a moiety in 2,4-difluoro-N-methylaniline that is present once in the data set. However, the molecule containing it (tetrafluoro-m-phenylenediamine) is a carcinogen with a TD50 value of 0.50 mmol/kg/day. Accordingly, this N-methylaniline derivative must be examined further.

The identification of differences in the molecular environment is a more subtle exercise. It might derive from the presence of a toxicophore and a warning by the program that
ascertain the appropriateness of that determination requires the SAR system to be able to provide documentation, i.e., the nature of the chemicals that give rise to the toxicophore. SAR systems that cannot provide that information are at a disadvantage. Thus “human” examination of the difference in environments between the test chemical described in Fig. 10 and the chemicals that gave rise to that toxicophore indicates
the program’s determination (Fig. 10) is warranted.

Figure 9 Prediction of the lack of carcinogenicity of 1,5-naphthalenedisulfonic acid. However, the prediction has an element of uncertainty because of the presence of the moieties “unknown” to the model. It is known, however, that in other instances the sulfonate moiety facilitates excretion and thereby inhibits carcinogenicity. (From Refs. 63 to 66.)
On the other hand, the determination of differences in
test chemical, 18-Crown-6 ether, can be biotransformed to an acyclic structure that bears similarities to the structures
instance, the “human” expert overrules the SAR program’s analysis and confirms the mutagenic potential of that chemical.

Figure 10 The prediction of the inability of 18-Crown ether-6 to induce sister chromatid exchanges in vitro. The structure of 18-crown ether-6 is shown in Fig. 11.

Figure 11 Structures which are the origin of the toxicophore associated with the induction of sister chromatid exchanges (see Fig. 10). The four structures are clearly different from 18-crown-6.
Finally, even when the SAR program does not recognize differences in environments, the “human” expert may do so
nephropathy (67) by virtue of the presence of a toxicophore (Fig. 14), which is present in six molecules of the data set,
nephropathy. The SAR program does not detect a difference in environment (Fig. 14). Yet, a comparison of the molecules in the learning set with curcumin indicates that the molecular
of experimental data regarding the induction of this nephropathy by curcumin or structurally related molecules, the prediction (Fig. 14) is overruled. This illustrates the need to examine the basis of all SAR predictions.

Figure 12 The prediction of the potential of 18-crown ether-6 to induce mutations at the tk+/− locus of mouse lymphoma cells. The structure of 18-crown ether-6 as well as of the seven molecules that gave rise to the toxicophore are shown in Fig. 13. For an explanation of the structure of the toxicophore, see the legend in Table 1.
As an additional example, we might examine the
Epitholone A, an inhibitor of tubulin polymerization, is a promising cancer chemotherapeutic adjunct that may have the
become resistant to Taxol (68,69).
However, examination of the basis of the prediction of carcinogenicity (Fig. 16) indicates that the molecules in the learning set containing that toxicophore all contain other moi-
associated with carcinogenicity. Epitholone A does not contain any of them. Thus, in this instance, the toxicophore, albeit statistically significant, is in fact an artifact. Based upon these analyses, the “human expert” would agree with the SAR model-generated prediction which is accompanied by a warning regarding the “environment.” Obviously, in the above examples, the human expertise can only be maximally effective if the SAR method provides the necessary documentation.

Figure 13 Structures of molecules that are at the origin of the toxicophore associated with the potential to induce mutations at the tk+/− locus of mouse lymphoma cells.

As mentioned previously, the predictive performance of an SAR model is dependent upon the size and chemical diversity of the chemicals in the learning set (56–58). It follows that the number of predictions accompanied by “warnings” of the presence of “unknown” moieties will be a function of the size of the learning set (57,58). This relationship can be expressed as the informational content of an SAR model. It is defined as 100 − Percent of Predictions Accompanied by “Warnings.” In practice, this value is determined by challenging a SAR model with 10,000 chemicals representing the “universe of chemicals” and determining the number of predictions accompanied by such warnings (58). This also identifies the prevalence in the “universe” of moieties absent from the model and suggests that experimental data on such chemicals be identified and the data included in a future model.
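Reading the definition of informational content as 100 minus the percentage of predictions accompanied by warnings (an interpretation, since the operator is not legible in this text), the calculation reduces to a single expression. The function name and the example count of warnings are hypothetical.

```python
def informational_content(n_predictions: int, n_with_warnings: int) -> float:
    """Informational content (%), read here as 100 minus the percentage of
    predictions that carry an 'unknown moiety' warning (an interpretation)."""
    if n_predictions <= 0:
        raise ValueError("at least one prediction is required")
    return 100.0 - 100.0 * n_with_warnings / n_predictions


# Hypothetical example: 1,500 warnings among the 10,000 challenge chemicals.
example = informational_content(10_000, 1_500)
```

Under this reading, a model that issued warnings for 1,500 of the 10,000 challenge chemicals would have an informational content of 85%, and the value approaches 100% as the learning set comes to cover more of the moieties present in the "universe of chemicals".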
Since SAR programs in use in toxicology may consist of prepackaged programs and include specific SAR models, there is a tendency among some users not to evaluate further either the SAR paradigm resident therein or the predictive performance of the resultant SAR model. This may negate the usefulness of the methodology, its applicability to a specific situation, and its regulatory acceptance (6). Thus, the predictive performance of a model [i.e., concordance between experimental and predicted results; sensitivity and specificity (determined as previously described)] must be known not only in order to make individual predictions, but also when applying the projections to hazard identification or when devising rational combinations of SAR models, or of a SAR model coupled with certain experimental assays, so as to make the exercise meaningful.

Figure 14 An example of a prediction subsequently overruled. The SAR model predicts that curcumin induces α2μ-globulin-associated nephropathy in male rats. However, a comparison of the structure of curcumin with the structures of the six chemicals at the origin of the toxicophore (see Fig. 15) indicates that they differ significantly. In this instance, the human expert overruled the model’s prediction.

Figure 15 Comparison of curcumin with the structures of chemicals that contain the same toxicophore (see Fig. 14). The toxicophore is shown in bold. A: curcumin; B: 3,5,5-trimethylhexanoic acid (TMHA); C: γ-lactone of TMHA; D: 3,5,5-trimethylcyclohexanone; E: methylisobutylketone; F: isophorone; G: isobutyl ketone. Chemicals B–G have been determined experimentally to induce α2μ-globulin-mediated nephropathy.
Moreover, in order to allow for maximal human input in the analyses, it is not sufficient to receive a message that the test molecule’s structure or domain is not fully covered by the model. Even if the program indicates that the test molecule falls within the domain, this may need verification. Accordingly, the human expert must know the nature of the chemicals in
Figure 16 Prediction of the carcinogenicity in mice of epitholone A. The structure of epitholone A (toxicophore shown in bold) is given
These considerations suggest that for optimal applicability, SAR methods may be most useful if they are used to evaluate one chemical at a time rather than by submitting batches of chemicals. This approach is reinforced when mechanisms of activity (see below) are also considered. The only time batchwise SAR analyses may be warranted is for priority setting but not for regulatory action (70,71).

Figure 17 Structures of epitholone A and of chemicals which contain the toxicophore. The toxicophore (see Fig. 16) is shown in bold.
In addition to being influenced by the number and nature of the chemicals in the learning set, the predictivity of an SAR model is also affected by the ratio of active to inactive molecules in the learning set. Generally, a ratio of unity is optimal (10,72,73). Transparency of the SAR paradigm and knowledge of the default assumptions may provide guidance on the optimal database. Yet, it must be realized that the experimental data used to obtain SAR models, in most instances, were not generated with SAR modeling in mind. Accordingly, most databases may not be fully optimal for SAR model development. On the other hand, knowledge of the predictivity parameters of even less than perfect models makes their deployment for SAR analyses feasible. It is also of interest to note that some SAR methods may tolerate significant ambiguity in the experimental results used for model building and still be useful for a purpose such as high throughput screening (74).
4 CONGENERIC VS NON-CONGENERIC DATA SETS
One of the strengths of the currently available substructure-based SAR approaches is their ability to handle non-congeneric databases (i.e., databases containing a mixture of classes). That is quite appropriate to the modeling of toxicological phenomena. Thus, a phenomenon such as carcinogenesis can be induced by many different chemicals (e.g., nitrosamines and polycyclic aromatic hydrocarbons) and can proceed by a variety of different individual or sequential pathways. On the other hand, this diversity in causative agents as well as the multiplicity of mechanisms may “dilute” the learning set and result in SAR models of lower predictivity. So one might consider using these substructure-based approaches and applying them to congeneric data sets and possibly improve the predictive performance and refine the structural information to better elucidate mechanisms. This naturally
as substrate for further analysis (see also Ref. 75). CASE, for example, did not choose simply the aromatic amine
toxicophore is associated with a probability of activity and a basal potency, i.e., 75% and 50.3 SAR units (Fig. 2). Following the identification of the toxicophore, the program identifies modulators which augment, decrease, or abolish the activity associated
An alternate approach would be to select the subset of chemicals containing the specific toxicophore and use it to initiate a fresh round of SAR model building. However, before investigating this approach, let us consider the possible advantage of the normative approach of using non-congeneric learning sets. Let us examine, for example, the aromatic amines illustrated earlier. Thus, some chemicals in addition
second toxicophore derived from non-arylamine-containing
toxicophore is in fact the one responsible for the carcinogenic spectrum. Thus, most arylamine carcinogens induce cancers in multiple species and multiple tissues. This property, in addition to their genotoxicity, makes them suspect as human carcinogens (29). However, some arylamines have a much more restricted spectrum of carcinogenicity, i.e., a single tissue of a single gender of a single species (29). This makes them much less likely to be a potential risk to humans (29–31). This more restricted activity may be related to the second toxicophore (which, in fact, may be derived from such non-genotoxic single-species rodent carcinogens). That type of information will not be available when the learning set is
Figure 18 The prediction of the carcinogenicity of 4-toluidine. In addition to toxicophore A, this molecule contains toxicophore B, which is derived from five non-arylamine carcinogens. Based upon toxicophore B, the potency is 49.1 SAR units or a TD50 value of 0.73 mmol/kg/day; i.e., the potency based upon the second toxicophore is lower. On the other hand, the probability of carcinogenicity has been increased due to the presence of the two toxicophores.
used, the resulting model predicted p-aminobenzoic acid (pABA) to be a carcinogen (Fig. 19). In all probability, this
Figure 19 The projected “carcinogenicity” of p-aminobenzoic acid based on the non-congeneric SAR model. This physiological chemical is unlikely to be a carcinogen. A projection based upon the congeneric SAR model predicts this chemical to be non-carcinogenic (see text).
physiological vitamin component is unlikely to possess this attribute.
In order to determine whether the use of congeneric chemicals improves the performance of the resulting SAR model, we selected the 65 chemicals identified by MULTICASE (Table 1, toxicophore no. 1; the chemicals are listed in Table 6.4
carcinogens, 3 marginal carcinogens, and 15 non-carcinogens) and used them as the learning set for a further MULTICASE model. It is to be noted that since all of the chemicals contain
amine moiety would be the only major toxicophore. This
the carcinogenicity of benzidine that is based upon the
toxicophore as responsible for a 75% probability of carcinogenicity; that model also used a modulator to increase the projected potency to 67.9 SAR units. On the other hand, the prediction based upon the congeneric model identified a
greater probability of carcinogenicity (i.e., 91% vs 75% for the non-congeneric model). That is due to the fact that the toxicophore is derived from a population enriched with carcinogens. It is also interesting to note that the potency associated with this toxicophore (i.e., 67.1 SAR units) is close to that found with the non-congeneric model (67.9 SAR units, Fig. 3). The latter, however, depended upon the contribution of a modulator. Furthermore, the toxicophore derived from the congeneric model (Fig. 20) is in fact identical to the modulator associated with the prediction of benzidine based on the non-congeneric model (Fig. 3). This is not entirely unexpected given the MULTICASE paradigm. However, this does not apply to the other toxicophores associated with the congeneric model.
Interestingly, with this new SAR model pABA was predicted to be a non-carcinogen, i.e., none of the fragments derived from that molecule was a toxicophore. Moreover, there were no warnings of the presence of unrecognized moieties
Toxicophore 1–2–3–4–5–6–7–8–9–10 Fragments Inactives Marginals Actives Number
Toxicophore no. 1 is shown in Fig. 20 and no. 3 is shown in Fig. 21.
For an explanation of the significance of the structural moieties, see legend to Table 1.
situation with the non-congeneric model (Fig. 6), the program did not recognize deactivating moieties external to the data set. However, based upon the modulators associated with the new toxicophore, the molecule was predicted to be
Figure 20 Prediction of the carcinogenicity in rodents of benzidine based upon an SAR model of congeneric arylamines. The toxicophore is shown in bold. A potency of 67.1 SAR units corresponds to a TD50 value of 0.08 mmol/kg/day. The probability of carcinogenicity (i.e., 91%) is greater than the prediction (75%) obtained with the non-congeneric model (Fig. 3).
An in-depth SAR analysis of the carcinogenicity of arylamines optimally should include both types of SAR models, i.e., congeneric and non-congeneric. The latter may reveal alternate mechanisms of carcinogenesis, while the congeneric
greater statistical significance than the modulators associated
refinement in the understanding of the structural basis of activity. Moreover, by predicting pABA as inactive, it provides reassurance regarding the predictivity of the congeneric model. Finally, it is possible to combine the outputs of the two models into a single prediction (62).
5 COMPLEXITY OF TOXICOLOGICAL PHENOMENA AND LIMITATIONS OF THE SAR APPROACH
A single toxicological phenomenon may often occur as a result of a series of independent and/or sequential events. In essence, this may have the net effect of having to model a series of separate phenomena using a single database. Thus, carcinogenicity may arise as a result of a somatic mutation induced by an electrophile; mitogenesis secondary to a toxic insult; tumor promotion by a variety of agents, some of which are receptor-mediated; and a variety of other mechanisms that are homeostatic or genetic in nature. When results obtained with agents that induce cancers by these various mechanisms are pooled into a single database, as is the practice, the question arises whether the complexity of the phenomenon may
Figure 21 Prediction of the lack of carcinogenicity of anthranilic acid based upon a congeneric SAR model. The 81% probability of activity and the 37 SAR units of basal potency are not realized due to the presence of inactivating modulators, one of which (B) is shown. These include an inactivating contribution due to the octanol:water partition coefficient. Thus, the probability is reduced to 0% and the potency to 11.7 SAR units (equivalent to 80 mmol/kg/day), which is considered non-carcinogenic.
chemicals representing each contributing mechanism. However, a single cancer bioassay performed by currently accepted protocols may cost $4 million and require 3 years to complete. Moreover, societal concerns regarding the welfare of animals would not permit such a use of animal resources, and certainly not to improve the predictivity of SAR models. Thus, there is a need to explore other approaches to understand the limits of SAR models. There are, in fact, other approaches to determine whether a toxicological phenomenon is at the limit of the informational content of an SAR method’s resolution. One can mix, for example, databases describing rodent carcinogenicity and the induction of sensory irritation in mice, develop a single SAR model from the combined data set, and challenge it with external tester sets of irritants/non-irritants to determine the ability of the combined model to discriminate between these phenomena (76).
Using such an approach with respect to MULTICASE, it was demonstrated that there was sufficient reserve within the method and the currently available databases to model fairly complex phenomena (e.g., mutagenicity, allergic contact dermatitis). In fact, the system has the capacity to model phenomena twice (but not thrice) as complex as those currently modeled (76). Thus, it would be feasible, when investigating a toxicological phenomenon, to perform a similar exercise provided the SAR methodology allows the operator the option to input databases.
Of course, as mentioned earlier, there are other approaches to improve the predictive performance of SAR models, e.g., by a thorough calibration of the input data, such as was done by Matthews and Contrera (20), by combination of different SAR models describing different facets of a phenomenon (e.g., SAR models of rodent carcinogenicity, of unscheduled DNA synthesis and of the induction of chromosomal aberration), by combining SAR models that describe the same phenomena but use different approaches [e.g., ONCOLOGIC (16,17) and MULTICASE] or by combining the projection of SAR models with experimental results obtained with surrogate tests (e.g., a SAR model of carcinogenicity and the results of tests for the in vivo induction of micronuclei). There are a number of protocols for combining such results: rule makers (48,49), neural networks, genetic algorithms, and Bayesian approaches. We have obtained good results with the latter (34,35,47).
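The Bayesian route to combining predictions can be illustrated with a minimal sketch. It is not the procedure of Refs. 34, 35, and 47, whose details are not given here; it simply applies Bayes' theorem in odds form and assumes that the individual predictions are conditionally independent given the true activity of the chemical. The names `Evidence` and `combine`, the prior, and all sensitivity and specificity values are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Evidence:
    """One call on a chemical (SAR model or surrogate test) with assumed validation statistics."""
    name: str
    positive: bool       # True if the model or test called the chemical active
    sensitivity: float   # P(positive call | chemical is truly active)
    specificity: float   # P(negative call | chemical is truly inactive)

    def likelihood_ratio(self) -> float:
        """Factor by which this call multiplies the odds of activity."""
        if self.positive:
            return self.sensitivity / (1.0 - self.specificity)
        return (1.0 - self.sensitivity) / self.specificity


def combine(prior: float, evidence: list[Evidence]) -> float:
    """Posterior probability of activity, treating the calls as conditionally independent."""
    odds = prior / (1.0 - prior)
    for item in evidence:
        odds *= item.likelihood_ratio()
    return odds / (1.0 + odds)


if __name__ == "__main__":
    # Hypothetical calls; the statistics below are illustrative, not measured values.
    calls = [
        Evidence("SAR model of rodent carcinogenicity", True, 0.70, 0.75),
        Evidence("SAR model of chromosomal aberrations", True, 0.65, 0.70),
        Evidence("in vivo micronucleus test", False, 0.60, 0.80),
    ]
    posterior = combine(prior=0.5, evidence=calls)
    print(f"Posterior probability of carcinogenicity: {posterior:.2f}")
```

The odds form makes the contribution of each model or test explicit as a single likelihood ratio, so a concordant positive call raises the combined probability and a discordant negative call lowers it. If the predictors share learning-set chemicals or mechanisms, the independence assumption overstates the combined confidence.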
6 MECHANISTIC INSIGHT FROM SAR
phenomena are indications of the extent of mechanistic
Thus, there is extensive overlap between the toxicophores associated with the in vivo induction of sister chromatid exchange (MoSCE) and carcinogenicity in rodents and mutations in Salmonella (SalmM) and no overlap with inhibition of gap junctional intercellular communication (iGJIC) (Table 4). This can be taken to indicate that the basis of the induction of MoSCE is a genotoxic event (related to SalmM) and that this, in turn, is related to carcinogenesis. On the other hand, there is no significant toxicophore overlap between MoSCE and iGJIC, the latter being an “epigenetic” (i.e., non-genotoxic) phenomenon par excellence (77) (Table 4). There is, however, also some overlap between MoSCE and cell toxicity. This suggests that MoSCE can also occur, albeit