Structure-activity or (Q)SAR approaches for toxicological risk assessment during drug development
Véronique Thybaud (1), Alexander Amberg (2)
Sanofi, Drug Disposition, Preclinical Safety and Animal
(1) Vitry-sur-Seine, France (2) Frankfurt, Germany
Acknowledgements
Sanofi, DSAR, Frankfurt, Germany
- Preclinical Safety: Andreas Czich
- In silico toxicology team: Alexander Amberg (co-author), Annette Mann, Melanie Saalfrank, Friedemann Schmidt, Uta Schrotberger, Hans-Peter Spirkl, Manuela Stolte, Salim Arslan (University of Applied Sciences, Giessen, Germany)
Presentation Outline
- Principles of (Quantitative) Structure Activity Relationship and in silico / computational prediction of toxicity
- Application of in silico Toxicology in drug
- Use of in silico analysis for the evaluation of genotoxic impurities in drug
- Conclusion and Perspectives
In Silico / Computational Toxicology and (Quantitative) Structure Activity Relationship
In silico / computational toxicology
- Prediction of toxicity with the help of a computer (in silico: on computer)
(Q)SAR or (Quantitative) Structure Activity Relationship
- Based on the assumption that molecules with similar chemical structures have similar properties, e.g. toxicity, physicochemical properties, pharmacological activity
- Molecular (sub)structures can be translated into chemical descriptors to be used for computational toxicology
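The similarity assumption above can be made concrete with a small sketch. It assumes each compound has already been reduced to a set of fragment labels (the labels below are hypothetical and chosen only for illustration) and scores two compounds with the Tanimoto coefficient, a common similarity measure that the slides themselves do not name.

```python
from __future__ import annotations


def tanimoto(a: set[str], b: set[str]) -> float:
    """Tanimoto (Jaccard) similarity of two fragment sets: |A & B| / |A | B|."""
    if not a and not b:
        return 1.0  # convention chosen here: two empty sets count as identical
    return len(a & b) / len(a | b)


# Hypothetical fragment annotations, for illustration only
query = {"aromatic ring", "azo", "primary aromatic amine"}
neighbour = {"aromatic ring", "azo"}
unrelated = {"carboxylic acid", "alkyl chain"}

print(tanimoto(query, neighbour))  # shares 2 of 3 distinct fragments
print(tanimoto(query, unrelated))  # shares none
```

Under the (Q)SAR assumption, a compound with a high score against a well-characterised neighbour would be expected to behave similarly; how high is "high enough" is a modelling choice, not something this sketch settles.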
Examples of chemical/molecular descriptors (illustrated with diazepam, C16H13ClN2O)
- 0D descriptors: atom and bond counts
- 1D descriptors: physico-chemical properties, experimental measurements (MW, logP, pKa, ionization)
- 2D descriptors: fragment descriptors, distance matrix
- 3D descriptors: steric, volume, surface area, quantum chemical descriptors (HOMO, LUMO), molecular topology
- 4D descriptors: pharmaco-/toxicophore docking models, molecular interaction fields
[Figure: chemical structure of diazepam and example structural fragments]
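As an illustration of the simplest descriptors in this list, the sketch below derives atom counts (0D) and a molecular weight from the diazepam formula given on the slide. The parser is a minimal assumption-laden helper (plain formulas only, no brackets or charges), and the atomic weights are rounded standard values.

```python
import re
from collections import Counter

# Rounded standard atomic weights (g/mol); only the elements needed here
ATOMIC_WEIGHT = {"C": 12.011, "H": 1.008, "Cl": 35.45, "N": 14.007, "O": 15.999}


def atom_counts(formula: str) -> Counter:
    """Count atoms in a simple formula such as 'C16H13ClN2O'."""
    counts = Counter()
    for symbol, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[symbol] += int(number) if number else 1
    return counts


def molecular_weight(formula: str) -> float:
    """Sum of atomic weights; raises KeyError for elements not in the table."""
    return sum(ATOMIC_WEIGHT[el] * n for el, n in atom_counts(formula).items())


diazepam = "C16H13ClN2O"
print(dict(atom_counts(diazepam)))
print(round(molecular_weight(diazepam), 2))
```

Higher-dimensional descriptors (fragments, 3D shape, interaction fields) need a cheminformatics toolkit and a structure representation, which is beyond this sketch.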
Tools for (Q)SAR and in silico / Computational Toxicology
Toxicity databases (also used as training datasets)
- Repository of robust and reliable toxicity data, preferably covering a broad chemical space
- For searching and determining the relationship with chemical structures
Prediction software/systems
- Models that predict toxicological effects based on the chemical structure
  - Rule or knowledge based (i.e. presence of structural alerts)
  - Statistical, fragment based
  - Quantitative SAR
  - Decision and regression trees
  - Artificial intelligence computer methods/software
Origins of adverse/toxicological effects and the different fields for in silico predictions
- Primary (on-)target (pharmacological action): target liability assessment, potential exaggerated pharmacological activity
- Secondary (off-)target (pharmacological action): off-target models for hERG and ~1700 other targets, including kinases
- Chemical and/or metabolite reactivity (molecular structure): genotoxicity, carcinogenicity, organ toxicity, developmental/reproductive toxicity
- ADME (Absorption, Distribution, Metabolism, Excretion): biotransformation/metabolites, transporters
- Physicochemical properties: phospholipidosis, phototoxicity
All of the above adverse effects could potentially be analyzed with SAR approaches.
Application of In Silico / Computational Toxicology in Drug
[Timeline: RESEARCH, DEVELOPMENT, MARKET (Target Identification, Lead Identification, Candidate Selection, Preclinical, Phase I/IIa, Phase IIb, Phase III, Regulatory Review, Phase IV)]
Target liability assessment
- Toxicological liabilities caused by primary (on-)target pharmacological action in different organs/species (i.e., exaggerated pharmacological activity)
- Target gene expression and pathway analyses (bioinformatics)
- Systems biology approach to identify organ toxicity
- Databases, data on knock-out mice and human mutations
Application of In Silico / Computational Toxicology in Drug
[Timeline: RESEARCH, DEVELOPMENT, MARKET (Target Identification, Lead Identification, Candidate Selection, Preclinical, Phase I/IIa, Phase IIb, Phase III, Regulatory Review, Phase IV)]
Lead optimization and candidate selection
- In silico toxicology identifies the potential toxicological liabilities of compounds that the candidate optimization program should focus on, to obtain safer, less toxic drug candidates
- Structure-related early safety assessments for different toxicity endpoints, especially genotoxicity
- Helps select assays/tests (in vitro/in vivo) for further investigation before candidate selection
Application of In Silico / Computational Toxicology in Drug
[Timeline: RESEARCH, DEVELOPMENT (Target Identification, Lead Identification, Candidate Selection, Preclinical)]
- In silico evaluation of chemical series for genotoxicity, hERG, hepatotoxicity, phospholipidosis, phototoxicity
- Lead series selected based on in silico results
- Assays shown along the timeline: genotoxicity (in silico, Ames, in vitro micronucleus), safety pharmacology, biochemical toxicology, reproductive toxicology
Application of In Silico / Computational Toxicology in Drug
[Timeline: RESEARCH, DEVELOPMENT, MARKET (Target ID, Lead Identification, Candidate Selection, Preclinical, Phase I/IIa, Phase IIb, Phase III, Regulatory Review, Phase IV)]
Exploratory toxicology / risk assessment
- Understanding of toxicity mechanisms, identification of possible follow-up testing, and supportive information for regulatory decisions
- In silico analysis of unique/major human metabolites
- Occupational / worker safety and occupational exposure band (OEB) classification: in silico analysis of synthesis intermediates (e.g., genotoxicity, skin sensitisation)
- Evaluation of potential genotoxic impurities (GTI)
Example of genetic toxicology studies in drug development
[Timeline: DISCOVERY, DEVELOPMENT, MARKET (Lead Identification, Candidate Identification, Preclinical, Phase I/IIa, Phase IIb, Phase III, Regulatory Review, Phase IV, marketed compounds)]
First-in-man package
- Ames test
- In vitro and in vivo chromosome damage assays in mammalian cells
Evaluation of potential major/unique human metabolites
- Case by case, generally in silico and in vitro assays
Evaluation of potential genotoxic impurities (and synthesis intermediates for occupational safety)
- In silico analysis
- Ames test (and, potentially, chromosome damage assays for occupational safety)
Impurities
- Refinement of the evaluation of potential genotoxic impurities
- Qualification, if needed
Human metabolites: in case of major/unique
Findings in carcinogenicity: may request non-standard studies
Occupational safety: safety data sheet, handling precautions
In case of issues: impurities, new data
In silico systems and SAR approaches could potentially be used at all steps in cases of unexpected genotoxic effects.
Rationale for the use of in silico models in the evaluation of genotoxicity
Sequence of events (diagram): exposure and metabolic activation > electrophilic reactive product and metabolite > interactions with DNA [and/or other cellular components] > DNA primary damage > DNA repair, apoptosis or replication > MUTATION > if other key events occur, adverse effects on human health (somatic cells, germ cells)
Methods addressing each step
- DNA reactivity and structural alerts: in silico analysis (e.g. DEREK, MultiCASE, Leadscope)
- DNA primary damage: DNA adducts, Comet assay (DNA strand breaks), unscheduled DNA synthesis (DNA repair)
- Gene mutations: Ames test in bacteria, assays at hprt and tk loci in mammalian cells, transgenic animals
- Chromosome damage: structural and numerical chromosome aberrations, micronuclei
Guidelines for genotoxic/mutagenic impurities in drugs
- ICH* Q3A/B for qualification of impurities, e.g. at least two in vitro genotoxicity assays when above 0.15%
- EMA (European agency)
  - Guideline on the limits of genotoxic impurities, June 2006
  - Questions & Answers documents: June 2008, December 2009, September 2010
- ICH M7, under preparation: guideline on assessment and control of DNA reactive (mutagenic) impurities in pharmaceuticals to limit potential carcinogenic risk (Step 2/3 document, February 2013)
*: International Conference on Harmonization
General strategy for the evaluation of potential genotoxic (mutagenic) impurities
1. List all compounds involved in synthesis as well as degradants and potential by-products
2. Genotoxicity evaluation based on in-house available data, literature and in silico analysis
   - Non-genotoxic/non-carcinogenic compound based on robust data: control as usual impurities (step 4)
   - Genotoxic/carcinogenic compound based on robust data: analytical work (step 4)
   - No data available: in silico analysis needed (step 3)
3. Structural alert for mutagenicity?
   - No: control as usual impurities (step 4)
   - Yes: Ames test; if negative, control as usual impurities; if positive, analytical work (step 4)
4. Outcome
   - Control as usual impurities (i.e. ICH Q3), or
   - Analytical work to secure contents below the Threshold of Toxicological Concern or compound-specific values (PDE or ADI*)
*: Permitted Daily Exposure or Acceptable Daily Intake
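The decision flow on this slide can be sketched as a small function. This is one reading of the flowchart, not an implementation of any guideline: the names are hypothetical, and the handling of a compound with an alert but no Ames result (treated as mutagenic until tested) is an assumption made here.

```python
from enum import Enum
from typing import Optional


class Action(Enum):
    CONTROL_AS_USUAL = "control as a usual impurity (ICH Q3)"
    CONTROL_BELOW_LIMIT = "analytical control below TTC or compound-specific limit"


def evaluate_impurity(
    robust_data_genotoxic: Optional[bool],
    structural_alert: Optional[bool] = None,
    ames_positive: Optional[bool] = None,
) -> Action:
    """robust_data_genotoxic: True/False if robust data exist, None if no data."""
    # Step 2: robust experimental or literature data decide directly
    if robust_data_genotoxic is not None:
        if robust_data_genotoxic:
            return Action.CONTROL_BELOW_LIMIT
        return Action.CONTROL_AS_USUAL
    # Step 3: no data, so an in silico analysis is required
    if structural_alert is None:
        raise ValueError("no data available: in silico analysis needed")
    if not structural_alert:
        return Action.CONTROL_AS_USUAL
    # Alert present: a negative Ames test overrides it.
    # Assumption: with no Ames result yet, the compound is treated as mutagenic.
    if ames_positive is False:
        return Action.CONTROL_AS_USUAL
    return Action.CONTROL_BELOW_LIMIT


print(evaluate_impurity(None, structural_alert=True, ames_positive=False).value)
```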
Threshold of toxicological concern
EMA guideline and Q&A for clinical development and registration
- 120 µg/day (1 day)
- 60 µg/day (<1 month)
- 20 µg/day (<3 months)
- 10 µg/day (<6 months)
- 5 µg/day (<12 months)
- 1.5 µg/day (>12 months/registration)
[Figure: calculated daily dose as a function of treatment days for a 10^-5 risk of cancer (from ICH M7 Step 2/3 document)]
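The staged limits listed above lend themselves to a simple lookup. The sketch below encodes exactly the six values from the slide; the 30-day month and the treatment of boundary durations (for example exactly 12 months, which the slide leaves open) are assumptions chosen on the conservative side.

```python
DAYS_PER_MONTH = 30  # assumption: the slide gives durations in months only

# (upper bound in months, limit in µg/day), taken from the slide
STAGED_LIMITS = ((1, 60.0), (3, 20.0), (6, 10.0), (12, 5.0))


def staged_ttc_ug_per_day(duration_days: float) -> float:
    """Return the staged threshold (µg/day) for a given treatment duration."""
    if duration_days <= 0:
        raise ValueError("duration must be positive")
    if duration_days <= 1:
        return 120.0  # single-day exposure
    for upper_months, limit in STAGED_LIMITS:
        if duration_days < upper_months * DAYS_PER_MONTH:
            return limit
    # 12 months or longer, including registration (boundary handled conservatively)
    return 1.5


for days in (1, 14, 60, 100, 200, 400):
    print(days, staged_ttc_ug_per_day(days))
```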
In Silico Toxicology for genotoxic impurities (GTI) analysis in drugs
- Currently accepted standard approach for impurities of clinical batches (e.g. below 0.15%)
  - Absence of structural alerts for mutagenicity: no further action
  - Compound considered to be non-mutagenic
- Probably the first time that the result of an in silico analysis is used to conclude that a compound is devoid of toxic activity and to stop testing
  - Up to now, in silico analysis was used to conclude, in case of alerts, that a compound is toxic by default without testing, e.g. Sanofi policy for occupational safety at early stages of development
- Requires high confidence in a negative prediction
  - High negative predictivity and sensitivity
In Silico assessment of potential GTIs

Prediction systems
- DEREK (knowledge based)
- Leadscope (statistical based)

Toxicity databases
Name | Description | Link
ESIS | European chemical Substances Information System, incl. CMR* classification (EU) | esis.jrc.ec.europa.eu
ECHA | European Chemicals Agency database for REACH | echa.europa.eu
IARC | International Agency for Research on Cancer, incl. carcinogenicity classification | www.iarc.fr
ICSAS | Informatics and Computational Safety Analysis Staff databases from the FDA | www.fda.gov/aboutfda/centersoffices/officeofmedicalproductsandtobacco/cder/ucm092150.htm
IRIS | Integrated Risk Information System (EPA), incl. carcinogenicity and mutagenicity classification | www.epa.gov/iris
NTP | National Toxicology Program | ntp.niehs.nih.gov
TOXNET | Toxicology Data Network | toxnet.nlm.nih.gov
IPCS INCHEM | International Programme on Chemical Safety | www.inchem.org/
JECDB | Japan Existing Chemical Data Base | dra4.nihs.go.jp/mhlw_data/jsp/searchpageeng.jsp
RTECS | Registry of Toxic Effects of Chemical Substances | www.cdc.gov/niosh/rtecs/default.html
CPDB | Carcinogenic Potency Database | potency.berkeley.edu
PharmaPendium | Database with toxicity data from FDA and EMA approval documents | Commercial (www.pharmapendium.com)
Leadscope | Database with toxicity data extracted from public databases/sources/journals | Commercial (www.leadscope.com)
VITIC | Database with toxicity data extracted from public databases/sources/journals | Commercial (www.lhasalimited.org/vitic_nexus)
Other sources | Toxicological journals, databases, safety data sheets, etc. |
* Carcinogenic, Mutagenic and Reprotoxic
Example In Silico Toxicology analysis
Compound: azo dyes like Aniline Yellow
Background
- Widely used as dyes for pigments, textiles, cosmetics, foods, etc.
- Many are regarded as non-toxic, although some have been found to be mutagenic and carcinogenic
[Figure: general structure of aromatic azo dyes (R1-N=N-R2) and structure of Aniline Yellow]
In silico toxicology analysis by
- Database search: VITIC as database example
- Prediction software
  - DEREK: example of a knowledge/rule-based system
  - Leadscope: example of a statistical, fragment-based system
VITIC - Search by identifiers (name, CAS number, other IDs)
[Screenshot annotations]
- Toxicity endpoints
- Compound information (e.g. name, CAS)
- Detailed toxicity data, organized in predefined fields
VITIC - Search by structure, substructures and structural similarity
[Screenshot annotations]
- Search by structure (similarity, substructure)
- All similar compounds tested positive in Ames
- Combination of different search terms possible
Prediction expert systems for the in silico assessment of potential GTIs
Knowledge/rule-based systems
- Existing toxicological knowledge implemented as rules in expert systems
- Rules/alerts/toxicophores: chemical structural features, substructures or scaffolds known to be associated with a given toxicological activity
- Toxicity prediction is based on the presence of structural alerts/toxicophores
Examples
- DEREK (Deductive Estimation of Risk from Existing Knowledge)
- ToxTree; OpenTox; OECD QSAR Toolbox
[Figure: structure of Aniline Yellow] Mutagenicity alert: aromatic azo compound
From other examples it is known that aromatic azo compounds can cause a positive test result in the Ames assay
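The principle of a knowledge-based system can be shown with a toy lookup. This is not how DEREK is implemented: real systems match substructure patterns against the chemical structure, whereas the sketch assumes each compound already carries a set of fragment labels (hypothetical names) and holds a single rule, the aromatic azo alert mentioned on the slide.

```python
from __future__ import annotations

# Rule base: structural alert -> associated toxicological endpoint.
# Only the alert named on the slide is included here.
ALERTS = {"aromatic azo": "mutagenicity"}


def find_alerts(fragments: set[str]) -> dict[str, str]:
    """Return every alert whose fragment is present in the compound."""
    return {frag: endpoint for frag, endpoint in ALERTS.items() if frag in fragments}


# Hypothetical fragment annotation of Aniline Yellow
aniline_yellow = {"aromatic azo", "primary aromatic amine", "benzene ring"}
# Hypothetical compound without any known alert
no_alert_compound = {"benzene ring", "alkyl chain"}

print(find_alerts(aniline_yellow))
print(find_alerts(no_alert_compound))
```

An empty result means only that no rule fired; whether that can be read as "non-mutagenic" depends on how complete the rule base is, which is the negative-predictivity question the validation slides address.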
Example of a knowledge/rule-based system for prediction: DEREK
[Screenshot annotations]
- List of DEREK alerts
- Background information for each alert: structure-activity (toxicity) relationship
- Examples of compounds
- References and other information
Prediction expert systems for the in silico assessment of potential GTIs
Statistical, fragment-based systems
- A training dataset of compounds with known toxicity is needed
- Structural fragments that statistically correlate with toxicological activity are identified from the training dataset
- Prediction is based on statistical comparison with fragments from the training dataset
Examples
- Leadscope
- MultiCASE
- SciQSAR
[Figure: example structures sharing common fragments]
Example of a statistical, fragment-based system for prediction: Leadscope Ames model
[Screenshot annotations]
- Ames model
- List of fragments/toxicophores
- Details of statistical analysis
- Positive compounds from the training dataset having the same toxicophore, and their results
Example of a statistical, fragment-based system for prediction: Leadscope
[Screenshot annotations]
- Leadscope models
- Predicted compound
- Negative prediction: probability < 0.5
- Positive prediction: probability > 0.5
- "Not in domain": compound not covered by the training dataset
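The three possible outcomes on this slide can be written as a small classification helper. It is a sketch of the decision rule only, not of Leadscope itself: the function name is hypothetical, `None` stands in for an out-of-domain compound, and a probability of exactly 0.5 (which the slide does not assign) is counted as positive here as a conservative assumption.

```python
from typing import Optional


def classify(probability: Optional[float], threshold: float = 0.5) -> str:
    """Map a model probability to 'positive', 'negative' or 'not in domain'."""
    if probability is None:
        # Compound not covered by the training dataset: no prediction is made
        return "not in domain"
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must lie between 0 and 1")
    # Assumption: the boundary value itself is treated as positive
    return "positive" if probability >= threshold else "negative"


for p in (0.12, 0.5, 0.83, None):
    print(p, classify(p))
```

Keeping "not in domain" separate from "negative" matters for the GTI use case, since only a genuine negative prediction can support the conclusion that no further testing is needed.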
Importance of Expert Evaluation/knowledge
Critical review needed
- For the prediction results from the different systems/software
- For the supportive data from literature and public databases
Assessment of the relevance of structural alerts for mutagenicity
- e.g. possible de-risking rules for some chemical sub-classes
Read-across approach
- Compounds from the training dataset that are structurally similar, have common structural features or belong to the same chemical category
Use of in-house data combined with public data
- For the development of complementary in-house in silico / computational systems/rules
Validation study for the in silico assessment of potential GTIs
The use of one or more in silico systems to identify structural alerts for mutagenicity
- A reliable approach to evaluate the mutagenicity of potential GTIs
The knowledge-based DEREK, the most used system
- Appropriate for GTI evaluation (high negative predictivity and sensitivity)
DEREK prediction can be further improved by using
- Public and in-house database searches, and other expert assessment approaches
- A secondary statistically-based system such as Leadscope or MultiCASE, based on public/in-house data
Validation study for the in silico assessment of potential GTIs: Example of Sanofi validation
Test dataset: all potential GTIs tested at Sanofi between 2009 and 2011 (269 compounds)
- Ames positive: 39 (15%); Ames negative: 230 (85%)
Models used for in silico prediction of mutagenicity
- Public models (FDA Leadscope Salmonella 2010): 3600-7500 compounds
- Internal models: Ames results from ~4200 compounds tested at Sanofi from 1990 to 2008

Prediction | Test result positive | Test result negative | Predictivity
positive | True Positives (TP) | False Positives (FP) | Positive predictivity = TP/(TP+FP), % correct positive predictions
negative | False Negatives (FN) | True Negatives (TN) | Negative predictivity = TN/(FN+TN), % correct negative predictions

N = TP + FN + FP + TN
Sensitivity = TP/(TP+FN), % correctly predicted positive compounds
Specificity = TN/(FP+TN), % correctly predicted negative compounds
Concordance = (TP+TN)/N, % correct overall predictions
Validation study for the in silico assessment of potential GTIs: Example of Sanofi validation

DEREK (knowledge-based system) alone
DEREK prediction | Ames positive | Ames negative | Predictivity
mutagenic | 28 (true positives) | 69 (false positives) | 29%
not mutagenic | 11 (false negatives) | 161 (true negatives) | 161/172 = 94%
Sensitivity 72%, specificity 70%, concordance 70%

DEREK + statistical system Leadscope (public Ames model)
DEREK + LS prediction | Ames positive | Ames negative | Predictivity
mutagenic | 30 (true positives) | 79 (false positives) | 28%
not mutagenic | 9 (false negatives) | 151 (true negatives) | 151/160 = 94%
Sensitivity 77%, specificity 66%, concordance 67%

DEREK + database search (expert knowledge)
DEREK + literature | Ames positive | Ames negative | Predictivity
mutagenic | 31 (true positives) | 65 (false positives) | 32%
not mutagenic | 8 (false negatives) | 165 (true negatives) | 165/173 = 95%
Sensitivity 79%, specificity 72%, concordance 73%

DEREK + statistical system Leadscope (public + internal Ames models)
DEREK + LS prediction (0.2) | Ames positive | Ames negative | Predictivity
mutagenic | 34 (true positives) | 87 (false positives) | 28%
not mutagenic | 5 (false negatives) | 143 (true negatives) | 143/148 = 97%
Sensitivity 87%, specificity 62%, concordance 66%
Validation study for the in silico assessment of potential GTIs: Example of Sanofi validation
When all approaches are combined for the evaluation of potential mutagenic impurities
- Knowledge/rule-based system: DEREK
- Statistically-based system: Leadscope (public + internal Ames models)
- Expert knowledge: database search and data analysis

DEREK + LS + literature | Ames positive | Ames negative | Predictivity
mutagenic | 37 (true positives) | 87 (false positives) | 30%
not mutagenic | 2 (false negatives) | 143 (true negatives) | 143/145 = 99%
Sensitivity 95%, specificity 62%, concordance 67%
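The performance figures reported in the validation slides follow directly from the four cell counts of each table. The sketch below recomputes them from the TP/FP/FN/TN values given in the slides, using the definitions from the validation set-up; the class and variable names are illustrative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Confusion:
    tp: int  # predicted mutagenic, Ames positive
    fp: int  # predicted mutagenic, Ames negative
    fn: int  # predicted not mutagenic, Ames positive
    tn: int  # predicted not mutagenic, Ames negative

    @property
    def n(self) -> int:
        return self.tp + self.fp + self.fn + self.tn

    def metrics(self) -> dict:
        return {
            "sensitivity": self.tp / (self.tp + self.fn),
            "specificity": self.tn / (self.fp + self.tn),
            "pos_predictivity": self.tp / (self.tp + self.fp),
            "neg_predictivity": self.tn / (self.fn + self.tn),
            "concordance": (self.tp + self.tn) / self.n,
        }


# Cell counts as reported in the validation slides (269 compounds each)
RESULTS = {
    "DEREK alone": Confusion(tp=28, fp=69, fn=11, tn=161),
    "DEREK + LS (public)": Confusion(tp=30, fp=79, fn=9, tn=151),
    "DEREK + database search": Confusion(tp=31, fp=65, fn=8, tn=165),
    "DEREK + LS (public + internal)": Confusion(tp=34, fp=87, fn=5, tn=143),
    "DEREK + LS + literature": Confusion(tp=37, fp=87, fn=2, tn=143),
}

for name, cm in RESULTS.items():
    summary = ", ".join(f"{k} {v:.0%}" for k, v in cm.metrics().items())
    print(f"{name}: {summary}")
```

Recomputing the values this way makes the trade-off visible: each added system raises sensitivity and negative predictivity, while specificity and positive predictivity stay low because the number of false positives grows.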
Conclusion and Perspectives
Use of in silico / computational toxicology
- Well established in drug
- Continuously improved (endpoints, chemical space, approaches)
- Need for robust datasets for the different endpoints and a broad chemical space to build reliable models
- Data sharing initiatives would help
In silico prediction of genotoxicity and mutagenicity
- Combination of different toxicity databases, prediction systems and expert knowledge
- High sensitivity and negative predictivity (for GTIs)
- Transparency needed (interpretation, de-risking rules, out of domain)
In silico toxicology is expected to play an important role in the future, possibly
- As a data surrogate and for better understanding of mechanisms
- With SAR combining chemical structures and pathways (off-target effects)
Thank you for your attention
Abstract
Structure-activity or (Q)SAR approaches for toxicological risk assessment during drug development
Véronique Thybaud, Sanofi, Vitry-sur-Seine; Alexander Amberg, Sanofi, Frankfurt

Models that evaluate structure-activity relationships are used throughout drug development to predict potential toxic effects, on the assumption that molecules with similar structures potentially share the same toxicological properties. Structure-activity relationship (SAR) prediction models, or expert systems, look for fragments or toxicophores predictive of a given toxic effect, either on the basis of pre-established rules or structural alerts, or by means of statistical calculations that compare the molecule under evaluation with molecules whose toxic properties are known. QSAR approaches, which rely on the identification of chemical descriptors and the application of mathematical and computational models, can predict the activity of an unknown molecule quantitatively. During the research process, (Q)SAR models are integrated into the screening and optimization of candidate molecules. Some adverse effects (genotoxicity, reproductive toxicity, carcinogenesis, skin irritation, hepatotoxicity, etc.) result from the formation of reactive species capable of interacting with macromolecules (DNA and proteins). This is why the reactivity of molecules is evaluated very early, in order to identify molecules bearing structural alerts, to schedule their evaluation in the appropriate tests as early as possible, and to optimize the chemical series. A similar reactivity-based approach is applied when evaluating molecules handled by workers, in the context of industrial hygiene.

Other adverse effects arise from the biological activity of the molecules, either exaggerated pharmacological effects ("on-target") or secondary effects ("off-target"), and can be related to the structure of the molecules. During drug development, (Q)SAR approaches, and in silico approaches in general, contribute to the understanding of adverse effects observed in animals and possibly in humans. Until now, in most cases, hypotheses built from (Q)SAR analyses had to be supported by experimental data, since they could not by themselves lead to a decision. Only very recently has the absence of a structural alert been considered sufficient to conclude that there is no risk, in the context of the evaluation of genotoxic (mutagenic) impurities potentially present in drugs (starting materials, reagents, synthesis intermediates, synthesis impurities, degradation products, etc.). First included in a European guideline, this point is currently being debated as part of the development of an international guideline (ICH M7). Using the results of an in silico analysis for regulatory purposes requires highly reliable predictions, in particular the assurance that molecules without a structural alert are indeed devoid of activity, here mutagenic activity. The use of two models based on different principles, one relying on pre-established rules (such as DEREK) and the other on a statistical approach (such as MultiCASE or Leadscope), is currently suggested. Prediction models are built from results obtained with previously tested molecules (here in the Ames test), which determine the applicability domain (the chemical and biological space covered). It is essential to make sure that the applicability domain is relevant to the molecule under evaluation.

The results used to build the models come from the literature (public models) or from internal data (internal models). In both cases, the quality of the data will have been verified. As the use of (Q)SAR models for risk assessment and risk management is expected to grow, in particular for ethical reasons, it is becoming essential to ensure the quality of the data used to build the models and to share these data, in order to broaden the applicability domains, in particular the chemical space covered, and to predict a larger number of adverse effects.