Screen
Ligand-based virtual screening
Presented by …, maintained by Miklós Vargyas
Last update: 13 April 2010
Screen Virtual screening by topological descriptors
Screen
Description of the product
Screen performs high-throughput virtual screening of compound libraries using similarity comparisons based on various molecular descriptors.
Availability
• JChem Base
• JChem Oracle cartridge
• Instant JChem
• Server version
• Standalone command-line application programs
• KNIME
• Pipeline Pilot
Key features
• Various 2D descriptors
  • ChemAxon chemical fingerprint (CCFP)
  • Pipeline Pilot ECFP/FCFP
  • ChemAxon pharmacophore fingerprint (CPFP)
  • BCUT
  • Scalars (logP, logD, Szeged index …)
  • Custom descriptors, in-house fingerprints
• Optimized similarity measures
  • Improve similarity prediction
  • Depend on the set of known actives
  • Give high enrichment ratios in virtual screening
• Multiple queries
  • 3 types of hypotheses
  • Combined hit lists
Benefits
• Versatile
  • Use various descriptors in your well-established model
  • Access your trusted in-house fingerprint in IJC, JCB, JCART
  • Easy integration into corporate discovery pipelines
  • Search chemical files directly, with no need to import structures into a database
  • New descriptors can be plugged into deployed systems
• Optimal
  • Consistent similarity scores
  • Smaller hit set
  • More focused library
Benefits
More consistent similarity scores
[Figure: dissimilarity values for the example structures. Regular Tanimoto: 0.57, 0.55, 0.47. Optimized Tanimoto: 0.20, 0.06, 0.28.]
Benefits
High enrichment ratio
• Fewer false hits
• Known actives are true positive hits (ACE inhibitors)
Results NPY-5 (pharmacophore similarity)
Results β2-adrenoceptor (pharmacophore similarity)
Case study at Axovan
• GPCR activity prediction
• Distinguishing between GPCR subclasses
Reference: Modest von Korff and Matthias Steger, "GPCR-Tailored Pharmacophore Pattern Recognition of Small Molecular Ligands", JCICS 2004, 44
Screen roadmap
• New molecular descriptors
  • ECFP/FCFP (in 5.4)
  • Shape descriptors (in 5.4)
• Hidden use of the optimiser
  • No-pain black-box approach
• Simultaneous multi-descriptor search
• Enhanced IJC integration
  • Easy descriptor configuration and generation
  • Similarity search type instead of descriptors, metrics and other unfriendly concepts
Screen roadmap
• GUI
  • New web interface (HTML/AJAX)
  • Desktop application for descriptor generation
• 3D shape similarity
  • Fast pre-filtering by 3D fingerprint
  • Alignment-based volumetric Tanimoto calculation
  • Scaffold hopping by maximizing topological dissimilarity and spatial similarity
A typical approach
[Diagram: the fingerprint of a single query is compared with the fingerprint of every target using a metric; the matching targets are returned as hits.]
ChemAxon's approach
[Diagram: the fingerprints of several queries are combined into a hypothesis fingerprint; an optimization step yields an optimized metric, which compares the hypothesis fingerprint with the fingerprint of every target; the matching targets are returned as hits.]
Performance
• Chemical fingerprint generation: 500/s
• Pharmacophore fingerprint generation
  • calculated: 80/s
  • rule-based: 200/s
• Screening: 12000/s
• Optimization: 10 s/metric
Hardware/software environment
• P4 3 GHz, 1 GB RAM
• Red Hat Linux 9
• Java 1.4.2
Implementations
• Use of various fingerprints and metrics in JSP: http://www.chemaxon.com/jchem/examples/jsp1_x/index.jsp
• UGM presentation by Aureus Pharma, "Improved Virtual Screening Strategies and Enrichment of Focused Libraries in Active Compounds Using Target-Oriented Databases": http://www.chemaxon.com/forum/viewpost2307.html
Molecular similarity
Two compounds are similar when their chemical, pharmacological or biological properties match. The more features two molecules have in common, the higher their similarity.
[Figure panels: Chemical, Pharmacophore]
Similarity measures
Quantitative assessment of the similarity of structures
• needs a numerically tractable form
• molecular descriptors, fingerprints, structural keys
These are sequences or vectors of bits or numeric values that can be compared by distance functions or similarity metrics.
Standard metrics
[Figure: two standard metrics evaluated for the same pair of structures, giving the values 0.68 and 21.93.]
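The slide does not preserve which metrics produced its two values. As a minimal sketch, assuming the Tanimoto coefficient for bit strings and the Euclidean distance for numeric vectors (two commonly used choices), the calculations look like this. The function names are illustrative and are not part of Screen.

```python
import math

def tanimoto(a: str, b: str) -> float:
    """Tanimoto coefficient of two equal-length bit strings ('0'/'1')."""
    if len(a) != len(b):
        raise ValueError("fingerprints must have the same length")
    both = sum(x == "1" and y == "1" for x, y in zip(a, b))
    either = sum(x == "1" or y == "1" for x, y in zip(a, b))
    # Two empty fingerprints are treated as identical (an assumption).
    return both / either if either else 1.0

def euclidean(u, v) -> float:
    """Euclidean distance of two equal-length numeric vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(u, v)))
```

Tanimoto is a similarity bounded by 0 and 1, while the Euclidean distance is unbounded, which is why values on such different scales can describe the same pair of structures.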
Topological chemical fingerprint
Hashed binary fingerprint
• encodes topological properties of the chemical graph: connectivity, edge labels (bond types), node labels (atom types)
• allows the comparison of two molecules with respect to their chemical structure
Construction
1. Find all 0, 1, …, n step walks in the chemical graph.
2. Generate a bit array for each walk, with a given number of bits set.
3. Merge the bit arrays with a logical OR operation.
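The three construction steps can be sketched on a toy graph. This is an illustration under stated assumptions, not ChemAxon's implementation: walks are restricted to simple paths (no atom visited twice), each path is hashed in both directions, SHA-256 stands in for the unspecified hash function, and the names `walks` and `fingerprint` are hypothetical.

```python
import hashlib

def walks(atoms, bonds, max_steps):
    """Yield a label string for every simple path with 0..max_steps bonds."""
    neighbours = {i: [] for i in range(len(atoms))}
    for (i, j), order in bonds.items():
        neighbours[i].append((j, order))
        neighbours[j].append((i, order))

    def extend(path, label):
        yield label
        if len(path) - 1 < max_steps:          # bonds used so far
            for nxt, order in neighbours[path[-1]]:
                if nxt not in path:
                    yield from extend(path + [nxt], label + order + atoms[nxt])

    for start in range(len(atoms)):
        yield from extend([start], atoms[start])

def fingerprint(atoms, bonds, max_steps=3, length=64, bits_per_walk=2):
    """OR together the bits hashed from every path label."""
    fp = 0
    for label in walks(atoms, bonds, max_steps):
        digest = hashlib.sha256(label.encode()).digest()
        # Up to bits_per_walk bits are set per path (hash bytes may collide).
        for k in range(bits_per_walk):
            fp |= 1 << (digest[k] % length)
    return fp

# Heavy atoms of a C-C-O chain; "-" marks a single bond.
chain = fingerprint(["C", "C", "O"], {(0, 1): "-", (1, 2): "-"})
print(format(chain, "064b"))
```

Because every path of a substructure is also a path of the larger structure, the bits of the substructure fingerprint are a subset of the bits of the full fingerprint, which is what makes such fingerprints usable as a fast pre-filter.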
Construction of chemical fingerprint
[Figure: example structure with atoms C, C, O and six explicit hydrogens.]
Chemical similarity
0100010100010100010000000001101010011010100000010100000000100000
0100010100010100010000000001101010011010100000000100000000100000
Topological pharmacophore fingerprint
• encodes pharmacophore properties of molecules as frequency counts of pharmacophore point pairs at given topological distances
• allows the comparison of two molecules with respect to their pharmacophore
Construction
1. Perceive pharmacophoric features.
2. Map pharmacophore point types to atoms.
3. Calculate the length of the shortest path between each pair of atoms.
4. Assign a histogram to every pharmacophore point pair and count the frequency of the pair with respect to its distance.
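Steps 3 and 4 of the construction can be sketched as follows, assuming that feature perception (steps 1 and 2) has already assigned a one-letter type to each atom. The type letters, the `max_distance` cut-off and the function names are assumptions made for this illustration.

```python
from collections import Counter, deque
from itertools import combinations

def shortest_paths(n_atoms, bonds):
    """Breadth-first search from every atom; returns dist[src][dst] in bonds."""
    adj = {i: [] for i in range(n_atoms)}
    for i, j in bonds:
        adj[i].append(j)
        adj[j].append(i)
    dist = {}
    for src in range(n_atoms):
        seen = {src: 0}
        queue = deque([src])
        while queue:
            cur = queue.popleft()
            for nxt in adj[cur]:
                if nxt not in seen:
                    seen[nxt] = seen[cur] + 1
                    queue.append(nxt)
        dist[src] = seen
    return dist

def pharmacophore_histogram(types, bonds, max_distance=6):
    """Count typed atom pairs per topological distance.

    types[i] is a one-letter pharmacophore type such as "A" (acceptor),
    "D" (donor) or "H" (hydrophobic), or None for atoms without a type.
    """
    dist = shortest_paths(len(types), bonds)
    counts = Counter()
    for i, j in combinations(range(len(types)), 2):
        if types[i] is None or types[j] is None:
            continue
        d = dist[i].get(j)                      # None if not connected
        if d is not None and d <= max_distance:
            pair = "".join(sorted((types[i], types[j])))
            counts[(pair, d)] += 1
    return counts

# Four-atom chain: acceptor, untyped atom, donor, acceptor.
print(pharmacophore_histogram(["A", None, "D", "A"], [(0, 1), (1, 2), (2, 3)]))
```

Each key such as `("AA", 3)` corresponds to one histogram bin, an acceptor-acceptor pair three bonds apart in this example.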
Pharmacophore perception
Rule-based approach
[Figure labels: acceptor, donor, donor]
Rule 1: The pharmacophore type of an atom is acceptor if
• it is a nitrogen, oxygen or sulfur, and
• it is not an amide nitrogen or sulfur, and
• it is not an aniline nitrogen, and
• it is not a sulfonyl sulfur, and
• it is not a nitro group nitrogen.
Exceptions to simple rules
[Figure: N-cyano-methyl piperidine, with one atom labeled donor and the sp2 atom marked.]
• Exceptions require extra rules
• A large number of rules causes maintenance and performance problems
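Effect of pH
[Figure: the same structure at pH = 7 and at pH = 1, with atoms labeled acceptor and donor.]
• pH dependence requires pH-specific rules
• A large number of rules causes maintenance and performance problems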
Pharmacophore perception
Calculation-based approach
Step 1: estimation of pKa, which allows the determination of the protonation state of ionizable groups at the given pH
Step 2: partial charge calculation
Step 3: hydrogen bond donor/acceptor recognition
Step 4: aromatic perception
Step 5: pharmacophore property assignment
[Figure labels: acceptor, negatively charged acceptor, acceptor and donor, hydrophobic, none]
Pharmacophore fingerprint
Pharmacophore type coloring: acceptor, donor, hydrophobic, none.
Fuzzy smoothing
[Figure: four histograms over the bins AA1 to AA6 with counts from 0 to 2, shown in two panels labeled DE = 1.41 and DE = 0.45.]
Virtual screening using fingerprints
[Diagram: the fingerprint of a single query is compared with the fingerprint of every target using a metric; the matching targets are returned as hits.]
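The single-query screening shown on this slide can be sketched as a loop over target fingerprints. The sketch assumes Tanimoto dissimilarity (1 minus the Tanimoto coefficient) as the metric and a fixed dissimilarity threshold as the hit criterion; both choices and all names are assumptions for illustration.

```python
def tanimoto_dissimilarity(a: int, b: int) -> float:
    """1 - Tanimoto coefficient for fingerprints stored as integers."""
    union = bin(a | b).count("1")
    if union == 0:
        return 0.0                      # two empty fingerprints: identical
    return 1.0 - bin(a & b).count("1") / union

def screen(query: str, targets: dict, threshold: float):
    """Return (name, dissimilarity) of targets within the threshold, best first."""
    q = int(query, 2)
    hits = []
    for name, bits in targets.items():
        d = tanimoto_dissimilarity(q, int(bits, 2))
        if d <= threshold:
            hits.append((name, d))
    return sorted(hits, key=lambda hit: hit[1])

library = {"t1": "1100", "t2": "1000", "t3": "0011"}
print(screen("1100", library, threshold=0.5))
```

Storing fingerprints as integers lets the bitwise AND and OR operate on the whole fingerprint at once, which is the usual reason this kind of screening is fast.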
Multiple query structures
[Diagram: the fingerprints of several queries are combined into a hypothesis fingerprint, which is compared with the fingerprint of every target using a metric; the matching targets are returned as hits.]
Hypothesis fingerprints
Advantages
• allow faster operation
• compile the features common to the individual actives
• reduce noise
Hypothesis types
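The hypothesis types themselves are not preserved on this slide. As a hedged illustration of the idea of compiling common features, the sketch below builds a consensus fingerprint that keeps a bit only if it is set in at least a given fraction of the query fingerprints. The function name and the `min_fraction` parameter are hypothetical and do not necessarily correspond to any of the three hypothesis types in Screen.

```python
def hypothesis(fingerprints, min_fraction=1.0):
    """Consensus bit string of several equal-length bit strings.

    A position is set in the result if it is set in at least
    min_fraction of the inputs (1.0 = strict intersection).
    """
    if not fingerprints:
        raise ValueError("at least one fingerprint is required")
    length = len(fingerprints[0])
    if any(len(fp) != length for fp in fingerprints):
        raise ValueError("fingerprints must have the same length")
    needed = min_fraction * len(fingerprints)
    out = []
    for pos in range(length):
        ones = sum(fp[pos] == "1" for fp in fingerprints)
        out.append("1" if ones >= needed else "0")
    return "".join(out)

actives = ["1101", "1101", "1011"]
print(hypothesis(actives))                    # bits shared by all actives
print(hypothesis(actives, min_fraction=0.5))  # bits shared by a majority
```

A single hypothesis fingerprint is screened once against the library instead of once per active, which is where the speed advantage comes from; dropping bits that only one active has is one way to reduce noise.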
The need for optimization Too many hits
The need for optimization
Inconsistent dissimilarity values
[Figure: dissimilarity values 0.57, 0.55 and 0.47 for the example structures.]
Parametrized metrics
[Formulas: two parametrized metrics; one is annotated with an asymmetry factor and weights, the other with an asymmetry factor and a scaling factor.]
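The formulas on this slide did not survive, so the following is only a guess at how an asymmetry factor and per-feature weights could enter a metric; it is not ChemAxon's formula. The sketch uses a Euclidean distance in which the squared difference of each feature is multiplied by a weight and by a factor that depends on whether the query or the target has the larger value.

```python
import math

def weighted_asymmetric_euclidean(query, target, weights, asymmetry=0.5):
    """Hypothetical parametrized Euclidean distance.

    asymmetry scales features where the query exceeds the target,
    1 - asymmetry scales features where the target exceeds the query.
    """
    if not (len(query) == len(target) == len(weights)):
        raise ValueError("query, target and weights must have the same length")
    total = 0.0
    for q, t, w in zip(query, target, weights):
        diff = q - t
        factor = asymmetry if diff > 0 else 1.0 - asymmetry
        total += w * factor * diff * diff
    return math.sqrt(total)

# With asymmetry 0.5 and all weights 2 this reduces to the plain Euclidean distance.
print(weighted_asymmetric_euclidean([2, 0], [0, 1], [2, 2], asymmetry=0.5))
```

With `asymmetry=1.0` only features missing from the target are penalized, which suits substructure-like queries; the weights let the optimizer emphasize the features that separate actives from inactives.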
Optimization of metrics
Step 1: optimize parameters for maximum enrichment
Step 2: validate metrics over an independent test set
[Diagram labels: known actives, selected targets, query set, training set, test set]
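The slides do not define how enrichment is computed. A minimal sketch, assuming the common definition of the enrichment factor (the fraction of actives among the hits divided by the fraction of actives in the whole screened set), is shown below; the function and parameter names are illustrative.

```python
def enrichment_factor(hit_ids, active_ids, library_size):
    """Ratio of the active rate among hits to the active rate in the library."""
    hits = set(hit_ids)
    actives = set(active_ids)
    if not hits or not actives:
        return 0.0
    hit_rate = len(hits & actives) / len(hits)
    base_rate = len(actives) / library_size
    return hit_rate / base_rate

# Made-up example: 10 actives in a set of 1000, 20 hits containing 5 actives.
actives = range(10)
hits = list(range(5)) + list(range(100, 115))
print(enrichment_factor(hits, actives, library_size=1000))
```

A value of 1 means the screen does no better than random selection. Evaluating the optimized parameters on an independent test set, as in Step 2, guards against parameters that only fit the training actives.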
Optimization of metrics
Step 1: optimize parameters for maximum enrichment
[Diagram: the query fingerprint derived from the query set is compared with the training set using the parametrized metric.]
Optimization of metrics
[Diagram: search over the metric parameters v1, v2, v3, …, vi, …, vn. Legend: potential variable value, running variable value, temporarily fixed value, final value.]
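The legend on this slide (running, temporarily fixed and final values for v1 … vn) suggests that parameters are varied one at a time. Assuming that reading, a coordinate-wise search can be sketched as below; the actual optimizer in Screen is not described in the slides, and the function name and arguments are hypothetical.

```python
def coordinate_search(objective, start, candidates, sweeps=3):
    """Maximize objective(params) by varying one parameter at a time.

    While parameter i runs over the candidate values, all other
    parameters stay temporarily fixed at their current best values.
    """
    params = list(start)
    best = objective(params)
    for _ in range(sweeps):
        improved = False
        for i in range(len(params)):
            for value in candidates:
                trial = params[:i] + [value] + params[i + 1:]
                score = objective(trial)
                if score > best:
                    params, best, improved = trial, score, True
        if not improved:
            break                       # a full sweep changed nothing
    return params, best

# Toy objective with its maximum at (0.5, 0.25); a real run would score enrichment.
def toy(p):
    return -((p[0] - 0.5) ** 2 + (p[1] - 0.25) ** 2)

print(coordinate_search(toy, [0.0, 0.0], [0.0, 0.25, 0.5, 0.75, 1.0]))
```

In a real run the objective would screen the training set with the trial parameters and return the resulting enrichment.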
Optimization of metrics
Step 2: validate metrics over an independent test set
[Diagram: the query fingerprint derived from the query set is compared with the test set using the optimized metric.]
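Results of optimization
1. Similar structures get closer
[Figure: dissimilarity values of the example structures: 0.57, 0.55 and 0.47 before optimization; 0.20, 0.06 and 0.28 after optimization.]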
Results of optimization
2. Hit set size reduced
Active set: 18 mGlu-R1 antagonists
Target set: 10000 randomly selected drug-like structures
Results of Optimization 3. Higher enrichment
Results of optimization
4. Top ranked structures are spikes
• offers a more intuitive way to evaluate the efficiency of screening
• based on sorting random set hits and known actives by dissimilarity value and counting the number of random set hits preceding each active in the sorted list
[Plot: number of spikes retrieved versus number of virtual hits; dissimilarity values shown: 0.014, 0.015, 0.017, 0.020, 0.022, 0.023, 0.027, 0.041, 0.043.]
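The counting procedure described in the second bullet can be sketched as follows. The scores in the example are made up for illustration, and the tie-breaking rule (a random hit with the same score is counted as preceding the active) is an assumption.

```python
def random_hits_before_each_active(active_scores, random_scores):
    """For each active, count the random-set hits ranked ahead of it.

    Scores are dissimilarity values, so lower means better ranked.
    On ties the random hit sorts first (False < True), a pessimistic choice.
    """
    ranked = sorted(
        [(score, True) for score in active_scores]
        + [(score, False) for score in random_scores]
    )
    preceding = []
    random_seen = 0
    for _score, is_active in ranked:
        if is_active:
            preceding.append(random_seen)
        else:
            random_seen += 1
    return preceding

# Made-up dissimilarity values.
print(random_hits_before_each_active([0.1, 0.3, 0.6], [0.2, 0.25, 0.9]))
```

Plotting the position of each active against the returned counts gives a curve of spikes retrieved versus virtual hits, like the one on this slide.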
Results ACE (pharmacophore similarity)
Results NPY-5 (pharmacophore similarity)
Results β2-adrenoceptor (pharmacophore similarity)
3D flexible search
Expected top performance: 200 structures/s