Scientific & technical presentation JChem Base version 5.3, February 2010
Introduction to JChem Base High-performance Java-based tools for storage, search and retrieval of chemical structures and associated data. The components can be integrated into web-based or standalone applications in association with other ChemAxon tools.
Structural overview (diagram, top to bottom) • Web browser • Web application • Application • JChem Base API: chemical logic, structure cache • JDBC driver: standard interface to the RDBMS • RDBMS (e.g. Oracle, MySQL, etc.): storage and security
Compatibility and integration
File formats: • SMILES • MDL molfile (v2000 and v3000) • MDL SDF • RXN • RDF • MRV • IUPAC name, InChI • Markush DARC • CDX
Integration: • extensive API for Java and .NET • JChem Cartridge for Oracle
Database engines: • Oracle • MySQL • MS SQL Server • PostgreSQL • MS Access • IBM DB2 • Derby • etc.
Operating systems: • Windows • Linux • Mac OS X • Solaris • etc.
JSP example application • Features: • Substructure, Superstructure, Full, Exact fragment, Similarity and Perfect search • Molecular Descriptor similarity search with descriptor coloring • Substructure hit alignment and coloring, inverse hit list • Chemical Terms filter • Import / Export • Export of hits • Insert / Modify / Delete structures • AJAX in JChem Webservices
Structure search features • Wide range of query atoms • Query properties • R-group queries • Full SMARTS support • Coordination compounds • Link nodes • Pseudo atoms, lone pairs • Relative stereo • Reaction search features • Hit coloring, position variation • Polymers See detailed information on structure search: www.chemaxon.com/conf/Structural_Search.ppt
Search options Some selected structure search options: • Stereo on/off • Ignore charge/isotope/radical/valence/polymers, etc. • Vague bond matching options • Chemical Terms filter • Tautomer search • Inverse hit list • Maximum search time / number of hits • Combine with non-structure conditions • Ordering of results • etc.
Performance (1) Compound registration: [chart not reproduced] Substructure search in PubChem (19.5 million compounds): [chart not reproduced] Test environment: JChem Base 5.2.2, Intel Quad Q6600 2.4 GHz, 8 GB RAM; Oracle 10.2.0.3
Performance (2) Similarity search: Tanimoto > 0.9 [chart not reproduced] Test environment: JChem Base 5.2.2, Intel Quad Q6600 2.4 GHz, 8 GB RAM; Oracle 10.2.0.
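The Tanimoto threshold used in this benchmark can be illustrated with a minimal sketch. It is not the JChem implementation: fingerprints are modelled as plain Python integers, and the function names, the 16-bit toy fingerprints and the table layout are assumptions made for illustration only.

```python
def tanimoto(fp_a: int, fp_b: int) -> float:
    """Tanimoto coefficient of two bit fingerprints stored as ints:
    (bits set in both) / (bits set in either)."""
    common = bin(fp_a & fp_b).count("1")
    total = bin(fp_a | fp_b).count("1")
    # Two empty fingerprints share nothing; defined here as 0.0 by choice.
    return common / total if total else 0.0


def similarity_search(query_fp: int, table: dict, threshold: float = 0.9) -> list:
    """Return (id, score) pairs scoring above the threshold, best first."""
    scored = [(mol_id, tanimoto(query_fp, fp)) for mol_id, fp in table.items()]
    hits = [(mol_id, score) for mol_id, score in scored if score > threshold]
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


# Hypothetical 16-bit fingerprints (real ones are much longer).
table = {
    "A": 0b1111_1111_1111_0000,  # identical to the query
    "B": 0b1111_1111_1110_0000,  # shares 11 of the query's 12 bits
    "C": 0b0000_0000_0000_1111,  # shares no bits with the query
}
query = 0b1111_1111_1111_0000
hits = similarity_search(query, table)
```

With these toy values, "A" scores 1.0, "B" scores 11/12 and passes the 0.9 threshold, and "C" is rejected.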
Markush structures • Markush structure registration and search • Markush features • R-groups • Atom lists, bond lists • Position variation bond • Link nodes and repeating units • Homology variation (alkyl, aryl, etc.) • Compatible Markush enumeration plugin
Administration with JChem Manager User interface for • creating tables • import • export • deleting rows • dropping tables Most functions are also available from the command line.
Standardization • Default standardization includes: • Hydrogen removal • Aromatization • Custom standardization can be specified for each table by supplying an XML configuration file at table creation or in the “Table Options” dialog of JChem Manager (jcman)
Custom Standardization Example [figure: structure before and after standardization, not reproduced] Standardizer: http://www.chemaxon.com/conf/Standardizer.ppt
The property table The property table stores information about JChem structure tables, including: • Fingerprint parameters • Custom standardization rules • Other table options and information More than one property table can be used; each property table represents a particular JChem environment.
Table types The table type controls which chemical structures are allowed and which operations are available: • Molecule • Reaction • Markush • Query • Any structure
Structural search in database A two-stage method provides optimal performance: • Rapid pre-screening reduces the number of possible hit candidates • Chemical Hashed Fingerprints are used for substructure and superstructure searches • Hash code is used for duplicate filtering (usually during compound registration) • A graph search algorithm then determines the final hit list
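The two-stage idea above can be sketched as follows. This is a toy illustration, not JChem code and not real cheminformatics: structures are plain strings, "substructure" is modelled as substring containment in place of a graph search, and the fingerprint hashes 2-character fragments into a 64-bit integer. All names (`fingerprint`, `substructure_search`, `FP_BITS`) are hypothetical.

```python
import zlib

FP_BITS = 64  # toy width; the slides mention 512-bit fingerprints by default


def fingerprint(text: str) -> int:
    """Hash every 2-character fragment of text onto one bit of an int."""
    fp = 0
    for i in range(len(text) - 1):
        fragment = text[i:i + 2].encode()
        fp |= 1 << (zlib.crc32(fragment) % FP_BITS)
    return fp


def substructure_search(query: str, table: dict) -> tuple:
    """Return (candidates, hits) for a query against {id: (text, fp)}."""
    query_fp = fingerprint(query)
    # Stage 1: cheap screen. A row can only match if its fingerprint
    # contains every bit set in the query fingerprint.
    candidates = [row_id for row_id, (_, fp) in table.items()
                  if (fp & query_fp) == query_fp]
    # Stage 2: exact check on the survivors only. Substring containment
    # stands in for the graph search of a real system.
    hits = [row_id for row_id in candidates if query in table[row_id][0]]
    return candidates, hits


rows = {1: "CCOC(=O)c1ccccc1", 2: "CCN", 3: "c1ccccc1O"}
table = {row_id: (text, fingerprint(text)) for row_id, text in rows.items()}
candidates, hits = substructure_search("c1ccccc1", table)
```

The screen can let false positives through (hash collisions), but it never discards a true match, because every fragment of a contained query is also a fragment of the target. That property is what allows the expensive exact check to run on the candidates alone.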
Structure Cache • Contains fingerprints for screening and ChemAxon Extended SMILES for atom-by-atom search (ABAS) • Instant access to the structures for the search process • Reduced load on the database server • Incremental update ensures minimum overhead after changes in the table • Small memory footprint due to • SMILES compression • Optimized storage technique • Approximately 100 MB of memory needed for 1 million typical drug-like structures (using the default 512-bit fingerprints)
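The memory figure quoted for the structure cache can be broken down with simple arithmetic. The sketch below assumes 1 MB = 10^6 bytes and treats everything beyond the fingerprint as a single remainder. The slides do not give this breakdown, so the split is an inference and `cache_budget` is a hypothetical helper.

```python
def cache_budget(total_bytes: int, n_structures: int, fp_bits: int) -> dict:
    """Split an overall cache size into per-structure byte figures."""
    per_structure = total_bytes / n_structures
    fingerprint_bytes = fp_bits // 8
    return {
        "per_structure": per_structure,
        "fingerprint": fingerprint_bytes,
        # Whatever is left would have to cover the compressed SMILES
        # and any bookkeeping overhead (assumption, not from the slides).
        "remainder": per_structure - fingerprint_bytes,
    }


# About 100 MB for 1 million structures with 512-bit fingerprints.
budget = cache_budget(100 * 10**6, 10**6, 512)
```

Under these assumptions each structure costs about 100 bytes, of which 64 bytes are the fingerprint itself, leaving roughly 36 bytes for the compressed structure string and overhead.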
Future plans • Graphical user interface for R-group decomposition • Arbitrary table structure (Java and .NET API for JChem index) • Maximum common substructure search type • Additional layer: JChem Server (later also as grid) • Compound registration system API
Summary ChemAxon’s JChem Base API provides sophisticated high performance tools for the developer to deal with chemical structures and associated data. Building on the JChem API is convenient, because: • Our various tools integrate seamlessly • Both high and low level API classes are available • Responsive developer-to-developer support
Links • JChem home page: http://www.chemaxon.com/products/jchem-base • Online tryout: http://www.chemaxon.com/jchem/examples.html • API documentation: http://www.chemaxon.com/jchem/doc/api/index.html • Brochure: www.chemaxon.com/brochures/JChemBase.pdf
Visit other technical presentations • MarvinSketch/View: http://www.chemaxon.com/MarvinSketch_View.ppt • MarvinSpace: http://www.chemaxon.com/MarvinSpace.ppt • Calculator Plugins: http://www.chemaxon.com/Calculator_Plugins.ppt • JChem Base: http://www.chemaxon.com/JChem_Base.ppt • JChem Cartridge: http://www.chemaxon.com/JChem_Cartridge.ppt • Standardizer: http://www.chemaxon.com/Standardizer.ppt • Screen: http://www.chemaxon.com/Screen.ppt • JKlustor: http://www.chemaxon.com/JKlustor.ppt • Fragmenter: http://www.chemaxon.com/Fragmenter.ppt • Reactor: http://www.chemaxon.com/Reactor.ppt