Nanoinformatics: Advancing in silico Cancer Research. David E. Jones. John D. Morgan Award. Research partially supported by NLM Training Grant #T15LM007124.
What is Nanotechnology? • The study of controlling and manipulating matter at the atomic or molecular level • Focuses on the development of materials, devices, and other structures at the nanoscale • Very diverse field that bridges multiple sciences • Molecular Biology • Organic Chemistry • Molecular Physics • Materials Science http://www.nanoinstitute.utah.edu/
Nanomedicine Defined • The medical application of nanotechnology used in the diagnosis, treatment, and prevention of diseases in the clinical setting
Science-to-Informatics Clinical Informatics Bioinformatics ?
Nanoinformatics • Defined in 2007 by the United States National Science Foundation • Improve research in the field of nanotechnology by using informatics techniques and tools on nanoparticle data and information http://www.nsf.gov/
Background: Nanoinformatics • National Nanotechnology Initiative • Enhance quality and availability of data • Data acquisition, analysis, and sharing • Expand theory, modeling, and simulation • Structural and predictive models • Informatics infrastructure • Semantic search and sharing of data/models • Web-enabled tools for collaboration http://www.nano.gov/node/681
Nanomedicine Areas of Focus In Vitro Detection Nanocarriers Theranostics http://www.nanotech-now.com http://www.universityofcalifornia.edu http://www.wikipedia.org/
Why are Nanocarriers so Important? • Nanomedicine delivery devices are important to the future of cancer treatment • Promising due to their properties • Suitable size, high solubility, and ability to change design Tanner P, et al. Polymeric Vesicles: From Drug Carriers to Nanoreactors and Artificial Organelles. 2011.
Why are Nanocarriers so Important? • Enhanced permeability and retention (EPR) effect Park K. Polysaccharide-based near-infrared fluorescence nanoprobes for cancer diagnosis. 2012. http://krauthammerlab.med.yale.edu/imagefinder/Figure.external?sp=443431&state:Home=BrO0ABXcTAAAAAQAADHNlYXJjaFN0cmluZ3QAEG1pUiogYnJhaW4gaGVhcnQ%3D
Types of Nanocarriers Cho K, et al. Therapeutic Nanoparticles for Drug Delivery in Cancer. 2008.
Poly(amido amine) Dendrimers • PAMAM dendrimers are particularly promising • Have potential for oral delivery • Cancer drugs can bind to the surface and interior of the molecule • The molecule's surface can easily be modified http://www.dendritech.com
Design Challenges for Nanocarriers http://bioserv.rpbs.univ-paris-diderot.fr/services/FAF-Drugs/admetox.html
For Small Molecule Pharmaceutics • Well-known in silico approaches exist • Quantitative structure-activity relationships (QSAR) • Analyze the structures and functions of pharmaceutical and chemical compounds • Used for many different bioactive molecules in the fields of medicinal chemistry and cheminformatics • This method has seen limited application to empirically calculating the biochemical properties of nanoparticles
Nanoinformatics Challenges • These approaches have not been applied to nanocarriers for several reasons • Availability of nanoparticle data • Actual atomic size of the nanoparticle structures • Computational capability and algorithms http://www.nanoinstitute.utah.edu/
Ultimate Goal of this Research • Demonstrate that in silico aided design of nanocarriers is possible by developing and adapting advanced informatics techniques • Utilize state of the art data mining and machine learning techniques to develop a model linking PAMAM dendrimer cytotoxicity to molecular descriptors and structure of the nanoparticle
Where Do We Start? • Availability of Nanoparticle Data • Databases containing information relevant to biomedical nanoparticles are critical for secondary uses such as data mining and predictive modeling
caNanoLab • Database containing information relevant to nanomedicine on nanoparticles and their properties • Developed by the National Cancer Institute for sharing nanoparticle information https://cananolab.nci.nih.gov/caNanoLab/
caNanoLab • Issues • Limited number of nanoparticles (neither all-inclusive nor current) • Incomplete information regarding the chemical and physical properties of nanoparticles • No simple way to download the data to apply machine learning or statistical analyses • The system cannot be queried, and no data model exists to compare the properties of the molecule to its biochemical activity
Data Not Easily Accessible • Availability of nanoparticle data • To our knowledge, there is no authoritative, up-to-date database • Manual extraction is not feasible
Natural Language Processing (NLP) • Information extraction method • Used to automatically extract information from an unstructured (free-text) document • Shown to be successful in extracting information from related biomedical fields http://www.conversational-technologies.com/nldemos/nlDemos.html
Nano-NLP • Garcia-Remesal, Maojo, and colleagues • Text classification method • Identified: • Nanoparticle names • Routes of exposure • Toxic effects • Particle targets • Successful, but qualitative, not quantitative
Our Approach • Two-Step process Text Classification Text Extraction
Text Extraction Purpose • Extract numeric values associated with PAMAM dendrimer properties from the cancer nanomedicine literature • NanoSifter • 10 properties taken from the NanoParticle Ontology (NPO) • Hydrodynamic diameter, particle diameter, molecular weight, zeta potential, cytotoxicity, IC50, cell viability, encapsulation efficiency, loading efficiency, and transfection efficiency Jones DE, Igo S, Hurdle J, Facelli JC. Automatic Extraction of Nanoparticle Properties Using Natural Language Processing: NanoSifter an Application to Acquire PAMAM Dendrimer Properties. PLoS ONE. 2014;9(1):e83932. Epub 2014/01/07.
NanoSifter Observations • Recall vs. precision • Desire a higher recall because this means that we are capturing most instances (i.e. missing very few in the literature) • Tradeoff is that the number of false positives increases which in turn reduces the precision
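The recall/precision tradeoff described on this slide can be made concrete with a small calculation. The sketch below is illustrative only: the function name and all counts are hypothetical and are not results reported for NanoSifter.

```python
def precision_recall(true_positives, false_positives, false_negatives):
    """Return (precision, recall) from raw extraction counts."""
    extracted = true_positives + false_positives   # everything the system returned
    relevant = true_positives + false_negatives    # everything it should have returned
    precision = true_positives / extracted if extracted else 0.0
    recall = true_positives / relevant if relevant else 0.0
    return precision, recall

# Hypothetical counts for 100 property values present in the literature.
# A strict extractor misses many values but returns few wrong ones.
strict = precision_recall(true_positives=60, false_positives=10, false_negatives=40)
# A lenient extractor captures nearly everything but adds many false positives.
lenient = precision_recall(true_positives=95, false_positives=105, false_negatives=5)

print("strict  precision=%.2f recall=%.2f" % strict)
print("lenient precision=%.2f recall=%.2f" % lenient)
```

With these made-up counts the lenient extractor gains recall at the cost of precision, which is the tradeoff the slide accepts in favor of missing few instances.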
NanoSifter Limitations • Data extracted by our method is not always directly associated with a dendrimer nanoparticle • Only pairs a nanoparticle property term with a single numeric value annotation before and after itself (co-reference resolution) • Cannot extract data from tables and figures
NanoSifter Discussion • Next steps • Continue work on text classification methods to improve the precision of the system • Expand the property terms and numeric values that the system targets • Annotate and extract information from other subclasses of nanoparticles • Add negation analysis to the system
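To illustrate the pairing limitation, here is a minimal sketch of matching a property term with the nearest numeric value before and after it. It is not the NanoSifter implementation: the regular expression, the `pair_property_values` function, and the example sentence are assumptions made for illustration, and the term list is a subset of the properties named earlier.

```python
import re

# Subset of the property terms listed in the talk; matching is case-insensitive.
PROPERTY_TERMS = ["hydrodynamic diameter", "zeta potential", "molecular weight"]
# Hypothetical, deliberately simple pattern for integers and decimals.
NUMBER = re.compile(r"[-+]?\d+(?:\.\d+)?")

def pair_property_values(sentence):
    """Pair each property term with the closest number before and after it."""
    lowered = sentence.lower()
    numbers = [(m.start(), m.end(), float(m.group())) for m in NUMBER.finditer(sentence)]
    pairs = []
    for term in PROPERTY_TERMS:
        for match in re.finditer(re.escape(term), lowered):
            before = [n for n in numbers if n[1] <= match.start()]
            after = [n for n in numbers if n[0] >= match.end()]
            pairs.append({
                "property": term,
                "value_before": before[-1][2] if before else None,
                "value_after": after[0][2] if after else None,
            })
    return pairs

# Hypothetical sentence, not taken from the literature.
pairs = pair_property_values("G4 PAMAM dendrimers showed a zeta potential of 25.3 mV.")
print(pairs)
```

In this example the value after the term is the intended measurement, while the value before it is the generation number from "G4". That kind of spurious pairing is one way a window-based approach trades precision for recall.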
Text Classification Purpose • Identify and annotate entities in the unstructured nanomedicine literature • Augment the text extraction method • Improve the precision of extracted property data
Now Have the Necessary Data… • Data mining and predictive modeling • Previous studies • Liu et al. analyzed a number of attributes of a variety of nanoparticles in order to predict post-fertilization mortality in zebrafish • Horev-Azaria and colleagues used predictive modeling to explore the effect of cobalt-ferrite nanoparticles on the viability of seven different cell lines • This method has not been applied to empirically calculate a prediction of the cytotoxicity of PAMAM dendrimers
In Silico Platform Jones DE, Ghandehari H, Facelli JC. Data Mining in Nanomedicine: Predicting Toxicity of PAMAM Dendrimers by Molecular Descriptors and Structure. Submitted 2014.
PAMAM Dendrimers G4 G3
Classification Analysis • Initial analysis
Classification Analysis • Feature selection analysis
Discussion • Greatest prediction accuracies were achieved after supplementing the expert selected features with experimental conditions • The properties presented in the decision tree diagram represent the more general properties of charge, size, and concentration • Experimentally, these properties have been hypothesized to be primary causes of cytotoxicity
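As a rough illustration of how a decision tree chooses among properties such as charge, size, and concentration, the sketch below searches for the single split that minimizes weighted Gini impurity. It is not the model from this work: the feature names, the six data rows, and the toxicity labels are invented toy values, and a full tree would repeat this search recursively on each branch.

```python
def gini(labels):
    """Gini impurity of a list of binary labels (1 = cytotoxic, 0 = not)."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2 * p * (1 - p)

def best_split(rows, labels, feature_names):
    """Return (feature, threshold, weighted_gini) for the best single split."""
    best = None
    n = len(rows)
    for j, name in enumerate(feature_names):
        values = sorted(set(r[j] for r in rows))
        # Candidate thresholds are midpoints between consecutive distinct values.
        for lo, hi in zip(values, values[1:]):
            threshold = (lo + hi) / 2
            left = [y for r, y in zip(rows, labels) if r[j] <= threshold]
            right = [y for r, y in zip(rows, labels) if r[j] > threshold]
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if best is None or score < best[2]:
                best = (name, threshold, score)
    return best

# Toy, made-up data: (surface charge, diameter in nm, concentration in mM).
feature_names = ["surface_charge", "diameter_nm", "concentration_mM"]
rows = [
    (64, 4.5, 0.01), (64, 4.5, 1.0),
    (32, 3.6, 0.01), (32, 3.6, 1.0),
    (0, 4.5, 1.0), (0, 3.6, 0.01),
]
labels = [0, 1, 0, 1, 0, 0]

best = best_split(rows, labels, feature_names)
print(best)
```

On this toy set the concentration split scores best, followed by a charge split; the ordering says nothing about real dendrimer data and only shows the mechanics of impurity-based feature choice.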
Conclusion • The results indicate that data mining and machine learning can be used to predict cytotoxicity and cell viability of PAMAM dendrimers on Caco-2 cells with good accuracy • Nanoinformatics methods could be implemented to significantly reduce the search space necessary to create suitable PAMAM dendrimers which exhibit less cytotoxicity
References
1. Jain K. The Handbook of Nanomedicine. 1st ed. Totowa, New Jersey: Humana; 2008.
2. Staggers N, McCasky T, Brazelton N, Kennedy R. Nanotechnology: the coming revolution and its implications for consumers, clinicians, and informatics. Nursing outlook. 2008;56(5):268-74. Epub 2008/10/17.
3. de la Iglesia D, Maojo V, Chiesa S, Martin-Sanchez F, Kern J, Potamias G, et al. International efforts in nanoinformatics research applied to nanomedicine. Methods of information in medicine. 2011;50(1):84-95. Epub 2010/11/19.
4. Thomas DG, Pappu RV, Baker NA. NanoParticle Ontology for cancer nanotechnology research. J Biomed Inform. 2011;44(1):59-74. Epub 2010/03/10.
5. National Cancer Institute. caNanoLab. 2011 [cited 2011]; [Welcome to the cancer Nanotechnology Laboratory (caNanoLab) portal. caNanoLab is a data sharing portal designed to facilitate information sharing in the biomedical nanotechnology research community to expedite and validate the use of nanotechnology in biomedicine. caNanoLab provides support for the annotation of nanomaterials with characterizations resulting from physico-chemical and in vitro assays and the sharing of these characterizations and associated nanotechnology protocols in a secure fashion.]. Available from: https://cananolab.nci.nih.gov/caNanoLab/.
6. Hunter L, Lu Z, Firby J, Baumgartner WA Jr, Johnson HL, Ogren PV, et al. OpenDMAP: an open source, ontology-driven concept analysis engine, with applications to capturing knowledge regarding protein transport, protein interactions and cell-type-specific gene expression. BMC bioinformatics. 2008;9:78. Epub 2008/02/02.
7. Garcia-Remesal M, Garcia-Ruiz A, Perez-Rey D, de la Iglesia D, Maojo V. Using nanoinformatics methods for automatically identifying relevant nanotoxicology entities from the literature. BioMed research international. 2013;2013:410294. Epub 2013/03/20.
8. Cunningham H, et al. Text Processing with GATE: University of Sheffield Department of Computer Science; 2011.
9. Yang Y. An Evaluation of Statistical Approaches to Text Categorization. Information Retrieval. 1999;1(1-2):69-90.
10. Tropsha A, Golbraikh A. Predictive QSAR modeling workflow, model applicability domains, and virtual screening. Current pharmaceutical design. 2007;13(34):3494-504. Epub 2008/01/29.
11. Liu X, Tang K, Harper S, Harper B, Steevens JA, Xu R. Predictive modeling of nanomaterial exposure effects in biological systems. International journal of nanomedicine. 2013;8 Suppl 1:31-43. Epub 2013/10/08.
12. Horev-Azaria L, Baldi G, Beno D, Bonacchi D, Golla-Schindler U, Kirkpatrick JC, et al. Predictive toxicology of cobalt ferrite nanoparticles: comparative in-vitro study of different cellular models using methods of knowledge discovery from data. Particle and fibre toxicology. 2013;10:32. Epub 2013/07/31.
13. ChemAxon, Berry I, Ruyts B. Future-proofing Cheminformatics Platforms. 2012 10/31/2013:[1-16 pp.]. Available from: http://www.chemaxon.com/wp-content/uploads/2012/04/Future_proofing_cheminformatics_platforms.pdf.
14. ChemAxon Ltd. Marvin. 2013.
15. Witten I, Frank E, Hall M. Data Mining: Practical Machine Learning Tools and Techniques. 3rd ed: Morgan Kaufmann Publishers; 2011. 629 p.
16. Vasumathi V, Maiti PK. Complexation of siRNA with Dendrimer: A Molecular Modeling Approach. Macromolecules. 2010;43:8264-74.
17. Karatasos K, Posocco P, Laurini E, Pricl S. Poly(amidoamine)-based dendrimer/siRNA complexation studied by computer simulations: effects of pH and generation on dendrimer structure and siRNA binding. Macromolecular bioscience. 2012;12(2):225-40. Epub 2011/12/08.
Acknowledgements • Morgan Family • National Library of Medicine Training Grant • Department of Biomedical Informatics at the University of Utah • Ph.D. Committee • Julio C. Facelli, Ph.D. • Hamidreza S. Ghandehari, Ph.D. • John F. Hurdle, M.D., Ph.D. • Karen Eilbeck, Ph.D. • Bruce E. Bray, M.D.