The Elucidation of Regulatory Networks in Complex Biological Systems: The Convergence of Biology, Medicine and Computing. G. Poste Stanford University, 15 March 2002 gposte@healthtechnetwork.com. biology. computing. genomics. The Analysis and Application of Principles of Biological Design.
The Elucidation of Regulatory Networks in Complex Biological Systems: The Convergence of Biology, Medicine and Computing G. Poste Stanford University, 15 March 2002 gposte@healthtechnetwork.com
biology computing genomics The Analysis and Application of Principles of Biological Design • the descriptive narrative biology • empirical technology chemistry 1750-1980 • mechanistic reductionism 1980-2010 • mapping the basis of biological variation • the encoded information content of biological systems • rational medicine and customized care systems biology
Biology and Medicine as Information-Based Sciences From Reductionism to Integrated Systems Biology • individual genes and proteins → biological circuits, pathways and networks • molecular interactions in simple systems → assembly of higher order systems • limited, fragmented datasets → massive, integrated datasets • poor annotation → stringent, standardised annotation • limited capacity for predictive simulation → robust algorithms for predictive biology; biology in silico • analog information → digital information
21st Century Biology and Medicine • “SYSTEMS BIOLOGY” • the design principles of biological order and complexity • mapping the information content of biopathways and networks Biotechnology And Systems Biology New Analytical Capabilities Large Scale Computing • “BIG BIOLOGY” • interdisciplinary, massive datasets, information-based • infrastructure, investment and education
Convergence: The Technological Platforms Shaping the Evolution of Healthcare Automation Engineering and Robotics Rule-Based Design Principles Biotechnology And Systems Biology New Analytical Capabilities Computational Biology Materials Science Exploring “Biospace” Micro-/Opto- Electronics Large Scale Computing
From Reductionism to Integrated Systems Biology • understanding the information content encoded in biological networks • mapping the design rules for progressively greater complexity of biological order gene(s) pathways, circuits and networks progressively ordered assemblies: organelles, cells, tissues organs homeostatic integration of myriad, complex, interactive networks (Physiology)
High Level Abstraction of Biological Pathways and Network Systems Encoded Information Pathways and Networks Rule Sets Plasticity • adaptive fitness • pathological perturbation Predictive Biology • directed evolution • biology in silico Novel Biospace and Carbon : Silicon Union
Global and Nodal Pathway Map of Genomic and Proteomic Elements in Yeast Galactose Utilization From: T. Ideker et al. 2001. Science 292, 929
Genetic Networks • bioinformation processing involves leverage of interactive feedback loops in diverse domains • physical, chemical, electrical • genomic and proteomic codes represent a dense network of nested hyperlinks • matter becomes code
Nonlinear Complexity in Biological Systems • distinct classes of nonlinear interactions • long-range (fractal) correlations • self-similarity, self-dissimilarity and organized criticality • pattern formation • complex adaptive networks • highly optimized tolerance = robustness with fragility • barriers to cascading failures • deterministic chaos • emergent properties
Nonlinear Complexity in Biological Systems • abrupt changes • bifurcations; intermittency/bursting; bistability/multistability; phase transitions • nonlinear oscillations • limit cycles; phase-resetting; entrainment • nonlinear waves • spirals; scrolls; solitons • complex periodic cycles and quasiperiodicities • scale invariance • fractal and multifractal scaling; long-range correlations; self-organized criticality • stochastic resonance and related noise-modulated mechanisms • time irreversibility
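Several items on this slide (bifurcations, limit cycles, deterministic chaos) can be seen in a one-line model. The sketch below uses the logistic map x(n+1) = r·x(n)·(1 − x(n)), a standard textbook example that does not appear in the slides; the function name `logistic_attractor` and all parameter values are assumptions chosen for illustration.

```python
def logistic_attractor(r, x0=0.2, transient=2000, samples=64, digits=6):
    """Return the distinct long-run values of the logistic map for growth rate r.

    Illustrative only: the map and every default value here are assumptions,
    not something taken from the talk.
    """
    x = x0
    # Discard the transient so only the long-run behaviour is sampled.
    for _ in range(transient):
        x = r * x * (1.0 - x)
    seen = set()
    for _ in range(samples):
        x = r * x * (1.0 - x)
        # Rounding merges values that differ only by floating-point noise.
        seen.add(round(x, digits))
    return sorted(seen)


if __name__ == "__main__":
    for r in (2.8, 3.2, 3.5, 3.9):
        values = logistic_attractor(r)
        print(f"r = {r}: {len(values)} distinct long-run value(s)")
```

As r increases, the single steady state splits into a 2-cycle and then a 4-cycle (successive bifurcations) before the orbit stops repeating at all, which is the abrupt, parameter-driven change in behaviour the slide lists.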
Information and Technology Platform Overload
Principal Themes in the Analysis of Biological Systems • large scale • miniaturization • automation • parallelism • networked systems • real time, interactive, adaptive
Major Technology Gaps • rapid gene ID in complex genomes • structural genomics and protein structure-function prediction • mapping the proteome • abundance, modification, localisation and protein-protein interactions • large scale parallelism (protein-arrays) • small organic molecule networks • mapping the metabolome • circuits, modules, networks • robust predictive algorithms for ADMET profiling of drug candidate SAR
The Need for Standards and Stringent Semantics “... without which ….. wanton and luxuriant fancies climbing up into the Bed of Reason, do not only defile it by unchaste and illegitimate embraces, but instead of real conceptions and notices of things do impregnate the mind with nothing but Ayerie and Subventaneous Phantasmes” Samuel Parker, FRS 1666
standards standards STANDARDS
The Analysis and Comprehension of Biological Systems initial mechanistic insights descriptive ignorance burgeoning, bewildering complexity • elucidation of patterns • defining rule sets • elegant simplicity revealed • predictive biology complexity • right Rx : right disease • right Rx : right patient • from reactive treatment to proactive prevention • disease heterogeneity • patient heterogeneity • disease predisposition defined rule sets
molecular phylogenies and genealogy biological order Integrated Distributed Heterogeneous Databases and Databanks chemical SAR population genetics clinical databanks
data warehousing and data mining object-oriented and pattern / spatial array recognition Expert Systems and Knowledge Management evolving hardware and electronic evolution human- computer interface systems
Convergence, Consilience, Cognition and Computing • more science • better science • faster science • cross-disciplinary science • interdisciplinary convergence • technological convergence • corporate convergence MEGADATA
The Scalability Crisis Volume: • burgeoning data volumes • more transactions • increasing diversity of datasets/apps • expanding user communities Performance: • pressures on network bandwidth • complexity of distributed environments • rising performance expectations • confidentiality and privacy
Major Challenges for Life Sciences Computing • exponentially growing data repositories (10² TB/PB) • highly variable data formats and standards as obstacles to data access and mining • inadequate attention to data Q.C./annotation standards • excessive reliance on customized solutions and fragmented data sources • inadequate access and integration of public and private datasets • primitive data visualization tools • 80% of time spent on data preparation tasks and 20% on productive exploration
Major Challenges for Life Sciences Computing Big Biology • infrastructure scale and capital investment • new tools for mining, visualization, simulation • data storage conventions and technologies • dynamic, adaptive, scalable systems • active networks • software into the network • subnet interoperability • integration of distributed and collaborative working environments • fast data access at all levels • storage, I/O and networks to support analysis and simulation • expanded bandwidth for high usage and high transfer rates
Bracing For the Inevitable : Petabyte-Size Databases • 1000 terabytes • 250 billion text pages • 20 million four drawer filing cabinets • 2000 mile high tower of 1 billion diskettes • typical US consumer generates 100 Gbytes personal data/lifetime • education, insurance, credit, medical • 100 million consumers = 10,000 petabytes
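The scale figures on this slide can be checked with a few lines of arithmetic. The sketch below assumes decimal (SI) units, since the slide does not say whether a petabyte means 10^15 or 2^50 bytes; the variable names are illustrative.

```python
# Decimal (SI) byte units; this is an assumption, the slide does not specify.
GB = 10**9
TB = 10**12
PB = 10**15

# "1000 terabytes" per petabyte.
petabyte_in_terabytes = PB // TB

# "250 billion text pages" per petabyte implies this many bytes per page.
bytes_per_text_page = PB // (250 * 10**9)

# "100 Gbytes personal data/lifetime" times "100 million consumers".
per_consumer_bytes = 100 * GB
consumers = 100 * 10**6
total_petabytes = (per_consumer_bytes * consumers) // PB

if __name__ == "__main__":
    print("terabytes per petabyte:", petabyte_in_terabytes)
    print("implied bytes per text page:", bytes_per_text_page)
    print("petabytes for 100 million consumers:", total_petabytes)
```

Under these assumptions the slide's totals are internally consistent: 100 million consumers at 100 GB each comes to 10,000 petabytes, and the page count implies roughly 4 KB of text per page.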
Data Grids • from Napster and Gnutella to • ubiquitous peer-to-peer exchange of data sets to • apportioned distributed computing for solutions of computationally massive problems
Informatics for Big Biology and e.Health Networks • instructive precedents in high end computing from other disciplines • cosmology, quantum chromodynamics, climate research, materials Europe USA • UNICORE • Pangea • E-Science • LHC Challenge • E-Grid • Scientific Simulation Initiative • National Computational Science Alliance • Long Term Ecological Research • NASA, DOE, NOAA • Accelerated Strategic Computing Initiative • Grid Physics Network
The Bibliome The Global Virtual Archive/ Universal Knowledge Web Proof, logic and ontology languages • shared terms/terminology • machine-machine communication • inter-memetic translation • self-evolving translators • Metadata tagging standards for interoperable distributed archives • self-assembling datasets • self-describing documents • Resource Description Framework • eXtensible Markup Language Metadata • The first generation Web • HyperText Markup Language • HyperText Transfer Protocol WWW I Modified from: T. Berners-Lee and J. Hendler, Nature 2001, 410, 1023
Standardized Lexical Foundations for the Annotation, Archiving and Analysis of Complex Biological Systems • unique complexity of biological systems • multiple levels of abstraction • organismal • ecosystem dynamics • social/memetic networks • qualitative not quantitative data • diversity of experimental conditions • inaccessibility/replication of experimental conditions • upgrading to hybrid qualitative/quantitative analysis tools
Standardized Lexical Foundations for the Annotation, Archiving and Analysis of Complex Biological Systems • entity classes : finite elements • action properties : state properties • intramolecular site interactions • intermolecular site interactions • massively parallel networks : unit modules • continuum systems • compartments • economy and parsimony • evolutionary relationships • network pathways • redundancy (degeneracy), pleiotropy • complex emergent properties • submodels for searchable characteristics of functional knowledge • integration of submodels into web-based distributed model networks
Jabberwocky “ ’Twas brillig and the slithy toves Did gyre and gimble in the wabe; All mimsy were the borogoves And the mome raths outgrabe” Lewis Carroll
The Divide Between Syntax and Semantics “Colorless green ideas sleep furiously” Noam Chomsky (1957) • syntactically valid • semantically void
The Divide Between Syntax and Semantics • “Colorless green ideas sleep furiously” • Noam Chomsky (1957) • encoded genome structure (syntax) and diverse expression repertoires (semantics) • alternative splicing • overlapping reading frames • nonsense mutations • differential modulation by different transcription factors • database formats (syntax) and ontology (semantics)
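The genome-as-syntax point can be illustrated by reading one DNA string in more than one way. The sketch below is a minimal illustration, not a method from the talk: the 15-base sequence is invented for the example, the helper `translate` is hypothetical, and the codon table is the standard genetic code written in the conventional TCAG ordering.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order; "*" marks a stop codon.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate(dna, frame=0):
    """Translate dna from the given reading frame, stopping at the first stop codon."""
    peptide = []
    for i in range(frame, len(dna) - 2, 3):
        amino = CODON_TABLE[dna[i:i + 3]]
        if amino == "*":
            break
        peptide.append(amino)
    return "".join(peptide)


# Invented example sequence (an assumption, not real genomic data).
dna = "ATGGCCTGGAAATAG"
# A single-base change turns the codon TGG into the stop codon TGA (nonsense mutation).
mutant = dna[:8] + "A" + dna[9:]

if __name__ == "__main__":
    print("frame 0:", translate(dna, 0))
    print("frame 1:", translate(dna, 1))
    print("nonsense mutant, frame 0:", translate(mutant, 0))
```

The string itself (syntax) barely changes across the three calls, yet the product (semantics) differs with the reading frame and is cut short by a one-base substitution.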
The Conceptual Complexity of Ontology Design • ontology • set of axioms in a logical language • representational vocabulary with precise definitions of shared understanding • axioms constrain interpretation of defined terms • XML versus ontology and evolution of the semantic web • XML less complex since semantics are not represented • objective to reduce uncertainty favors ontologies • objective to reduce complexity favors XML
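The contrast between XML and an ontology can be sketched in a few lines. Everything below is a toy assumption for illustration: the XML record, the class names, and the helpers `ancestors`, `is_a` and `consistent` are hypothetical and do not follow any real ontology language or published vocabulary.

```python
import xml.etree.ElementTree as ET

# Well-formed XML: the parser accepts it, but nothing states what "tf" means.
record = ET.fromstring('<entity class="tf"><name>GAL4</name></entity>')

# Hypothetical mini-ontology: subclass axioms plus one disjointness axiom.
SUBCLASS_OF = {"tf": "protein", "protein": "gene_product", "gene": "dna_region"}
DISJOINT = {frozenset({"gene_product", "dna_region"})}


def ancestors(term):
    """Return term followed by every class it falls under via the subclass axioms."""
    chain = [term]
    while chain[-1] in SUBCLASS_OF:
        chain.append(SUBCLASS_OF[chain[-1]])
    return chain


def is_a(term, cls):
    """True if the axioms entail that term is a kind of cls."""
    return cls in ancestors(term)


def consistent(classes):
    """False if the asserted classes fall under two classes declared disjoint."""
    reached = set()
    for c in classes:
        reached.update(ancestors(c))
    return not any(pair <= reached for pair in DISJOINT)


if __name__ == "__main__":
    cls = record.get("class")
    print("XML parses; class attribute:", cls)
    print("entailed to be a gene product:", is_a(cls, "gene_product"))
    print("tagging one entity as both tf and gene is consistent:",
          consistent(["tf", "gene"]))
```

The XML layer only guarantees structure. The axioms add the extra machinery that lets software infer that a "tf" is a gene product and reject a contradictory annotation, which is the complexity-versus-uncertainty trade-off the slide describes.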
Convergence, Consilience, Cognition and Computing scientific, technological and economic convergence data complexity data scale data diversity optimized data representation optimized data comprehension optimized data utilization • adaptive IT • novel emergent networks • ‘mind in the loop’ computing • modulation of brain function for optimum perceptualization • novel visualization and mining tools • human medicine interfaces
Bounded Rationality • human mind’s processing capacity is small relative to the size of the problems requiring analysis/comprehension (Simon) • objective solutions require complexity reduction in information, task and coordination • complexity reduction • omission and abstraction • division of labor (systems decomposition) • complexity reduction simultaneously increases uncertainty (Fox) • implications for evolution of ontologies for the semantic web
Enhancing Human Cognitive Capacities for Optimizing Information Utilization • escalating quantities and types of information • real time decision making • new multi-modal, multi-sensory high performance human : information interfaces • representation and comprehensibility of information flows • optimize information representation (perception) • modulation of brain function to optimize comprehension • systemic application of advances in cognitive neurobiology
Enhancing Human Cognitive Capacities for Optimizing Information Utilization • optimizing representations of information • perceptualization • optimizing cognitive capacities • states of the brain affect states of mind (perception and cognition) • perceptual modulation techniques
Interdisciplinary Linguistics : Memetic Engineering • molspeak, medspeak, nerdspeak • standardization coding • speech recognition • object-oriented computing • synthetic intelligence
Molecular Medicine, Population Segmentation and Targeted Patient Care
Population Genetics large-scale population genetics geno-phenotype correlations in subpopulations ‘at-risk’ subpopulations individual risk profiling
population genetics gene-disease associations haplotype blocks SNP maps ethics Linking Clinical Outcomes to Genetic Variation • dbases • informatics low cost high-throughput genotyping
Large-Scale Disease Association Genetics and Disease Predisposition Risk Profiling • formidable logistics and cost • robust algorithms for combinatorial gene interactions • slow evolution • complex ethical, legal and social issues • public acceptance and legislative controls • evidentiary standards and regulation