1 / 12

Data Management Challenges in the Human Brain Project

Data Management Challenges in the Human Brain Project. Thomas Heinis Data-Intensive Applications & Systems Lab Ecole Polytechnique Federale de Lausanne. Human Brain Project. Goal: develop platform to simulate human brain!

leda
Download Presentation

Data Management Challenges in the Human Brain Project

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Data Management Challenges in the Human Brain Project Thomas Heinis Data-Intensive Applications & Systems Lab EcolePolytechniqueFederale de Lausanne

  2. Human Brain Project • Goal: develop platform to simulate human brain! • 10 year and 1 billion Euro project to integrate all available knowledge/data Single Neuron 3D Model • Involves coordination of more than 200 interdisciplinary partners • Lead by Henry Markram • Based on EPFL’s Blue Brain Project Neocortex Model

  3. Human Brain Project Workflow Literature Research Analysis Experimental Observations Model Building Medical Records/Data Simulation Visualization

  4. Data Deluge • Medical Data from hundreds of sources • Model Data alone: • Simulation Trace: potentially infinite Present Goal Model Data Fine grained (3D Mesh) # of Neurons Model Data Coarse grained 270GB 5.8TB 1M 86 Billion Full Brain 27PB 0.5EB

  5. HBP Data Management Challenges • Querying of Petascale Data • Data Privacy/Anonymization • Cloud Analytics • Quick & efficient access to raw data • Distributed Workflow Execution • Provenance/Reproducibility • Data Personalization

  6. Spatial Indexing 3D Spatial Range Query 3D Spatial Range Query 2010 Applications: Bottleneck in Spatial Analysis Model Building Analysis 2008 2006

  7. Overlap Problem R-Trees: Hierarchy of Minimum Bounding Rectangles (MBR) STEP 2: Query Execution STEP 1: Index Construction Range Query Overlap c Overlap reduces performance, increases with level of detail

  8. “FLAT”, A Two Phase Spatial Index* Add reachability information during index construction c 1) SEEDING: Find any one object arbitrarily inside the query region. 2) CRAWLING: Retrieve remaining results by traversing the neighbors. *joint work with Farhan Tauheed, Laurynas Biveinis and Anastasia Ailamaki

  9. Performance Evaluation

  10. Impact Blue Brain Project: Part of the toolset used every day February 2012: first 1 million neuron model indexed Still 5 orders of magnitude smaller than human brain General Applicability: 3D surface mesh models N-Body simulation data set 2012 (270 GB) 2010 2008 2006

  11. HBP Data Management Challenges • Querying of Petascale Data • Data Privacy/Anonymization • Cloud Analytics • Quick & efficient access to raw data • Distributed Workflow Execution • Provenance/Reproducibility • Data Personalization

  12. Thank you • Data Management in the Human Brain Project • Thomas Heinis

More Related