
A new collaborative scientific initiative at Harvard.


Presentation Transcript


  1. A new collaborative scientific initiative at Harvard.

  2. One-Slide IIC
  • Proposal-driven, from within Harvard
  • “Projects” focus on areas where computers are key to new science, with widely applicable results
  • Technical-focus “Branches”: Instrumentation; Databases & Provenance; Analysis & Simulations; Visualization; Distributed Computing (e.g. GRID, Semantic Web)
  • Matrix organization: “Projects” by “Branches”
  • Education: train future consumers & producers of computational science
  • Goal: fill the void in, highly value, and learn from, the emerging field of “computational science.”

  3. “Astronomical Medicine” A joint venture of FAS-Astronomy & HMS/BWH-Surgical Planning Lab; Work shown here is from the 2005 Junior Thesis of Michelle Borkin, Harvard College.

  4. Filling the “Gap” between Science and Computer Science
  • Increasingly, core problems in science require computational solutions.
  • Scientific disciplines typically hire or “home grow” computationalists, but often lack the expertise or funding to go beyond the immediate pressing need.
  • Computer Science departments focus on finding elegant solutions to basic computer-science challenges, and often see specific, “applied” problems as outside their interests.

  5. “Workflow” & “Continuum”

  6. Workflow

  7. IIC branches address shared “workflow” challenges. Challenges common to data-intensive science, and the IIC branch addressing each:
  • Data acquisition → Instrumentation
  • Data processing, storage, and access → Databases/Provenance
  • Deriving meaningful insight from large datasets → Analysis & Simulations
  • Maximizing understanding through visual representation → Visualization
  • Sharing knowledge and computing resources across geographically dispersed researchers → Distributed Computing

  8. Continuum: “Pure” Computer Science (e.g. Turing) ↔ “Computational Science” (missing at most universities) ↔ “Pure” Discipline Science (e.g. Einstein)

  9. IIC Organization: Research and Education
  • Oversight: Provost; Dean, Physical Sciences; Assoc Provost
  • IIC Director, supported by a Dir of Admin & Operations, a CIO (systems), and knowledge mgmt
  • Dir of Research, with Assoc Dirs for Instrumentation, Visualization, Databases/Data Provenance, Distributed Computing, and Analysis & Simulation
  • Dir of Education & Outreach, with Education & Outreach staff
  • Project 1 (Proj Mgr 1), Project 2 (Proj Mgr 2), Project 3 (Proj Mgr 3), etc., cutting across the research branches

  10. COMPLETE/IRAS N_dust map of Barnard’s Perseus region

  11. IRAS N_dust; 2MASS/NICER extinction; H-alpha emission, WHAM/SHASSA surveys (see Finkbeiner 2003)

  12. Numerical Simulation of Star Formation
  • MHD turbulence gives “t=0” conditions; Jeans mass = 1 Msun
  • 50 Msun, 0.38 pc, n_avg = 3 x 10^5 ptcls/cc
  • forms ~50 objects
  • T = 10 K
  • SPH, no B or L, G
  • movie = 1.4 free-fall times
  Bate, Bonnell & Bromm 2002 (UKAFF)
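To get a feel for the timescale behind “1.4 free-fall times,” the cloud’s free-fall time can be estimated from the quoted mass and size. The sketch below is an order-of-magnitude illustration only: it assumes 0.38 pc is the cloud diameter and a uniform sphere, and is not taken from the simulation paper itself.

```python
import math

# Rough free-fall-time estimate from the slide's quoted cloud mass and size.
# Assumption: 0.38 pc is the cloud *diameter* and the sphere is uniform.
G     = 6.674e-8              # cm^3 g^-1 s^-2
M_SUN = 1.989e33              # g
PC    = 3.086e18              # cm

M   = 50 * M_SUN                                 # cloud mass (from the slide)
R   = 0.38 * PC / 2.0                            # radius, assuming 0.38 pc is the diameter
rho = M / (4.0 / 3.0 * math.pi * R**3)           # mean mass density, g/cm^3
t_ff = math.sqrt(3.0 * math.pi / (32.0 * G * rho))  # free-fall time, s

print(f"t_ff ~ {t_ff / 3.156e7:.2e} yr")         # ~2e5 yr; the movie spans ~1.4 t_ff
```

Under these assumptions the movie covers a few hundred thousand years of cloud evolution.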

  13. Goal: Statistical Comparison of “Real” and “Synthesized” Star Formation. Figure based on work of Padoan, Nordlund, Juvela, et al.; excerpt from a realization used in Padoan & Goodman 2002.
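The kind of “statistical comparison” meant here amounts to computing the same summary statistics on an observed map and a simulated one and checking whether they agree. The sketch below is a generic illustration: the arrays are random placeholders and the chosen statistics are examples, not the actual method of Padoan & Goodman 2002.

```python
import numpy as np

# Generic sketch: compare simple statistics of an observed ("real") map with a
# synthesized one. The maps here are random placeholders, not survey data.
rng = np.random.default_rng(0)
real_map  = rng.lognormal(mean=0.0, sigma=0.5, size=(256, 256))
synth_map = rng.lognormal(mean=0.0, sigma=0.6, size=(256, 256))

def summary(m):
    """A few example statistics; real comparisons would use more discriminating ones."""
    logm = np.log(m)
    return {"mean": float(m.mean()),
            "std/mean": float(m.std() / m.mean()),
            "log-skew": float(((logm - logm.mean())**3).mean() / logm.std()**3)}

print("real :", summary(real_map))
print("synth:", summary(synth_map))
```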

  14. Measuring Motions: Molecular Line Maps

  15. Radio Spectral-Line Observations of Interstellar Clouds (radio spectral-line survey; Alves, Lada & Lada 1999)

  16. Velocity from Spectroscopy: telescope → spectrometer → observed spectrum (intensity vs. “velocity”), all thanks to Doppler.
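For reference, the Doppler relation behind that “velocity” axis converts an observed line frequency into a line-of-sight velocity. A minimal sketch follows; the non-relativistic radio convention is assumed, and the 13CO (J=1–0) rest frequency is quoted from standard line lists rather than from the slide.

```python
# Radio-convention Doppler velocity: v = c * (nu_rest - nu_obs) / nu_rest.
C_KM_S = 299792.458           # speed of light, km/s
NU_REST_13CO = 110.201354e9   # 13CO (J=1-0) rest frequency, Hz (assumed standard value)

def velocity_km_s(nu_obs_hz, nu_rest_hz=NU_REST_13CO):
    """Line-of-sight velocity (km/s) from an observed line frequency (Hz)."""
    return C_KM_S * (nu_rest_hz - nu_obs_hz) / nu_rest_hz

# An observed frequency ~2.57 MHz below the rest value corresponds to ~+7 km/s:
print(velocity_km_s(110.201354e9 - 2.573e6))
```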

  17. COMPLETE/FCRAO W(13CO) map of Barnard’s Perseus region

  18. “Astronomical Medicine” Excerpts from Junior Thesis of Michelle Borkin (Harvard College); IIC Contacts: AG (FAS) & Michael Halle (HMS/BWH/SPL)

  19. IC 348

  20. “Astronomical Medicine”

  21. “Astronomical Medicine”

  22. “Astronomical Medicine”: before “medical treatment” / after “medical treatment”

  23. 3D Slicer Demo IIC contacts: Michael Halle & Ron Kikinis

  24. IIC Research Branches
  • Visualization: physically meaningful combination of diverse data types.
  • Distributed Computing: e-Science aspects of large collaborations; sharing of data, computational resources, and tools in real time.
  • Databases/Provenance: management, and rapid retrieval, of data; “research reproducibility” … where did the data come from? How?
  • Analysis & Simulations: development of efficient algorithms; cross-disciplinary comparative tools (e.g. statistical).
  • Instrumentation: improved data acquisition; novel hardware approaches (e.g. GPUs, sensors).
  IIC projects will bring together IIC experts from the relevant branches with discipline scientists to address a pressing computing challenge facing the discipline that has broad application.
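The provenance question above (“where did the data come from? How?”) is usually answered by attaching a provenance record to each derived data product. A minimal sketch of what such a record might hold is shown below; the field names are illustrative only, not an IIC or survey schema.

```python
import datetime
import hashlib
import json

# Minimal illustration of a provenance record for a derived data product.
# Field names are placeholders, not a real IIC schema.
def provenance_record(inputs, software, params, raw_bytes):
    return {
        "created":  datetime.datetime.now(datetime.timezone.utc).isoformat(),
        "inputs":   inputs,                                  # upstream files / datasets
        "software": software,                                # code and version used
        "params":   params,                                  # processing parameters
        "checksum": hashlib.sha256(raw_bytes).hexdigest(),   # identity of the output
    }

rec = provenance_record(
    inputs=["survey_cube_raw.fits"],                         # hypothetical input name
    software={"pipeline": "example-reduce", "version": "0.1"},
    params={"smoothing_fwhm_arcsec": 46},
    raw_bytes=b"...derived map bytes...",
)
print(json.dumps(rec, indent=2))
```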

  25. 3D Slicer

  26. Distributed Computing & Large Databases: Large Synoptic Survey Telescope
  • Optimized for the time domain: scan mode, deep mode
  • 7-square-degree field; 6.5 m effective aperture
  • 24th mag in 20 sec
  • > 5 TB/night; real-time analysis
  • Simultaneous multiple science goals
  IIC contact: Christopher Stubbs (FAS)

  27. Relative optical survey power, based on AΩ (= 270 for the LSST design)

  28. Astronomy vs. High Energy Physics: data and computing scales
  Projects: LSST, SDSS, 2MASS, MACHO, DLS (astronomy); BaBar, Atlas, RHIC (high energy physics)
  • First year of operation: LSST 2011; SDSS 1998; 2MASS 2001; MACHO 1992; DLS 1999; BaBar 1998; Atlas 2007; RHIC 1999
  • Run-time data rate to storage (MB/sec): LSST 5000 peak / 500 avg; SDSS 8.3; 2MASS 1; MACHO 1; DLS 2.7; BaBar 60 (zero-suppressed), 6*; Atlas 540*; RHIC 120* (’03), 250* (’04)
  • Daily average data rate (TB/day): LSST 20; SDSS 0.02; 2MASS 0.016; MACHO 0.008; DLS 0.012; BaBar 0.6; Atlas 60.0; RHIC 3 (’03), 10 (’04)
  • Annual data store (TB): LSST 2000; SDSS 3.6; 2MASS 6; MACHO 1; DLS 0.25; BaBar 300; Atlas 7000; RHIC 200 (’03), 500 (’04)
  • Total data store capacity (TB): LSST 20,000 (10 yrs); SDSS 200; 2MASS 24.5; MACHO 8; DLS 2; BaBar 10,000; Atlas 100,000 (10 yrs); RHIC 10,000 (10 yrs)
  • Peak computational load (GFLOPS): LSST 140,000; SDSS 100; 2MASS 11; MACHO 1.00; DLS 0.600; BaBar 2,000; Atlas 100,000; RHIC 3,000
  • Average computational load (GFLOPS): LSST 140,000; SDSS 10; 2MASS 2; MACHO 0.700; DLS 0.030; BaBar 2,000; Atlas 100,000; RHIC 3,000
  • Acceptable data release delay: LSST 1 day (moving), 3 months (static); SDSS 2 months; 2MASS 6 months; MACHO 1 year; DLS 6 hrs (transient), 1 yr (static); BaBar 1 day (max), <1 hr (typ); Atlas few days; RHIC 100 days
  • Real-time alert of event: LSST 30 sec; SDSS none; 2MASS none; MACHO <1 hour; DLS 1 hr; BaBar none; Atlas none; RHIC none
  • Type/number of processors: LSST TBD; SDSS 1 GHz Xeon / 18; 2MASS 450 MHz Sparc / 28; MACHO 60-70 MHz Sparc / 10; DLS 500 MHz Pentium / 5; BaBar mixed / 5000; Atlas 20 GHz / 10,000; RHIC Pentium / 2500
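As a quick consistency check on the LSST figures above, the quoted 500 MB/sec average run-time rate and a plausible night length give a nightly data volume comfortably above the “> 5 TB/night” quoted earlier. The 8-hour observing night below is an assumed round number, not a quoted spec.

```python
# Back-of-envelope check of the LSST data-rate figures quoted above.
avg_rate_mb_s = 500        # average run-time data rate to storage (MB/s), from the table
night_hours   = 8          # assumed usable observing hours per night (not a quoted spec)

nightly_tb = avg_rate_mb_s * night_hours * 3600 / 1e6   # MB -> TB (decimal)
print(f"~{nightly_tb:.0f} TB per night")                 # ~14 TB, consistent with > 5 TB/night
```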

  29. Challenges at the LHC. For each experiment (4 total):
  • 10s of petabytes/year of data logged
  • 2000+ collaborators, 40 countries, 160 institutions (universities, national laboratories)
  • CPU intensive
  • Global distribution of data
  • Tested with “Data Challenges”

  30. CPU vs. Collaboration Size (plot comparing the Earth Simulator, LHC experiments, current accelerator experiments, gravitational-wave and nuclear experiments, astronomy, and an atmospheric chemistry group)

  31. Data Handling and Computation for Physics Analysis: detector → CERN event filter (selection & reconstruction) → raw data → processed data / event summary data; event reprocessing; event simulation; batch physics analysis → analysis objects (extracted by physics topic) → interactive physics analysis. [les.robertson@cern.ch]
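The stages in that data-flow diagram can be read as a simple pipeline from detector output to interactive analysis. The sketch below just strings the named stages together as placeholder functions to make the ordering explicit; nothing here is actual CERN or experiment software.

```python
# Placeholder pipeline mirroring the stages named on the slide; the functions
# are stubs for illustration, not real physics software.
def event_filter(detector_stream):
    # selection & reconstruction at CERN -> raw data
    return {"raw_data": detector_stream}

def reprocess(raw):
    # event reprocessing -> event summary data
    return {"event_summary_data": raw["raw_data"]}

def batch_analysis(esd):
    # batch physics analysis -> analysis objects, extracted by physics topic
    return {"analysis_objects": f"objects extracted from {esd['event_summary_data']}"}

def interactive_analysis(objects):
    # interactive physics analysis on the extracted objects
    print("analyzing:", objects["analysis_objects"])

interactive_analysis(batch_analysis(reprocess(event_filter("detector events"))))
```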

  32. Workflow, a.k.a. The Scientific Method (in the Age of High-Speed Networks, Fast Processors, Mass Storage, and Miniature Devices). IIC contact: Matt Welsh, FAS
