A new collaborative scientific initiative at Harvard.
One-Slide IIC
Proposal-driven, from within Harvard
“Projects” focus on areas where computers are key to new science; widely applicable results
Technical focus “Branches”:
• Instrumentation
• Databases & Provenance
• Analysis & Simulations
• Visualization
• Distributed Computing (e.g. GRID, Semantic Web)
Matrix organization: “Projects” by “Branches”
Education: Train Future Consumers & Producers of Computational Science
Goal: Fill the void in, highly value, and learn from, the emerging field of “computational science.”
“Astronomical Medicine” A joint venture of FAS-Astronomy & HMS/BWH-Surgical Planning Lab; Work shown here is from the 2005 Junior Thesis of Michelle Borkin, Harvard College.
Filling the “Gap” between Science and Computer Science
Scientific disciplines:
• Increasingly, core problems in science require computational solution
• Typically hire/“home grow” computationalists, but often lack the expertise or funding to go beyond the immediate pressing need
Computer Science departments:
• Focused on finding elegant solutions to basic computer science challenges
• Often see specific, “applied” problems as outside their interests
IIC branches address shared “workflow” challenges
Challenges common to data-intensive science:
• Data acquisition
• Data processing, storage, and access
• Deriving meaningful insight from large datasets
• Maximizing understanding through visual representation
• Sharing knowledge and computing resources across geographically dispersed researchers
IIC branches:
• Instrumentation
• Databases/Provenance
• Analysis & Simulations
• Visualization
• Distributed Computing
Continuum: “Pure” Discipline Science (e.g. Einstein) ... “Computational Science” (Missing at Most Universities) ... “Pure” Computer Science (e.g. Turing)
IIC Organization: Research and Education
[Organization chart; boxes as labeled on the slide]
Provost
Dean, Physical Sciences
Assoc Provost
IIC Director
Dir of Admin & Operations
Dir of Research
Assoc Dir, Instrumentation
Assoc Dir, Visualization
Assoc Dir, Databases/Data Provenance
Assoc Dir, Distributed Computing
Assoc Dir, Analysis & Simulation
Dir of Education & Outreach
Education & Outreach staff
Project 1 (Proj Mgr 1), Project 2 (Proj Mgr 2), Project 3 (Proj Mgr 3), etc.
CIO (systems)
Knowledge mgmt
COMPLETE/IRAS Ndust Barnard’s Perseus
IRAS Ndust; 2MASS/NICER Extinction; Hα emission, WHAM/SHASSA Surveys (see Finkbeiner 2003)
Numerical Simulation of Star Formation
• MHD turbulence gives “t=0” conditions; Jeans mass = 1 Msun
• 50 Msun, 0.38 pc, navg = 3 x 10^5 ptcls/cc
• forms ~50 objects
• T = 10 K
• SPH, no B or L, G
• movie = 1.4 free-fall times
Bate, Bonnell & Bromm 2002 (UKAFF)
Figure based on work of Padoan, Nordlund, Juvela, et al. Excerpt from realization used in Padoan & Goodman 2002.
Goal: Statistical Comparison of “Real” and “Synthesized” Star Formation
Radio Spectral-line Observations of Interstellar Clouds Radio Spectral-Line Survey Alves, Lada & Lada 1999
Velocity from Spectroscopy
Telescope, Spectrometer, Observed Spectrum
[Figure: observed spectrum, Intensity vs. "Velocity"]
All thanks to Doppler
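The step from a spectrometer channel to a velocity can be sketched in a few lines. This is a minimal illustration, not the procedure used for the survey shown here: it assumes the radio Doppler convention, v = c (f_rest - f_obs) / f_rest, and uses the 13CO J=1-0 rest frequency as an example line. The function name and the sample observed frequency are hypothetical.

```python
# Speed of light in km/s (exact by definition of the metre).
C_KM_S = 299_792.458

# Rest frequency of the 13CO J=1-0 line in GHz, chosen only as an example.
REST_13CO_GHZ = 110.201354


def radio_velocity_km_s(observed_ghz: float, rest_ghz: float) -> float:
    """Line-of-sight velocity in the radio convention.

    Positive values mean the emitter is receding (line shifted to lower
    frequency); negative values mean it is approaching.
    """
    if rest_ghz <= 0:
        raise ValueError("rest frequency must be positive")
    return C_KM_S * (rest_ghz - observed_ghz) / rest_ghz


# Hypothetical observed frequency: a line shifted down by 10 km/s.
observed = REST_13CO_GHZ * (1 - 10.0 / C_KM_S)
velocity = radio_velocity_km_s(observed, REST_13CO_GHZ)
print(f"{velocity:.3f} km/s")
```

Applying the same function to every channel of a spectrum turns the frequency axis into a velocity axis.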
COMPLETE/FCRAO W(13CO) Barnard’s Perseus
“Astronomical Medicine” Excerpts from Junior Thesis of Michelle Borkin (Harvard College); IIC Contacts: AG (FAS) & Michael Halle (HMS/BWH/SPL)
IC 348 IC 348
“Astronomical Medicine” Before “Medical Treatment” After “Medical Treatment”
3D Slicer Demo IIC contacts: Michael Halle & Ron Kikinis
IIC Research Branches
• Visualization: Physically meaningful combination of diverse data types.
• Distributed Computing: e-Science aspects of large collaborations. Sharing of data and computational resources and tools in real-time.
• Databases/Provenance: Management, and rapid retrieval, of data. “Research reproducibility” …where did the data come from? How?
• Analysis & Simulations: Development of efficient algorithms. Cross-disciplinary comparative tools (e.g. statistical).
• Instrumentation: Improved data acquisition. Novel hardware approaches (e.g. GPUs, sensors).
IIC projects will bring together IIC experts from relevant branches with discipline scientists to address a pressing computing challenge that faces the discipline and has broad application.
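The provenance questions raised for the Databases/Provenance branch ("where did the data come from? How?") can be made concrete with a small sketch. This is an illustration only, not an IIC design: the record layout, the function name `provenance_record`, and the sample source and step names are all hypothetical.

```python
import hashlib
import json
from datetime import datetime, timezone


def provenance_record(data: bytes, source: str, step: str, params: dict) -> dict:
    """Describe one processing step: where the data came from and how it was treated."""
    return {
        "source": source,          # where did the data come from?
        "step": step,              # how was it processed?
        "params": params,          # with which settings?
        "sha256": hashlib.sha256(data).hexdigest(),  # fingerprint of the exact input
        "recorded_at": datetime.now(timezone.utc).isoformat(),
    }


# Hypothetical example values, for illustration only.
record = provenance_record(
    b"example spectrum bytes",
    source="hypothetical_observing_run_001",
    step="baseline_subtraction",
    params={"polynomial_order": 2},
)
print(json.dumps(record, indent=2))
```

Storing such a record next to each derived data product lets a later reader check that a result was produced from exactly the same input bytes.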
Distributed Computing & Large Databases: Large Synoptic Survey Telescope
Optimized for time domain: scan mode, deep mode
• 7 square degree field
• 6.5m effective aperture
• 24th mag in 20 sec
• > 5 Tbyte/night
• Real-time analysis
• Simultaneous multiple science goals
IIC contact: Christopher Stubbs (FAS)
Relative optical survey power based on AΩ = 270 LSST design
Astronomy: LSST, SDSS, 2MASS, MACHO, DLS. High Energy Physics: BaBar, Atlas, RHIC.
Quantity | LSST | SDSS | 2MASS | MACHO | DLS | BaBar | Atlas | RHIC
First year of operation | 2011 | 1998 | 2001 | 1992 | 1999 | 1998 | 2007 | 1999
Run-time data rate to storage (MB/sec) | 5000 Peak, 500 Avg | 8.3 | 1 | 1 | 2.7 | 60 (zero-suppressed), 6* | 540* | 120* (’03), 250* (’04)
Daily average data rate (TB/day) | 20 | 0.02 | 0.016 | 0.008 | 0.012 | 0.6 | 60.0 | 3 (’03), 10 (’04)
Annual data store (TB) | 2000 | 3.6 | 6 | 1 | 0.25 | 300 | 7000 | 200 (’03), 500 (’04)
Total data store capacity (TB) | 20,000 (10 yrs) | 200 | 24.5 | 8 | 2 | 10,000 | 100,000 (10 yrs) | 10,000 (10 yrs)
Peak computational load (GFLOPS) | 140,000 | 100 | 11 | 1.00 | 0.600 | 2,000 | 100,000 | 3,000
Average computational load (GFLOPS) | 140,000 | 10 | 2 | 0.700 | 0.030 | 2,000 | 100,000 | 3,000
Data release delay acceptable | 1 day moving, 3 months static | 2 months | 6 months | 1 year | 6 hrs (trans), 1 yr (static) | 1 day (max), <1 hr (typ) | Few days | 100 days
Real-time alert of event | 30 sec | none | none | <1 hour | 1 hr | none | none | none
Type/number of processors | TBD | 1GHz Xeon / 18 | 450MHz Sparc / 28 | 60-70MHz Sparc / 10 | 500MHz Pentium / 5 | Mixed / 5000 | 20GHz / 10,000 | Pentium / 2500
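The run-time rate and the daily average in the comparison above are linked by simple arithmetic, sketched here under stated assumptions: decimal units (1 TB = 10^6 MB), and the reading that the LSST figures are 500 MB/sec average run-time rate and 20 TB/day. The function names are hypothetical, and the "duty cycle" is only an implied fraction of the day, not a number given on the slide.

```python
SECONDS_PER_DAY = 86_400
MB_PER_TB = 1_000_000  # decimal units assumed


def tb_per_day_at_full_rate(rate_mb_s: float) -> float:
    """Data volume in TB if the run-time rate were sustained for 24 hours."""
    return rate_mb_s * SECONDS_PER_DAY / MB_PER_TB


def implied_duty_cycle(rate_mb_s: float, daily_tb: float) -> float:
    """Fraction of the day the instrument would need to run at rate_mb_s
    to accumulate daily_tb."""
    return daily_tb / tb_per_day_at_full_rate(rate_mb_s)


# LSST figures as read from the comparison: 500 MB/sec average, 20 TB/day.
lsst_full_day = tb_per_day_at_full_rate(500)
lsst_duty = implied_duty_cycle(500, 20)
print(f"{lsst_full_day:.1f} TB if sustained for 24 h")
print(f"implied duty cycle: {lsst_duty:.2f} ({lsst_duty * 24:.1f} h per day)")
```

Under these assumptions the two LSST figures are consistent with roughly 11 hours of data taking per day, about the length of a night.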
Challenges at the LHC
For each experiment (4 total):
• 10’s of Petabytes/year of data logged
• 2000+ Collaborators
• 40 Countries
• 160 Institutions (Universities, National Laboratories)
• CPU intensive
• Global distribution of data
• Test with “Data Challenges”
CPU vs. Collaboration Size
[Figure: CPU vs. collaboration size. Labeled points: Earth Simulator; LHC Exp.; Current accelerator Exp.; Grav. Wave; Nuclear Exp.; Astronomy; Atmospheric Chemistry Group]
Data Handling and Computation for Physics Analysis (CERN)
[Diagram labels: detector; event filter (selection & reconstruction); raw data; processed data; event summary data; event reprocessing; event simulation; batch physics analysis; analysis objects (extracted by physics topic); interactive physics analysis]
les.robertson@cern.ch
Workflow a.k.a. The Scientific Method (in the Age of High-Speed Networks, Fast Processors, Mass Storage, and Miniature Devices) IIC contact: Matt Welsh, FAS