Cyber-Infrastructure Activities
CMOP All-Hands Meeting
25 February 2008
Cyber-Folk
OHSU: Bill Howe, Charles Seaton, Paul Turner, Antonio Baptista
Utah: Juliana Freire, Claudio Silva
Portland State: David Maier, Nirupama Bulusu, Wu-Chi Feng
+ grad & undergrad students
Activities
• Data Mart
• VisTrails
• Quarry RoboCMOP
• Network Optimization
• Cruise Dashboard
• Ocean Appliance
• Data Policies
CMOP Data Mart http://www.stccmop.org/datamart
Data Mart Design Principles
• 100% visibility of data assets
• On-demand generation of products
• Can always download data behind a product
• Highly configurable: navigation, data selection, products, product parameters
Have a look, leave comments
The VisTrails Project (Utah)
• Vision: Provenance-enable the world
• Comprehensive provenance infrastructure for computational tasks
• Captures provenance transparently
• Provides intuitive query interfaces for exploring provenance data
• Supports collaboration
• Designed to support exploratory tasks such as visualization and data mining
• Task specification iteratively refined as users generate and test hypotheses
• VisTrails system is open source: www.vistrails.org
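To make "captures provenance transparently" concrete, here is a minimal Python sketch of the general idea: each step of a computation is wrapped so that its inputs and output are logged without the step itself changing. This is not the VisTrails API; `record_provenance`, `PROVENANCE_LOG`, `query_steps`, and the two toy steps are hypothetical names chosen for illustration.

```python
import functools
import time

# Hypothetical in-memory provenance log; not part of VisTrails.
PROVENANCE_LOG = []


def record_provenance(func):
    """Wrap a step so every call is logged with its inputs and output."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        result = func(*args, **kwargs)
        PROVENANCE_LOG.append({
            "step": func.__name__,
            "args": args,
            "kwargs": kwargs,
            "result": result,
            "timestamp": time.time(),
        })
        return result
    return wrapper


@record_provenance
def scale(values, factor=1.0):
    """Toy processing step: multiply every value by a factor."""
    return [v * factor for v in values]


@record_provenance
def total(values):
    """Toy processing step: sum the values."""
    return sum(values)


def query_steps(step_name):
    """Return every logged call of the named step."""
    return [entry for entry in PROVENANCE_LOG if entry["step"] == step_name]


if __name__ == "__main__":
    total(scale([1, 2, 3], factor=2.0))
    for entry in PROVENANCE_LOG:
        print(entry["step"], entry["kwargs"], entry["result"])
```

The steps stay ordinary functions, and the log can later be queried (here with `query_steps`) to see which parameters produced which result, which is the kind of question a provenance query interface answers.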
Keeping Scientific Exploration Trails
[Figure labels: Workflows, Data Products, Trail]
Integrating Tools and Libraries
[Figure labels: SCIRun; Workflow that combines 5 different libraries]
Value added: provenance, query, parameter-space exploration, easier sharing & collaboration
Quarry
Structured browse capability for model products
• Harvest fine-grained meta-data
• Automatically design efficient database schema based on data patterns
• Can explore space of products via alternating property, value selections
http://www.stccmop.org/quarry
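The "alternating property, value selections" style of browsing can be sketched in a few lines of Python. This is only an in-memory illustration, not Quarry's implementation (which the slide describes as database-backed); the product records, field names, and helper functions below are all hypothetical.

```python
from collections import Counter

# Hypothetical product metadata; field names and values are illustrative only.
PRODUCTS = [
    {"variable": "salinity", "region": "estuary", "year": 2007},
    {"variable": "salinity", "region": "plume", "year": 2007},
    {"variable": "temperature", "region": "estuary", "year": 2008},
    {"variable": "temperature", "region": "plume", "year": 2007},
]


def available_properties(products, chosen):
    """Properties that appear in the current results and are not yet fixed."""
    props = set()
    for product in products:
        props.update(key for key in product if key not in chosen)
    return sorted(props)


def available_values(products, prop):
    """Count how many current products carry each value of `prop`."""
    return Counter(p[prop] for p in products if prop in p)


def select(products, prop, value):
    """Narrow the current results to products where `prop` equals `value`."""
    return [p for p in products if p.get(prop) == value]


if __name__ == "__main__":
    chosen = {}
    current = PRODUCTS
    print(available_properties(current, chosen))   # pick a property...
    print(available_values(current, "variable"))   # ...then see its values
    current = select(current, "variable", "salinity")
    chosen["variable"] = "salinity"
    print(available_properties(current, chosen))   # remaining properties
    print(available_values(current, "region"))
```

Each round alternates between choosing a property and choosing one of its values, so the user only ever sees options that still lead to at least one product.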
Our Trajectory: RoboCMOP
Vision: Lift scientific C-I to an active participant in the scientific process, acting autonomously to provide the data, products, and context you need, right when needed.
Stages
• Locate existing products (based on “cues” in conversation)
• Instantiate existing product types on demand
• Propose new product variants (cf. VisTrails “Creating workflows by analogy”)
• Task observatory systems to collect relevant data (serendipitous gap-filling, active direction of assets)
Network Optimization: Nirupama Bulusu
• Sensor stations are deployed based on
  • Physical intuition: sensing coverage, flow dynamics
  • Physical limitations: power and communication wiring
• Little understanding of which sensors are important
• Is the current deployment optimal?
• If not, which sensors should we remove, and which should we keep?
• If we want to deploy more sensors, where should we deploy them?
Sensor Selection Problem
• Find a configuration of the network that reduces the most error in the data assimilation process
• Set of all sensor configurations
• Sensor configuration
  • type: salinity, elevation, temperature
  • x, y, z: sensor location
  • δ: sensor standard deviation
• Error reduction in data assimilation
Results
Exploring a genetic-algorithms approach
• Removing 26% of the sensors reduces accuracy by 1.55%
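As a rough illustration of how a genetic algorithm can search over sensor subsets, here is a minimal Python sketch. It is not the method behind the result above: the real objective is error reduction in data assimilation, which is replaced here by a labeled placeholder (`toy_error_reduction`), and the `Sensor` fields, the per-sensor cost, and all GA parameters are assumptions made for the example.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Sensor:
    kind: str      # e.g. "salinity", "elevation", "temperature"
    x: float       # sensor location
    y: float
    z: float
    sigma: float   # sensor standard deviation


def toy_error_reduction(sensors, mask):
    """Placeholder objective, NOT a data assimilation result:
    sums the inverse variances of the selected sensors."""
    return sum(1.0 / s.sigma ** 2 for s, keep in zip(sensors, mask) if keep)


def fitness(sensors, mask, cost_per_sensor):
    """Toy error reduction minus an assumed fixed cost per kept sensor."""
    return toy_error_reduction(sensors, mask) - cost_per_sensor * sum(mask)


def evolve(sensors, cost_per_sensor=1.0, pop_size=30, generations=60,
           mutation_rate=0.05, seed=0):
    """Search keep/remove masks; needs at least 2 sensors and pop_size >= 3."""
    rng = random.Random(seed)
    n = len(sensors)

    def score(mask):
        return fitness(sensors, mask, cost_per_sensor)

    # Random initial masks, plus the current deployment (keep everything).
    population = [[rng.random() < 0.5 for _ in range(n)]
                  for _ in range(pop_size - 1)]
    population.append([True] * n)
    best = max(population, key=score)

    for _ in range(generations):
        next_gen = [best[:]]  # elitism: the best mask always survives
        while len(next_gen) < pop_size:
            # Tournament selection of two parents.
            a = max(rng.sample(population, 3), key=score)
            b = max(rng.sample(population, 3), key=score)
            cut = rng.randrange(1, n)           # one-point crossover
            child = a[:cut] + b[cut:]
            child = [(not g) if rng.random() < mutation_rate else g
                     for g in child]            # bit-flip mutation
            next_gen.append(child)
        population = next_gen
        best = max(population, key=score)
    return best


if __name__ == "__main__":
    demo = [Sensor("salinity", 0, 0, 1, 0.5), Sensor("temperature", 1, 0, 1, 2.0),
            Sensor("elevation", 2, 1, 0, 0.8), Sensor("salinity", 3, 1, 2, 3.0)]
    print(evolve(demo))
```

Because the all-sensors mask is seeded into the first generation and the best mask is always carried forward, the returned configuration is never worse than the current deployment under the chosen objective. Swapping in a real assimilation-based objective would only require replacing `toy_error_reduction`.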
Cruise Dashboard
Project of Nick Hagerty, summer REU
• Fast visibility of collected data
• With appropriate information context
One of the drivers of pluggable products
Interface
• Cast-specific interface fully functional
• First deployed (successfully) on July 2007 cruise
• Useful simply as convenient grouping of relevant data, graphs, information
• Hope to link with workflow
Ocean Appliance
• We must “IOOS-enable” local data providers
• Someone has to write the code
• Responsibility usually falls to RAs
• The cost of hardware is falling
• The cost of software support is rising
• Provision complete platforms to control cost
IOOS: System of Systems (of Systems …)
http://www.ocean.us/
[Diagram: National Service Nodes, RAs, Univ., and Local Providers; labeled links: DMAC standards, ad hoc protocols]
Value-add services: Discovery, Brokerage, Aggregation, Fusion, Applications
System of Systems (of Systems …)
How can we “DMAC-enable” the Local Data Providers, quickly and inexpensively?
[Diagram: RAs, Univ., and Local Providers]
Ad Hoc Protocols
-- FTP
-- screen scraping
-- ASCII
-- netCDF
The Ocean Appliance
Software
• Linux Fedora Core 6
• web server (Apache)
• database (PostgreSQL)
• ingest/QC system (Python)
• telemetry system (Python)
• web-based visualization (Drupal, Python)
Hardware
• 2.6GHz Dual
• 2GB RAM
• 250 GB SATA
• 4 serial ports
• ~$500
• ~1’x1’x1.5’
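The slide says only that the ingest/QC system is written in Python. As a hedged illustration of what one QC step in such a pipeline might look like, here is a minimal range-check sketch; the `LIMITS` table, the flag names, and the `ingest` function are hypothetical and do not describe the appliance's actual code.

```python
import math

# Hypothetical plausibility limits; real limits depend on instrument and site.
LIMITS = {
    "salinity": (0.0, 40.0),       # practical salinity units
    "temperature": (-2.0, 35.0),   # degrees Celsius
}


def qc_flag(variable, value):
    """Return 'missing', 'unknown', 'fail', or 'pass' for one reading."""
    if value is None or (isinstance(value, float) and math.isnan(value)):
        return "missing"
    if variable not in LIMITS:
        return "unknown"   # no limits configured for this variable
    low, high = LIMITS[variable]
    return "pass" if low <= value <= high else "fail"


def ingest(rows):
    """Attach a QC flag to each (variable, value) reading."""
    return [(variable, value, qc_flag(variable, value))
            for variable, value in rows]


if __name__ == "__main__":
    readings = [("salinity", 31.2), ("salinity", 99.0),
                ("temperature", float("nan")), ("turbidity", 3.0)]
    for row in ingest(readings):
        print(row)
```

Flagging readings rather than dropping them keeps the raw data available downstream, so a later product can decide how strictly to filter.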
Deployed on Multi-ship Coordinated Cruise
SWAP Network; collaboration of:
- OSU
- OHSU
- UNOLS
[Figure labels: Forerunner, Barnes, Wecoma]
Data Standards
• What counts as data?
• What are the standard procedures for collecting data during cruises?
• How are new data sources added?
• What external data archives will we use?
• What are our QA/QC procedures for each data source?
• How is instrument calibration information handled?
• How will data processing levels and data release versioning be handled?
Charles Seaton: cseaton@stccmop.org