MACCCR 5th Fuels Research Review, September 17, 2012. PrIMe Next Frontier: Large, Multi-dimensional Data Sets. Michael Frenklach. Supported by AFOSR. OUTLINE. PrIMe Cloud Infrastructure: Data Flow Network Remote Server: PrIMe-RMG Interfaces Big Data Other new developments:
MACCCR 5th Fuels Research Review, September 17, 2012
PrIMe Next Frontier: Large, Multi-dimensional Data Sets
Michael Frenklach
Supported by AFOSR
OUTLINE
• PrIMe Cloud Infrastructure:
  • Data Flow Network
  • Remote Server: PrIMe-RMG
  • Interfaces
  • Big Data
• Other new developments:
  • Species identification app
  • UQ: Statistical sampling of the feasible set
  • . . .
• PrIMe with Humanities
PrIMe (Process Informatics Model): http://primekinetics.org
Infrastructure for UQ-predictive modeling
• Data sharing
• App sharing
• Automation
Present-day Science Sharing: via web-page access
[Diagram labels: Internet; domain 1: web page, database, apps; domain 2: web page, database, apps]
PrIMe Science Sharing: via web-service data/app access
[Diagram labels: Internet; science domain 1: database, apps; science domain 2: database, apps]
PrIMe Science Sharing: via web-service data/app access
[Diagram labels: client; web service; data flow network; Internet; science domain 1: database, apps; science domain 2: database, apps; client workflow app]
PrIMe Data Model
• Initial Model:
  • “Upload your data to PrIMe Warehouse” (“give me your data”)
• New, Distributed Model:
  • “You may, if you choose, connect your data to the communal system”
  • with a switch in the OFF position: “you can use the communal data and tools but your own data is private to you only”
  • “but please flip the switch to the ON position when you are ready to share your own data”
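The ON/OFF switch of the distributed model can be pictured with a minimal sketch. This is an illustration only, not PrIMe code: the `DataSource` class, the `visible_sources` function, and the owner and data-set names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class DataSource:
    """A data set connected to the communal system (hypothetical structure)."""
    owner: str
    name: str
    shared: bool = False  # the "switch": OFF (private) unless the owner flips it


def visible_sources(sources, user):
    """Return what a user can read: every shared source plus the user's own."""
    return [s for s in sources if s.shared or s.owner == user]


sources = [
    DataSource("alice", "flame_speeds", shared=True),  # switch ON
    DataSource("bob", "ignition_delays"),              # connected, switch OFF
]

# Bob uses the communal data while his own data stays private to him.
print([s.name for s in visible_sources(sources, "bob")])
# Alice sees only what has been shared, plus her own data.
print([s.name for s in visible_sources(sources, "alice")])
```

Flipping `shared` to `True` on Bob's source is the only change needed for Alice to see it; no data is uploaded or copied in this picture.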
Same for apps
• “Connect your code to the communal system”
• you control your own code:
  • release version
  • user access, licenses
  • collect fees, if desired
Technology: How
• Remote server app: PrIMe Web Services (PWS)
  • no restrictions on platform
  • no restrictions on data formats
  • no restrictions on local programming language(s)
• PrIMe Workflow Interface (PWI) is the only “standard”
  • developed, maintained, and controlled by the community
[Diagram labels: PrIMe Dispatcher; PrIMe Data Flow Network; client machine; PrIMe Interface; PrIMe web services; client data]
Big Data
• excessively large data sets
• do not move the data
• but use “smart agents” (e.g., HTML5 walkers)
web services with user-reloaded tasks: fetch data features for user-requested analysis
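The "do not move the data" idea can be sketched as follows: the task runs where the data lives and only a few summary features travel back to the user. This is a toy stand-in, not the PrIMe agent mechanism. `REMOTE_DATA`, `TASKS`, and `run_agent` are hypothetical names, and treating the tasks as registered ahead of time is an assumption about the slide's wording.

```python
import statistics

# Hypothetical stand-in for a large data set that stays on the remote host.
REMOTE_DATA = {"signal": [300.0 + 0.5 * i for i in range(1000)]}

# Tasks registered ahead of time (assumed); each reduces a column to one number.
TASKS = {
    "min": min,
    "max": max,
    "mean": statistics.fmean,
}


def run_agent(column, requested):
    """Runs on the data host: compute the requested features, return only those."""
    values = REMOTE_DATA[column]
    return {name: TASKS[name](values) for name in requested}


# The client receives three numbers instead of the full column.
features = run_agent("signal", ["min", "max", "mean"])
print(features)
```

The payload returned to the client is a small dictionary whose size does not grow with the data set.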
PrIMe Remote-Server Web Services
• Created ~2 years ago
  • installed by professional programmers
  • implemented on Reaction Design site
• Modified June 2012
  • can be installed by users
  • implemented with RMG at MIT site
  • installed by first-year grad students!
PrIMe – RMG
• User creates a PrIMe Workflow (PWA) project
• User submits a request: “create a reaction model for …”
• The request activates RMG code at MIT server
• User receives email when the model is generated
• User retrieves the model or it “moves” along the PWA project to the next component
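The submit, notify, retrieve sequence above is an asynchronous job pattern. The sketch below simulates it in memory. It does not use the PrIMe or RMG interfaces: `FakeRemoteServer`, its methods, the e-mail address, and the placeholder request text are all hypothetical.

```python
import itertools


class FakeRemoteServer:
    """In-memory stand-in for a remote model-generation service (hypothetical)."""

    def __init__(self):
        self._ids = itertools.count(1)
        self._jobs = {}

    def submit(self, request, email):
        """Queue a request and return a job id immediately."""
        job_id = next(self._ids)
        self._jobs[job_id] = {"request": request, "email": email,
                              "status": "queued", "model": None}
        return job_id

    def run_pending(self, notify):
        """Process queued jobs; call notify(email, job_id) as each one finishes."""
        for job_id, job in self._jobs.items():
            if job["status"] == "queued":
                job["model"] = f"reaction model for {job['request']}"
                job["status"] = "done"
                notify(job["email"], job_id)

    def retrieve(self, job_id):
        """Return the finished model, or fail if it is not ready yet."""
        job = self._jobs[job_id]
        if job["status"] != "done":
            raise RuntimeError("model not ready yet")
        return job["model"]


server = FakeRemoteServer()
sent = []  # stands in for outgoing e-mail notifications

job_id = server.submit("<target system>", "user@example.org")
server.run_pending(lambda email, jid: sent.append((email, jid)))
print(sent)
print(server.retrieve(job_id))
```

In a workflow setting, the retrieve step could equally be performed by the next component of the project rather than by the user.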
PrIMe Interfaces
binary XML – HDF5, e.g., reaction model: GRI-Mech 3.0
[Diagram labels: client machine; PrIMe Interface; PrIMe web services; client data]
New Developments
• input data for UQ bypassing Warehouse
• species identification via crowd-sourcing
• UQ: sampling within the feasible region
• comparison between interval-to-interval UQ and rigorous Bayesian
• parallelization of Chemkin II
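One simple way to picture "sampling within the feasible region" is rejection sampling: draw points from the prior box and keep those whose model output agrees with the experimental bounds. The sketch below is not the sampling method used in PrIMe. The toy `model`, the `PRIOR` box, and the `EXPERIMENT` bounds are invented for illustration.

```python
import random

PRIOR = [(0.0, 1.0), (0.0, 1.0)]  # assumed prior bounds on x1 and x2
EXPERIMENT = (1.0, 1.5)           # assumed bounds on the measured quantity


def model(x1, x2):
    """Toy surrogate for the measured quantity (not a real kinetics model)."""
    return x1 + 2.0 * x2


def sample_feasible(n, seed=0):
    """Draw n points uniformly from the prior box, keeping only feasible ones."""
    rng = random.Random(seed)
    samples = []
    while len(samples) < n:
        x = [rng.uniform(lo, hi) for lo, hi in PRIOR]
        if EXPERIMENT[0] <= model(*x) <= EXPERIMENT[1]:
            samples.append(x)
    return samples


points = sample_feasible(5)
for x1, x2 in points:
    print(f"x1={x1:.3f}  x2={x2:.3f}  model={model(x1, x2):.3f}")
```

Rejection sampling becomes inefficient when the feasible region is a small fraction of the prior box, which is typical in higher dimensions.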
DataCollaboration: bounds-to-bounds predictions constrained to the feasible set
[Figure: experiment/theory constrain the feasible set. Labels: M(x1, x2); F; experimental uncertainty; feasible set; prior knowledge]
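A bounds-to-bounds prediction can be illustrated with a brute-force grid search: keep the points of the prior box that agree with the experimental bounds, then report the range of a second quantity over that feasible set. This is not the DataCollaboration algorithm, which is not described on these slides. The two toy models, the prior box, and the bounds below are assumptions.

```python
PRIOR = {"x1": (0.0, 1.0), "x2": (0.0, 1.0)}  # assumed prior knowledge
EXP_BOUNDS = (0.8, 1.2)                       # assumed experimental uncertainty


def m_experiment(x1, x2):
    """Toy model M(x1, x2) of the measured quantity."""
    return x1 + x2


def m_prediction(x1, x2):
    """Toy model of the quantity to be predicted."""
    return x1 + 2.0 * x2


def grid(lo, hi, n):
    """n evenly spaced values from lo to hi, inclusive."""
    return [lo + (hi - lo) * i / (n - 1) for i in range(n)]


def prediction_bounds(exp_bounds=None, n=101):
    """Range of the prediction over the prior box, optionally constrained by data."""
    preds = []
    for x1 in grid(*PRIOR["x1"], n):
        for x2 in grid(*PRIOR["x2"], n):
            if exp_bounds is None or exp_bounds[0] <= m_experiment(x1, x2) <= exp_bounds[1]:
                preds.append(m_prediction(x1, x2))
    if not preds:
        raise ValueError("feasible set is empty: data and prior are inconsistent")
    return min(preds), max(preds)


print("prior only:  ", prediction_bounds())
print("feasible set:", prediction_bounds(EXP_BOUNDS))
```

The constrained interval is narrower than the prior-only interval, which is the point of restricting predictions to the feasible set. A grid search scales poorly with the number of parameters, so it serves here only to make the definition concrete.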
Comparison between Bounds-to-Bounds UQ (DataCollaboration) and rigorous Bayesian
An ongoing collaborative study with
Jerome Sacks, National Institute of Statistical Sciences
Rui Paulo, ISEG Technical University of Lisbon
Gonzalo Garcia-Donato, Universidad de Castilla-La Mancha
• Bayesian simulations:
  • no simplifying assumptions,
  • but utilize the Solution Mapping strategy for numerical efficiency
Parallelization: Chemkin II
[Figure: execution time of flame simulations with a large acetylene model]
Parallelization: Chemkin II
[Figure: execution time of flame simulations with a hydrogen model]
Knowledge UNIX
• A collaborative project of PrIMe with Humanities:
  • Berkeley Electronic Cultural Atlas Initiative
“Study of Buddhist Texts”
PrIMe is used to predict the past
The abstracted dots represent 166000 “panes”
Knowledge UNIX
• A collaborative project of PrIMe with Humanities:
  • Berkeley Electronic Cultural Atlas Initiative
  • Berkeley Institute of Information: “Editors Notes”
Current and Next
• Remote-server app and new apps
  • RMG: interface (with MIT, Bill Green)
  • Communal/User tools: Cantera (with NCSU, Phil Westmoreland)
  • Big Data: feature collection for UQ (with Utah, Phil Smith)
• Enabling new science infrastructure
  • ALS-data analysis (with NCSU; Phil Westmoreland)
  • Species IDs (with KAUST; Mani Sarathy)
  • H2-O2: automation/addition of flame targets (with Tsinghua, Xiaoqing You)
  • Submission of Chemkin mechanisms (with KAUST and Tsinghua)