
Towards A Rich-Context Participatory Cyberenvironment

Yong Liu, Robert E. McGrath, James D. Myers, Joe Futrelle. {yongliu, mcgrath, jimmyers, futrelle}@ncsa.uiuc.edu. GCE 2007 Workshop, Nov. 11-12, 2007, Supercomputing Conference 2007.


Presentation Transcript


  1. Towards A Rich-Context Participatory Cyberenvironment Yong Liu, Robert E. McGrath, James D. Myers, Joe Futrelle {yongliu, mcgrath, jimmyers, futrelle}@ncsa.uiuc.edu GCE 2007 Workshop, Nov. 11-12, 2007, Supercomputing Conference 2007 National Center for Supercomputing Applications

  2. Outline • Motivation • Web 2.0 and Where 2.0 • Definition of Participatory Cyberenvironment • Cyberenvironment Technology Stack • CyberCollaboratory Portal • Approach and Goals of a Rich-context Participatory Cyberenvironment • The Role of Contexts • Social, Geospatial, Provenance, Conceptual Contexts • Science Drivers and Our Work So Far and Next Steps • Two Examples on How These Contexts Can Play Together • Concluding Remarks • Acknowledgements

  3. Motivation • Increasingly Collaborative Scientific Efforts • Across disciplines, laboratories, observatories and organizations • Heterogeneous Scientific Resources • Sensors, software components, data/databases, networks, computers • Avoiding Data Silos • Most existing portals create data silos • Users want access to a context-relevant knowledge network • Users want to exchange information across application boundaries (desktop vs. web-based, portal A vs. portal B) • Promoting User Participation • Allow individual users to innovate and contribute to the community cyberenvironment

  4. Web 2.0 and Where 2.0 • Architecture of Participation • Software and Data (Mashup) • People (Social Networking, Collaboration) • Open, Lightweight (de facto) Standards and Formats • RDF (Resource Description Framework) • Microformats • Variants of XML (such as KML, obsKML, etc.) • “Where 2.0” highlights the importance of spatial context • It is estimated that over 80% of all information has a geospatial component [Figure: Web 2.0 mind map constructed by Markus Angermeier on November 11, 2005]

  5. Participatory Cyberenvironment • A Web 2.0 and Semantic Web approach for Cyberinfrastructure • An architecture of participation for scientific activity • This refers to both human and software/data participation • Human-to-human collaboration and social networking (using blogs, message boards, etc.) and user-generated scientific artifacts (e.g., workflows) • Software participation means mashup • API-based and Content/Data-based • An open service platform • Reusable and standards-compliant service components/interfaces must be built and presented for third-party application use/reuse • E.g., NCSA CyberIntegrator (a desktop Java-based workflow application) can use the CyberCollaboratory open service API (SOAP or JSON) to query user/group affiliation and publish workflow templates to the CyberCollaboratory’s document library • An integration and presentation platform for the knowledge network • Knowledge network about sensors, data, models, workflows, people, publications, computing resources, etc. • Dynamically generated and proactively presented in the portal • Exchanging information across application boundaries

  6. Cyberenvironment Technology Stack [Diagram. Recoverable labels: CyberIntegrator; Workflow Development and Publication; CyberCollaboratory; Portal/Group Workspace; High-End Visualization; High-res Applications; Visual Orchestration; Auto-stereo Visualization; Science Applications; Event-triggered Workflow Execution; Workspace mgmt.; Visualization, Graphing, Reporting; External sensor networks and data stores; Services Clouds; Modeling; Analysis/Translation; GIS; Single Sign-On Security Infrastructure; Context (social, geospatial, provenance, ontologies, …), metadata fabric; Tupelo semantic content management middleware; Workflow/Model Registries and Data Storage; Data/Documents/Content; External Data Services; Computational Resources] Note: Boxes with yellow background are this talk’s focus

  7. CyberCollaboratory Portal • Since its inception in 2004, over 400 users have registered • Built on top of the open source portal framework Liferay with additions/changes/integrations using NCSA technologies • Group Spaces • Document/Image Library, discussion forums, announcements, wiki, blog, RSS reader, etc. • Production/Pilot Deployment in multiple projects • NSF-funded WATERS (WATer and Environmental Research Systems) Network Project Office (in production mode since 2004) • Multiple NSF-funded WATERS Testbed projects • NCSA Infectious Disease Informatics project • NSF-funded Hydro Synthesis Project • EPA-funded Small Water Public Systems Project • NCASSR-funded Palantir collaborative computer security investigation portal • Office of Naval Research (ONR)-funded Education Project http://www.linux.com/feature/118675 August 23, 2007

  8. Evolving Towards a Rich-context Participatory Cyberenvironment • Hybrid Approach • Leverage Web 2.0 patterns/technologies • Architecture of participation • Leverage Semantic Web technology (RDF) • Through the use of NCSA Tupelo as the semantic content repository middleware • Goals • Break data silos created by different portals or non-web-based applications • Enable user participation and content-based mashup

  9. The Role of Contexts • Context: • “the parts of a discourse that surround a word or passage and can throw light on its meaning” • From Merriam-Webster Online Dictionary • Semantic Contexts for Cyberenvironment • Social Context (Who?) • Geospatial Context (Where?) • Causal Context (Why? and How?) • Conceptual Context (What?) • Role: the above four areas build the foundation so that heterogeneous tools/portals can have a shared view and the ability to interact

  10. Social Context (Who?) • What’s It About? • People, Group, Community, Virtual Organization • Who am I, who are my friends and/or collaborators, team members • Social Networking (People-to-People) • How Does It Work? • RDF-based: FOAF (Friend of a Friend) • Microformats-based: XFN (XHTML Friends Network), hCard • What Are the Scientific Use Case Drivers? • Environmental Observatories involve many researchers/stakeholders from diverse disciplines nationally and internationally • Collaboration on complementary expertise • Find out who works on what and has what kind of expertise • Filtering information • Research on social networks has shown that people are more likely to respond to collaboration requests from someone they know (directly or indirectly through the person-to-person network) • Complex coupled human-nature system science research calls for “Participatory Science”
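
The FOAF idea on this slide can be illustrated with a minimal sketch in plain Python. It models RDF statements as (subject, predicate, object) tuples and follows the FOAF `knows` property to find direct and indirect contacts. The people and the `example.org` URIs are hypothetical, and treating `knows` as symmetric is an assumption of this sketch; the slides do not show the portal's actual data or code.

```python
from collections import deque

FOAF = "http://xmlns.com/foaf/0.1/"

# Hypothetical people; the URIs are placeholders for illustration only.
triples = {
    ("http://example.org/alice", FOAF + "knows", "http://example.org/bob"),
    ("http://example.org/bob", FOAF + "knows", "http://example.org/carol"),
    ("http://example.org/carol", FOAF + "knows", "http://example.org/dave"),
}


def contacts_within(triples, person, max_hops):
    """Return {contact: hops} for everyone reachable via foaf:knows."""
    neighbors = {}
    for s, p, o in triples:
        if p == FOAF + "knows":
            # Assumption: acquaintance is treated as symmetric here.
            neighbors.setdefault(s, set()).add(o)
            neighbors.setdefault(o, set()).add(s)
    hops = {person: 0}
    queue = deque([person])
    while queue:  # breadth-first search, bounded by max_hops
        current = queue.popleft()
        if hops[current] == max_hops:
            continue
        for nxt in neighbors.get(current, ()):
            if nxt not in hops:
                hops[nxt] = hops[current] + 1
                queue.append(nxt)
    del hops[person]
    return hops
```

Calling `contacts_within(triples, "http://example.org/alice", 2)` returns Bob at one hop and Carol at two, which is the "directly or indirectly" relationship the slide refers to.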

  11. Social Context (contd.) • What Have We Done So Far? • The key is to promote user participation to help build the virtual community in the CyberCollaboratory • Production Implementations • My Page, My Menu, My Groups navigation • Streamlined group creation • Group template • Email invitation to both registered and non-registered users to join a group • Harvesting emails and associated attachments from mailing lists into message boards and the document library to allow full-text search • Pilot Implementations • Social Network Analysis/Visualization • Recommender System • People who read/use this paper/tool also read/use these other papers/tools
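
The "people who use this also use" recommender mentioned above can be sketched as a simple co-occurrence count. This is only an illustration of the general technique, not the pilot system's algorithm, which the slides do not describe; the user and item names are made up.

```python
from collections import Counter


def also_used(usage, item, top_n=3):
    """Rank other items by how many users of `item` also used them."""
    counts = Counter()
    for items in usage.values():
        if item in items:
            counts.update(i for i in items if i != item)
    # Sort by count (descending), then by name so ties are deterministic.
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))[:top_n]


# Hypothetical usage log: user -> set of papers/tools they read or used.
usage = {
    "user1": {"paperA", "toolX", "paperB"},
    "user2": {"paperA", "toolX"},
    "user3": {"paperB", "toolY"},
}
```

With this data, `also_used(usage, "paperA")` ranks `toolX` ahead of `paperB` because two of the users who read `paperA` also used `toolX`.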

  12. Social Context (contd.) • What Are Our Next Steps? • Expose group/personal page information as microformats (hCard) • Yahoo! Local etc. can find such group information • Learn lessons from and exchange ideas with similar efforts in other scientific collaborative portals • MyExperiment.org • OurSpaces.net • Build dynamic social network graph • Help build up the momentum of “social grid” (from Tony Hey, Microsoft Research)
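
Exposing a group page as an hCard could look roughly like the sketch below. The class names (`vcard`, `fn`, `org`, `url`, `email`) come from the hCard microformat, where putting `fn` and `org` on the same element marks an organization rather than a person. The function name and the sample values are hypothetical; this is not the portal's actual template.

```python
from html import escape


def group_hcard(name, url, email):
    """Render a group as an hCard HTML fragment."""
    return (
        '<div class="vcard">'
        f'<a class="fn org url" href="{escape(url, quote=True)}">'
        f"{escape(name)}</a> "
        f'<a class="email" href="mailto:{escape(email, quote=True)}">'
        f"{escape(email)}</a>"
        "</div>"
    )


# Hypothetical group details, for illustration only.
fragment = group_hcard(
    "Sensors & Models Group",
    "http://example.org/groups/sensors",
    "sensors@example.org",
)
```

Because the markup is ordinary HTML with agreed class names, any microformat-aware crawler or browser extension can extract the group's name and contact details without a dedicated API.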

  13. Geospatial Context (Where?) • What’s It About? • Location, Location, Location • Point, line, polygon, … • Intersection, overlap, coverage, … • The advent of the GeoSpatial Web or GeoWeb • How Does It Work? • Lightweight formats and APIs/Services facilitate geo-referenced information representation, exchange and mashup • GeoRSS, GeoURL, KML, Geo Microformat, GeoJSON, W3C Geo • GeoIQ, Google Map API, Microsoft Virtual Earth Visual SDK • Easy-to-use Virtual Globe software puts the earth metaphor right in front of users • 3D/2D geo-centric browsers allow non-GIS specialists to explore geo-referenced information • Microsoft Virtual Earth, Google Earth, NASA WorldWind • Standardization efforts promote geospatial services/data interoperability • OGC (Open Geospatial Consortium) geospatial standards
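
As an illustration of the lightweight formats listed above, the following sketch builds a GeoJSON `FeatureCollection` for one sensor using only the standard library. The sensor id, its properties, and the approximate coordinates are hypothetical. Note that GeoJSON positions are ordered longitude first, then latitude.

```python
import json


def sensor_feature(sensor_id, lon, lat, properties=None):
    """Build a GeoJSON Feature; positions are [longitude, latitude]."""
    if not (-180 <= lon <= 180 and -90 <= lat <= 90):
        raise ValueError("coordinates out of range")
    return {
        "type": "Feature",
        "id": sensor_id,
        "geometry": {"type": "Point", "coordinates": [lon, lat]},
        "properties": properties or {},
    }


# Hypothetical sensor with made-up, approximate coordinates.
collection = {
    "type": "FeatureCollection",
    "features": [sensor_feature("sensor-1", -97.3, 27.8, {"kind": "example"})],
}
geojson_text = json.dumps(collection)
```

The resulting text can be consumed by any client that understands GeoJSON, which is what makes such formats convenient for mashups.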

  14. GeoSpatial Context (contd.) • What Are the Scientific Use Case Drivers? • Environmental Observatory data needs to be interpreted within a geospatial context to enable holistic study of the system • Common location components are an important integration vehicle for linking diverse information across different domains • E.g., Digital Watershed data integration requires explicit geospatial context • Spatial analysis in computational modeling for complex watershed science studies also requires geo-referenced data

  15. GeoSpatial Context (contd.) • What Have We Done So Far? • Pilot Implementation • Google Map-based sensor network map portlet • Allows users to subscribe to both raw and derived data streams from the sensors • What Are Our Next Steps? • Incorporate geo-location information into user profiles • Build geo-social network • Group formation based on geographical boundary • Virtual observatory and digital watershed geo-referenced data integration using OGC standards
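
Group formation based on a geographical boundary could be approached with a point-in-polygon test, sketched below with the standard ray-casting method. This is an assumption about one possible approach, not the planned implementation; it treats coordinates as planar, which is only reasonable for small regions, and the users and boundary are made up.

```python
def point_in_polygon(lon, lat, polygon):
    """Ray-casting test; polygon is a list of (lon, lat) vertices."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does this edge straddle the horizontal line through the point?
        if (y1 > lat) != (y2 > lat):
            # Longitude at which the edge crosses that line.
            x_cross = x1 + (lat - y1) * (x2 - x1) / (y2 - y1)
            if lon < x_cross:
                inside = not inside
    return inside


def users_in_boundary(users, polygon):
    """Return the sorted names of users located inside the boundary."""
    return sorted(
        name for name, (lon, lat) in users.items()
        if point_in_polygon(lon, lat, polygon)
    )


# Hypothetical boundary (a square) and user locations.
boundary = [(0, 0), (10, 0), (10, 10), (0, 10)]
users = {"ann": (5, 5), "ben": (15, 5), "cat": (1, 9)}
```

Here `users_in_boundary(users, boundary)` selects the users whose profile location falls inside the square, who could then be invited to the same group.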

  16. Causal Context (How? and Why?) • What’s It About? • Also known as Provenance • Describes the causal relationships and history • among artifacts (e.g., data, people, instruments/sensors, publications, etc.) and • events (e.g., processing steps, accession, custody) in a complex work process • Useful for experiment validation and reuse of workflow, data products, etc. • How Does It Work? • RDF Triples • Open Provenance Model (OPM)
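
A small sketch may help show how provenance triples support validation and reuse. It uses two edge types from the Open Provenance Model, `used` (a process used an artifact) and `wasGeneratedBy` (an artifact was generated by a process), and walks them backwards to recover everything a data product depends on. The artifact and process names are hypothetical, and plain strings stand in for URIs.

```python
def lineage(triples, artifact):
    """Return every artifact that `artifact` causally depends on."""
    generated_by = {s: o for s, p, o in triples if p == "wasGeneratedBy"}
    used = {}
    for s, p, o in triples:
        if p == "used":
            used.setdefault(s, set()).add(o)
    found, stack = set(), [artifact]
    while stack:
        current = stack.pop()
        process = generated_by.get(current)  # None for raw inputs
        for source in used.get(process, ()):
            if source not in found:
                found.add(source)
                stack.append(source)
    return found


# Hypothetical two-step pipeline: QA/QC, then anomaly detection.
triples = [
    ("cleaned_data", "wasGeneratedBy", "qa_step"),
    ("qa_step", "used", "raw_sensor_data"),
    ("anomaly_report", "wasGeneratedBy", "detect_step"),
    ("detect_step", "used", "cleaned_data"),
]
```

Asking for the lineage of `anomaly_report` returns both `cleaned_data` and `raw_sensor_data`, which is the history a researcher would need in order to validate or rerun the result.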

  17. Causal Context (contd.) • What Are the Scientific Use Case Drivers? • Researchers are using more data from environmental observatories and from other sources whose history they would not otherwise know • More pieces of the data processing pipeline/workflow will be changing and will need to be tracked • Interdisciplinary/systems-oriented projects such as the watershed-scale human-nature interaction study will have more moving parts • Dynamic generation of the knowledge network requires provenance data for events, workflows, etc. across application boundaries

  18. Causal Context (contd.) • What Have We Done So Far? • Production Implementations • User activities/events in the CyberCollaboratory have been harvested into an RDF triple store through the Tupelo middleware • documents, images, blog access/upload/download • group mgmt (creation, user add/remove/invite) • Pilot Implementations • Provenance tracking in CyberIntegrator (workflow) • Knowledge network creation based on provenance • What Are Our Next Steps? • Ubiquitous provenance tracking across portal boundaries and non-web-based tools • Data QA/QC and workflow provenance are the major efforts at this moment • Work with the environmental observatory community on various use cases • Geo-referenced provenance map for visualization of the sensor data processing pipeline
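
The harvesting of portal events into a triple store can be pictured with the sketch below. It is not Tupelo's API: the in-memory store, the `example.org` namespace, and the predicate names are all assumptions made for illustration.

```python
from datetime import datetime, timezone

EX = "http://example.org/cyberenv#"  # hypothetical namespace, not NCSA's


def event_to_triples(event_id, actor, action, target, when):
    """Describe one portal event as (subject, predicate, object) triples."""
    subject = f"{EX}event/{event_id}"
    return [
        (subject, EX + "actor", f"{EX}user/{actor}"),
        (subject, EX + "action", action),
        (subject, EX + "target", f"{EX}resource/{target}"),
        (subject, EX + "timestamp", when.isoformat()),
    ]


class InMemoryTripleStore:
    """Stand-in for a real RDF store; keeps triples in a Python set."""

    def __init__(self):
        self.triples = set()

    def on_event(self, **event):
        # Called by a (hypothetical) portal event listener.
        self.triples.update(event_to_triples(**event))

    def match(self, s=None, p=None, o=None):
        # None acts as a wildcard for that position.
        return sorted(
            t for t in self.triples
            if s in (None, t[0]) and p in (None, t[1]) and o in (None, t[2])
        )


store = InMemoryTripleStore()
store.on_event(
    event_id="e1", actor="user1", action="download", target="report.pdf",
    when=datetime(2007, 11, 11, 9, 0, tzinfo=timezone.utc),
)
```

Once events are stored this way, a query such as `store.match(p=EX + "action", o="download")` finds every download event regardless of which tool recorded it.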

  19. Conceptual Context (What?) • What’s It About? • Mainly for domain-specific semantic concept relationships, i.e., ontologies • How Does It Work? • Community consensus • Controlled vocabulary • Folksonomy • User-generated metadata, tagging • Hybrid approach • Allow users to add new controlled-vocabulary terms to an existing ontology • What Are the Scientific Use Case Drivers? • Ontology-driven data search/integration has been recognized in many scientific domains (including the environmental observatory community) • E.g., ODM (Observations Data Model) • CUAHSI: Consortium of Universities for the Advancement of Hydrologic Science, Inc. • Semantic mediator to reconcile different ways of describing data • This is usually a community effort
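
The hybrid approach, in which free-form user tags coexist with a controlled vocabulary and popular tags can be promoted into it, might be sketched as follows. The class, the promotion threshold, and the sample terms are hypothetical; the slides do not specify how promotion is decided.

```python
def normalize_tag(tag):
    """Lower-case a tag and unify separators and whitespace."""
    return " ".join(tag.lower().replace("_", " ").replace("-", " ").split())


class HybridVocabulary:
    """Controlled terms plus free-form tags that can be promoted."""

    def __init__(self, controlled_terms):
        self.terms = {normalize_tag(t): t for t in controlled_terms}
        self.candidates = {}  # normalized free tag -> usage count

    def tag(self, raw):
        key = normalize_tag(raw)
        if key in self.terms:
            return self.terms[key]  # map onto the controlled term
        self.candidates[key] = self.candidates.get(key, 0) + 1
        return raw  # keep the user's tag as folksonomy metadata

    def promote(self, min_uses):
        """Move sufficiently popular free tags into the vocabulary."""
        promoted = sorted(k for k, n in self.candidates.items() if n >= min_uses)
        for key in promoted:
            self.terms[key] = key
            del self.candidates[key]
        return promoted


vocab = HybridVocabulary(["Dissolved Oxygen"])
```

For example, `vocab.tag("dissolved_oxygen")` resolves to the controlled term, while a new tag used often enough can be added through `promote`, so the vocabulary grows with community use.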

  20. Conceptual Context (contd.) • What Have We Done So Far? • Production Implementation • CyberCollaboratory allows user tagging in many tools, such as the blog, document library, wiki, etc. • Pilot Implementation • CyberIntegrator has started to build a workflow ontology/tagging and allows such information to be exposed to Portal users for filtering and searching workflow templates • What Are Our Next Steps? • Leveraging environmental observatory ontology efforts (such as CUAHSI ODM) for data integration and dissemination • Establishing a controlled vocabulary for cyberenvironment development needs so that different tools can use a consistent representation

  21. How Would These Contexts Actually Play Together? • Independently produced context metadata in different portlets, portals, or desktop tools can be merged as RDF triples using Tupelo • Allow non-invasive sharing of data/information across application boundaries without requiring the same database schema • Portal A vs. Portal B • Desktop Application vs. Web-based Application • Allow generation of the knowledge network • Web-scale data integration and presentation • Two examples • A production implementation with event/provenance capture and content-based mashup • Mainly uses provenance context across portal boundaries and the desktop-web boundary • An end-to-end pilot implementation that uses all the contexts discussed so far
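
The merging step can be illustrated with a minimal sketch: because every statement is a self-describing triple, graphs produced by different applications can be combined by set union and then queried together, with no shared database schema. This is plain Python rather than Tupelo, and the identifiers and predicate names are hypothetical; the only requirement is that both applications use the same identifier for the same thing.

```python
def merge(*graphs):
    """Union independently produced triple sets into one graph."""
    merged = set()
    for graph in graphs:
        merged |= set(graph)
    return merged


def subjects(graph, predicate, obj):
    """Return all subjects s such that (s, predicate, obj) is in the graph."""
    return {s for s, p, o in graph if p == predicate and o == obj}


def workflows_for_group(graph, group):
    """Join social and provenance context: workflows by group members."""
    members = subjects(graph, "memberOf", group)
    return {w for m in members for w in subjects(graph, "createdBy", m)}


# Social context recorded by a portal (hypothetical data).
portal_graph = {("user:alice", "memberOf", "group:ccbay")}
# Provenance context recorded by a desktop workflow tool (hypothetical data).
desktop_graph = {("workflow:anomaly", "createdBy", "user:alice")}

combined = merge(portal_graph, desktop_graph)
```

Neither graph alone can answer "which workflows were created by members of this group", but the merged graph can, which is the kind of cross-application knowledge network the slide describes.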

  22. Example 1: Event/Provenance Capture & Content-based Mashup [Diagram. Recoverable labels: User1; User2; CyberCollaboratory Portal Instance 1; CyberCollaboratory Portal Instance 2; Relational Database: MySQL (shown twice); Portal Event Listener (Add/Update/Delete/Read) (shown twice); CyberIntegrator; Provenance; Event/Provenance RDF Triples; Harvesting; Remixing & Presenting; Tupelo Semantic Content Repository Middleware Fabric; RDF Store; Sesame]

  23. Example 2: A Pilot End-to-End Implementation Using Participatory Cyberenvironment • Environmental Observatory Use Case • Sensor data anomaly detection in Corpus Christi Bay, Texas • A group was created for this testbed project (social context) • A Google Map-based sensor map portlet allows users to subscribe to sensor data streams (both raw and derived) (geospatial context, API-based mashup) • Users can monitor the sensor data and invoke another workflow in a different observatory from a proactively generated knowledge network which presents relevant sensors, workflows, publications, and people (provenance, ontologies context, content-based mashup, knowledge network) • Individual researchers use and contribute back to community infrastructures • Participatory Science needs/uses Participatory Cyberenvironment! [Diagram labels: Individual User’s Desktop Dashboard Alert; New Derived Data Stream; Workflow remote execution with modification]

  24. Concluding Remarks • Paradigm shifts in science are driving a need for increased sharing of content across applications/systems • Our research on four contexts (social, geospatial, causal, and conceptual) helps us take the Web 2.0/Semantic Web approach for the CyberCollaboratory portal and other tools to enable such sharing • Semantic middleware Tupelo can manage these contexts • Make a standard portal such as CyberCollaboratory more context-sensitive • Make content-based mashup across application boundaries possible • Initial experiences with using these contexts have been positive

  25. Concluding Remarks (contd.) • Participatory cyberenvironments enable individual researchers to directly customize community infrastructures and then share their enhancements • Participatory Science! • Further research and development is under way at NCSA towards the full realization of the vision of a participatory cyberenvironment

  26. Acknowledgements • Teams: • NCSA ECID (Environmental CI Demo) team • Corpus Christi Bay WATERS Testbed team • WATERS Project Office • NCSA TRECC Year-8 Project Team • Funding sources: • NSF grants BES-0414259, BES-0533513, and SCI-0525308 • Office of Naval Research grant N00014-04-1-0437
